Closed: austvik closed this issue 11 years ago
Hi,
Thank you for the report.
I profiled the code and found that the bignum module was making the script too slow.
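For reference, here is a tiny standalone sketch (not the actual st code, and the data is synthetic) that contrasts plain scalar addition with Math::BigFloat, the arbitrary-precision class that the bignum pragma promotes values to; on a typical machine the BigFloat loop is far slower, which matches what the profile showed:

#!/usr/bin/perl
# Standalone illustration only, not taken from st.
# Math::BigFloat stands in for what "use bignum" does: it promotes
# numeric values to arbitrary-precision objects, and that promotion
# is the overhead the profile pointed at.
use strict;
use warnings;
use Time::HiRes qw(time);
use Math::BigFloat;

my @values = map { $_ * 1.5 } 1 .. 10_000;    # synthetic data

# Sum with plain native scalars.
my $t0    = time;
my $plain = 0;
$plain += $_ for @values;
printf "plain scalars:   %.4fs  sum=%.2f\n", time - $t0, $plain;

# Same sum through Math::BigFloat objects.
$t0 = time;
my $big = Math::BigFloat->new(0);
$big->badd($_) for @values;
printf "Math::BigFloat:  %.4fs  sum=%s\n", time - $t0, $big->bstr;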
Please download the new version; it should fix the problem.
Nelson
2013/9/23 Jørgen Austvik notifications@github.com
Hi,
thanks for a great tool - st is exactly what I need, except for the speed.
Up to 1,000 lines of numbers in a file work OK:
$ time head -n 10 jmeter_saso.log | cut -d, -f2 | st
N        min      max       sum        mean      stddev
9.00     7578.00  19843.00  132073.00  14674.78  3632.56

real    0m0.207s
user    0m0.170s
sys     0m0.010s

$ time head -n 100 jmeter_saso.log | cut -d, -f2 | st
N        min      max       sum         mean      stddev
99.00    7578.00  35999.00  2372769.00  23967.36  5713.40

real    0m0.339s
user    0m0.300s
sys     0m0.020s

$ time head -n 1000 jmeter_saso.log | cut -d, -f2 | st
N        min    max       sum         mean     stddev
999.00   80.00  38075.00  7644960.00  7652.61  10007.16

real    0m2.375s
user    0m2.280s
sys     0m0.030s
But at 10,000 lines it starts getting really slow:
$ time head -n 10000 jmeter_saso.log | cut -d, -f2 | st
N        min    max       sum          mean     stddev
9999.00  40.00  38075.00  11624304.00  1162.55  3934.22

real    0m26.478s
user    0m24.600s
sys     0m0.070s
I don't know why it takes so long; perl can do it pretty quickly:
$ time head -n 10000 jmeter_saso.log | cut -d, -f2 | perl -lne '$x += $_; END { print $x; }'
11624304

real    0m0.022s
user    0m0.010s
sys     0m0.000s
Even just --sum takes 1000 times as long as perl:
$ time head -n 10000 jmeter_saso.log | cut -d, -f2 | st --sum
Invalid value 'elapsed' on input line 1
11624304.00

real    0m22.732s
user    0m22.520s
sys     0m0.020s
My files are 1,000,000 lines long, and a single run has now used 22 CPU minutes without completing.
Nelson Ferraz
Wow! That was fixed quickly!
10,000 lines went from 9.7 seconds to 0.2 seconds for me. Perfect!
Thank you very much!