turicas closed 11 years ago
Some interesting data on worker durations (in seconds):
Tokenizer
min=0.000355958938599
max=7.96267414093
avg=0.0746055768197
stddev=0.154531213476
POS
min=0.000236034393311
max=82.1013770103
avg=1.13472248689
stddev=1.60564613737
FreqDist
min=0.000233888626099
max=9.72356987
avg=0.0933163567759
stddev=0.181907154706
Extractor
min=0.000626087188721
max=5.59551620483
avg=0.027879508199
stddev=0.069522390211
Statistics
min=0.000164985656738
max=5.78841805458
avg=0.0614866020195
stddev=0.134460064482
It would be more relevant to have this information in bytes/sec instead of just seconds. Could you also share the average and SD of file size for the corpus that generated these stats?
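To make the request concrete, here is a minimal sketch of the conversion to bytes/sec. It assumes that, for each worker, we can collect one (file size in bytes, duration in seconds) pair per document. The thread does not say the broker's log records file sizes, so that pairing, the function name `throughput_stats` and the sample values are all hypothetical.

```python
from statistics import mean, stdev


def throughput_stats(samples):
    """Summarise throughput and file size for one worker.

    `samples` is an iterable of (size_bytes, duration_seconds) pairs.
    This input shape is an assumption, not the format of the broker's log.
    """
    samples = list(samples)
    # Skip zero-length durations to avoid dividing by zero.
    rates = [size / duration for size, duration in samples if duration > 0]
    sizes = [size for size, _ in samples]
    return {
        "avg_bytes_per_sec": mean(rates),
        "stddev_bytes_per_sec": stdev(rates) if len(rates) > 1 else 0.0,
        "avg_size_bytes": mean(sizes),
        "stddev_size_bytes": stdev(sizes) if len(sizes) > 1 else 0.0,
    }


if __name__ == "__main__":
    # Made-up sample values, only to show the call.
    print(throughput_stats([(1000, 1.0), (3000, 1.0), (500, 0.25)]))
```

Averaging the per-document rates is only one option. Dividing total bytes by total seconds gives a different figure that weights large files more heavily, so it is worth stating which one is reported.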
I've created a script to parse the broker's log (parse_log.py). Now we need code to plot a histogram from the data this script generates, so we can see how much time each worker takes to run.
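As a starting point, here is a rough sketch of the histogram step using only the standard library. The output format of parse_log.py is not shown in this thread, so the sketch assumes the durations for one worker are already available as a plain list of seconds. The function names and the sample values are hypothetical. Since the durations above range from about 0.0002 s to 82 s, it bins on a log10 scale (one bin per decade) so the long tail stays visible.

```python
import math
from collections import Counter


def log_histogram(durations):
    """Count durations (in seconds) in one bin per power of ten.

    Bin b covers 10**b <= d < 10**(b + 1). Non-positive values are skipped.
    """
    counts = Counter()
    for d in durations:
        if d > 0:
            counts[math.floor(math.log10(d))] += 1
    return counts


def render(name, durations, width=40):
    """Return a text histogram for one worker as a multi-line string."""
    counts = log_histogram(durations)
    if not counts:
        return name + "\n  (no data)"
    peak = max(counts.values())
    lines = [name]
    # Include empty bins between the smallest and largest occupied ones.
    for b in range(min(counts), max(counts) + 1):
        bar = "#" * round(width * counts[b] / peak)
        lines.append(f"  {10.0 ** b:8.4g}s to {10.0 ** (b + 1):8.4g}s | {bar} {counts[b]}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up durations, only to show the call.
    print(render("Tokenizer", [0.0005, 0.002, 0.05, 0.07, 1.5]))
```

If a graphical plot is preferred, the same per-worker lists could be handed to a plotting library instead. The binning choice (log versus linear) matters more than the rendering here, because the reported stddev is larger than the average for every worker.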