Open-Book-Genome-Project / sequencer

A toolchain of tasks for sequencing and fingerprinting book fulltext
https://bookgenomeproject.org

Add basic profiling and timing #32

Closed (finnless closed this 3 years ago)

finnless commented 3 years ago

It would be nice to get some performance characteristics as another result for each genome sequence.

finnless commented 3 years ago

36 Added initial timing. It still needs to be added to all modules, and the timings need to be saved to the results instead of printed to STDOUT, following this schema:

{
    'total_time': 20.0,
    '_memoize_xml': {
        'time': 1.0,
        'kb': 200.0
    },
    '_memoize_plaintext': {
        'time': 1.0,
        'kb': 200.0
    },
    '2grams': {
        'total_time': 1.0,
        'term_freq': {
            'time': 1.0,
            'results': []
        }
    },
    '1grams': {
        'total_tokens': 6,
        'total_time': 3.0,
        'term_freq': {
            'time': 1.0,
            'results': [
                ('wrong', 1), ('we', 1), ('warranty', 1), ('visit', 1), ('to', 2), ('this', 1),
            ]
        },
        'isbns': {
            'time': 1.0,
            'results': []
        },
        'urls': {
            'time': 1.0,
            'results': [('http://flossmanualsnet', None)]
        }
    },
    'readinglevel': {
        'flesch_kincaid_grade': {
            'time': 1.0,
            'results': 6.2
        }
    },
    'pagetypes': {
        'copyright_page': {
            'time': 1.0,
            'results': ['2010']
        }
    },
    'version': '0.0.36'
}