There is some room for improvement in the way Knowntools uses the database. In quick tests, the analysis time for 1000 identical samples went up from 14s with an empty database to 1m15s with 3000 analyses of this sample already in the database. This appeared to scale linearly, which makes some sense since the sample journal that has to be downloaded from the database grows linearly as well.
Also, we have different ways of referring to the current time between the database, Knowntools and other analysers. I propose to harmonise them to UTC and "aware" datetime objects at this opportunity.
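
To illustrate what harmonising on UTC and "aware" datetime objects could look like, here is a minimal sketch using only the Python standard library. The helper names `utc_now` and `to_aware_utc` are hypothetical and not taken from the code base, and treating naive timestamps as UTC is an assumption that would need checking against how existing values were actually written.

```python
from datetime import datetime, timezone


def utc_now():
    """Return the current time as an aware datetime in UTC."""
    return datetime.now(timezone.utc)


def to_aware_utc(value):
    """Normalise a datetime to an aware datetime in UTC.

    Assumption: naive values (no tzinfo) were recorded in UTC, so the
    zone is attached without shifting the clock time. Aware values are
    converted to UTC.
    """
    if value.tzinfo is None:
        return value.replace(tzinfo=timezone.utc)
    return value.astimezone(timezone.utc)
```

With everything normalised this way, timestamps from the database and from the analysers can be compared directly. Python raises a `TypeError` when a naive and an aware datetime are ordered against each other, which is one reason to settle on a single convention.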
A version on top of #210 (which I actually developed on) is at https://github.com/michaelweiser/PeekabooAV/tree/efficient-knowntools.