openpreserve / nanite

Nanite - a friendly swarm of format-identifying robots.
openplanets.github.io/nanite/

Make DroidDetector thread-safe #44

Open anjackson opened 2 years ago

anjackson commented 2 years ago

As noted in the README, the DroidDetector is not thread-safe.

This is because we are re-using three DROID objects here:

https://github.com/openpreserve/nanite/blob/7c5424329236c8b261acf3b6c8a9f68ba4697478/nanite-core/src/main/java/uk/bl/wa/nanite/droid/DroidDetector.java#L128-L137

If I'm remembering correctly, the issue here is that creating instances of these classes is expensive and involves awkward setup, such as extracting signature files to temporary locations for loading. Instantiating these classes per-identification would be extremely slow and brittle.

It's not 100% clear whether the underlying DROID objects are themselves thread-safe. Therefore, it might be simplest to make them ThreadLocal and take more care with the temporary file code so the per-thread instances don't collide. If they are thread-safe, we may be able to declare them statically, although I'm not sure how configuration would be affected.
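
To make the ThreadLocal idea concrete, here is a rough sketch rather than a patch against DroidDetector. `ExpensiveIdentifier` is a hypothetical stand-in for the three DROID objects, and its `identify` method is a placeholder, not DROID's API. The only point is that each thread lazily builds its own instance, with its own temporary directory, so the extracted files cannot collide.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

class PerThreadDetector {

    /** Hypothetical stand-in for the expensive, possibly non-thread-safe DROID objects. */
    static final class ExpensiveIdentifier {
        final Path workDir;

        ExpensiveIdentifier(Path workDir) {
            // A real implementation would extract signature files into workDir here.
            this.workDir = workDir;
        }

        String identify(String fileName) {
            // Placeholder logic only; real identification would inspect the bytes.
            return fileName.endsWith(".pdf") ? "application/pdf" : "application/octet-stream";
        }
    }

    // One instance per thread, created on first use and then reused by that thread.
    private static final ThreadLocal<ExpensiveIdentifier> IDENTIFIER =
            ThreadLocal.withInitial(() -> {
                try {
                    // A unique directory per thread avoids temp-file collisions.
                    Path dir = Files.createTempDirectory("detector-");
                    // Best-effort cleanup; only succeeds if the directory is empty at exit.
                    dir.toFile().deleteOnExit();
                    return new ExpensiveIdentifier(dir);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });

    static String identify(String fileName) {
        return IDENTIFIER.get().identify(fileName);
    }

    /** Exposed only so the per-thread isolation can be checked. */
    static Path workDirForCurrentThread() {
        return IDENTIFIER.get().workDir;
    }
}
```

The trade-off is that the expensive setup is paid once per thread rather than once per JVM, and each instance lives as long as its thread does, which matters in long-lived thread pools.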

Obviously, it'd be good to have a unit test that puts the thread-safety under a reasonable amount of strain.
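
One possible shape for such a test, sketched in plain Java without a test framework: compute a single-threaded baseline, then have many threads repeat the same identifications at once and count any results that differ. The `ConcurrencyStress` class and the `Function<String, String>` detector signature are assumptions for illustration; the real test would wrap whatever DroidDetector method is under test.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

class ConcurrencyStress {

    /**
     * Runs the detector over the inputs from several threads at once and returns
     * how many results differed from a single-threaded baseline.
     */
    static int countMismatches(Function<String, String> detector,
                               List<String> inputs, int threads, int rounds)
            throws Exception {
        // Baseline computed on one thread, before any contention exists.
        List<String> expected = new ArrayList<>();
        for (String input : inputs) {
            expected.add(detector.apply(input));
        }

        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch start = new CountDownLatch(1);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                Callable<Integer> task = () -> {
                    start.await(); // hold every worker until all are ready
                    int mismatches = 0;
                    for (int r = 0; r < rounds; r++) {
                        for (int i = 0; i < inputs.size(); i++) {
                            String actual = detector.apply(inputs.get(i));
                            if (!Objects.equals(expected.get(i), actual)) {
                                mismatches++;
                            }
                        }
                    }
                    return mismatches;
                };
                results.add(pool.submit(task));
            }
            start.countDown(); // release all workers together to maximise overlap

            int total = 0;
            for (Future<Integer> f : results) {
                total += f.get(); // rethrows any exception raised inside a worker
            }
            return total;
        } finally {
            pool.shutdownNow();
        }
    }
}
```

A test would then assert that `countMismatches(...)` returns 0 and throws nothing. A passing run cannot prove thread-safety, since races are timing-dependent, but with enough threads and rounds a failure is a fairly reliable sign of shared mutable state.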

N.B. Issue raised following a conversation with @tballison