NatLibFi / Annif

Annif is a multi-algorithm automated subject indexing tool for libraries, archives and museums.
https://annif.org

Merge results using NumPy operations #188

Closed osma closed 5 years ago

osma commented 5 years ago

Currently Annif merges hits using annif.util.merge_hits, which is used by both AnnifProject and the ensemble backend. The current implementation is unnecessarily complex. Since as_vector and from_vector can now convert hits (AnalysisResult objects) into a NumPy array and back, the merging should be done with NumPy operations instead.
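
A minimal sketch of what NumPy-based merging could look like, assuming that merging means a weighted average of per-source score vectors that all share the same subject order. The function name `merge_score_vectors` and its parameters are hypothetical and are not taken from Annif's code:

```python
import numpy as np


def merge_score_vectors(vectors, weights):
    """Return the weighted average of several score vectors.

    Hypothetical helper: `vectors` is a sequence of equal-length score
    vectors (one per source), `weights` holds one weight per source.
    """
    scores = np.asarray(vectors, dtype=float)  # shape: (n_sources, n_subjects)
    w = np.asarray(weights, dtype=float)       # shape: (n_sources,)
    # np.average computes sum(w_i * scores_i) / sum(w_i) along axis 0.
    return np.average(scores, axis=0, weights=w)


# Two sources scoring the same three subjects, equally weighted.
merged = merge_score_vectors([[0.8, 0.0, 0.2], [0.4, 0.6, 0.0]], weights=[1.0, 1.0])
```

With vectors in hand, the whole merge becomes a single array operation, with no per-hit bookkeeping in Python.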

osma commented 5 years ago

Also, AnalysisResult.from_vector should retain the vector representation and convert it to a list lazily, only when a list is actually needed. Similarly, a list should not be converted to a vector unless necessary. The from_vector method could be merged into the constructor, so that the new constructor accepts either a vector or a list (though with a vector it will probably need additional parameters).
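
One possible shape for such a lazy result object, as a rough sketch only. The class name `LazyResult`, the `(subject_id, score)` tuple format for hits, and the `subject_ids` parameter are all assumptions made for illustration and do not reflect the actual AnalysisResult interface:

```python
import numpy as np


class LazyResult:
    """Holds hits as either a list or a vector; converts only on demand."""

    def __init__(self, hits=None, vector=None, subject_ids=None):
        # Exactly one representation must be supplied.
        if (hits is None) == (vector is None):
            raise ValueError("provide exactly one of hits or vector")
        if vector is not None and subject_ids is None:
            raise ValueError("a vector needs subject_ids to be interpreted")
        self._hits = hits  # list of (subject_id, score) tuples, or None
        self._vector = None if vector is None else np.asarray(vector, dtype=float)
        self._subject_ids = None if subject_ids is None else list(subject_ids)

    def as_vector(self, subject_ids):
        # Build the vector only if no cached vector exists; a cached
        # vector is returned as-is and subject_ids is then ignored.
        if self._vector is None:
            index = {sid: i for i, sid in enumerate(subject_ids)}
            vec = np.zeros(len(index))
            for sid, score in self._hits:
                vec[index[sid]] = score
            self._vector = vec
            self._subject_ids = list(subject_ids)
        return self._vector

    def as_list(self):
        # Build the list only when someone actually asks for it.
        if self._hits is None:
            order = np.argsort(-self._vector, kind="stable")
            self._hits = [
                (self._subject_ids[i], float(self._vector[i]))
                for i in order
                if self._vector[i] > 0
            ]
        return self._hits


from_vec = LazyResult(vector=[0.1, 0.0, 0.7], subject_ids=["a", "b", "c"])
from_list = LazyResult(hits=[("b", 0.5)])
```

Here the extra `subject_ids` argument plays the role of the "additional parameters" a vector-based constructor would need: without it, the positions in the vector cannot be mapped back to subjects.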