NatLibFi / Annif

Annif is a multi-algorithm automated subject indexing tool for libraries, archives and museums.
https://annif.org

Doc2vec backend #239

Open osma opened 5 years ago

osma commented 5 years ago

Gensim provides an implementation of doc2vec. It could be a useful backend: similar to the current tfidf algorithm but smarter, and it might also allow online learning.

https://medium.com/scaleabout/a-gentle-introduction-to-doc2vec-db3e8c0cce5e
https://radimrehurek.com/gensim/models/doc2vec.html
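To make the idea concrete, here is a rough sketch of how Gensim's doc2vec could be used for subject suggestion. It is only an illustration under assumptions, not Annif code: the helper names (`tokenize`, `to_tagged_pairs`, `train_doc2vec`, `suggest`), the toy corpus and the choice to use subject labels as document tags are all hypothetical. Only `Doc2Vec`, `TaggedDocument`, `build_vocab`, `train`, `infer_vector` and `most_similar` come from Gensim.

```python
import re

# Hypothetical toy corpus: (text, [subject labels]) pairs, invented for illustration.
TOY_CORPUS = [
    ("excavation of a bronze age burial mound", ["archaeology"]),
    ("iron age settlement excavation and pottery finds", ["archaeology"]),
    ("cataloguing rules for library collections", ["librarianship"]),
    ("library classification and subject indexing", ["librarianship"]),
    ("museum exhibition design and conservation", ["museology"]),
    ("conservation of museum artefacts", ["museology"]),
]


def tokenize(text):
    """Lowercase the text and keep runs of ASCII letters (deliberately simple)."""
    return re.findall(r"[a-z]+", text.lower())


def to_tagged_pairs(corpus):
    """Turn (text, subjects) pairs into (tokens, tags) pairs."""
    return [(tokenize(text), list(subjects)) for text, subjects in corpus]


def train_doc2vec(corpus, vector_size=50, epochs=40):
    """Train a Doc2Vec model whose document tags are the subject labels."""
    # Imported lazily so the helpers above work without gensim installed.
    from gensim.models.doc2vec import Doc2Vec, TaggedDocument

    tagged = [TaggedDocument(words=w, tags=t) for w, t in to_tagged_pairs(corpus)]
    # min_count=1 keeps every word; reasonable only because the corpus is tiny.
    model = Doc2Vec(vector_size=vector_size, min_count=1, epochs=epochs)
    model.build_vocab(tagged)
    model.train(tagged, total_examples=model.corpus_count, epochs=model.epochs)
    return model


def suggest(model, text, limit=3):
    """Return up to `limit` (subject, similarity) pairs for a new text."""
    vector = model.infer_vector(tokenize(text))
    # gensim 4.x exposes tag vectors as model.dv; 3.x called it model.docvecs.
    tag_vectors = model.dv if hasattr(model, "dv") else model.docvecs
    return tag_vectors.most_similar([vector], topn=limit)


if __name__ == "__main__":
    model = train_doc2vec(TOY_CORPUS)
    for subject, score in suggest(model, "excavation of a burial mound"):
        print(f"{subject}\t{score:.3f}")
```

Using subject labels as tags means one vector is learned per subject rather than per document, so `most_similar` returns subjects directly. With a corpus this small the rankings are unlikely to be meaningful, and inference is randomised, so results vary between runs.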

osma commented 5 years ago

There is an initial implementation on the branch issue239-doc2vec-backend, but the first results (with the archaeology toy corpus) seemed nonsensical.