chrismattmann / tika-similarity

Tika-Similarity uses the Tika-Python package (a Python port of Apache Tika) to compute file similarity based on metadata features.
Apache License 2.0
107 stars, 59 forks

Generate dense vector embeddings for metadata #80

Closed harsham05 closed 5 years ago

harsham05 commented 7 years ago

TODO: employ doc2vec to generate the metadata embeddings

chrismattmann commented 5 years ago

Closing: no progress in 3 years.