jhlau / doc2vec

Python scripts for training/testing paragraph vectors
Apache License 2.0

Document Vectors #12

Closed by ghost 7 years ago

ghost commented 7 years ago

Hi, what labels should be used to reference a document vector in your pre-trained doc2vec model? Although len(m.docvecs) returns 35556952, m.docvecs[0] raises IndexError: list index out of range.

Thanks

jhlau commented 7 years ago

Hi, the model doesn't save the document vectors. They were there originally but were removed to save space, and I don't see why anyone would want a random document vector.

The pre-trained model is designed so that you can infer document vectors for the documents you're interested in. The infer_test.py code gives an example of how to load the pre-trained model and infer the vector for a new document.
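
For readers who land here, a minimal sketch of that load-and-infer workflow might look like the following. It is not the contents of infer_test.py. The helper names (tokenize, load_model, infer_document_vector) and the whitespace tokenisation are assumptions made for illustration; only gensim's Doc2Vec.load and infer_vector are real library calls. The keyword arguments that infer_vector accepts differ between gensim versions, so the sketch passes only the token list, and a model saved with an older gensim may not load under a newer one.

```python
def tokenize(text):
    """Placeholder preprocessing (assumption): lowercase and split on whitespace.
    Inputs should be preprocessed the same way as the model's training data."""
    return text.lower().split()


def load_model(path):
    """Load a saved Doc2Vec model from disk.
    gensim is imported here so the other helpers can be used without it."""
    from gensim.models import Doc2Vec
    return Doc2Vec.load(path)


def infer_document_vector(model, text):
    """Infer a vector for a new document instead of looking one up in
    model.docvecs, which holds no stored vectors in the pre-trained model."""
    return model.infer_vector(tokenize(text))


# Hypothetical usage (the model path is a placeholder, not a real file name):
# model = load_model("path/to/pretrained_doc2vec.bin")
# vector = infer_document_vector(model, "Text of the document you care about")
```

Inference is stochastic, so two calls on the same text will generally return slightly different vectors.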

ghost commented 7 years ago

Ok, thanks!