Closed inigo-jauregi closed 7 years ago
Hi,
I am trying to use your code and test it with the toy data, but the `pretrained_emb` argument is not recognized. This is the code:
```python
#python example to train doc2vec model (with or without pre-trained word embeddings)
import gensim.models as g
import logging

#doc2vec parameters
vector_size = 300
window_size = 15
min_count = 1
sampling_threshold = 1e-5
negative_size = 5
train_epoch = 100
dm = 0 #0 = dbow; 1 = dmpv
worker_count = 1 #number of parallel processes

#pretrained word embeddings
pretrained_emb = "toy_data/pretrained_word_embeddings.txt" #None if use without pretrained embeddings

#input corpus
train_corpus = "toy_data/train_docs.txt"

#output model
saved_path = "toy_data/model.bin"

#enable logging
logging.basicConfig(format='%(asctime)s : %(levelname)s : %(message)s', level=logging.INFO)

#train doc2vec model
docs = g.doc2vec.TaggedLineDocument(train_corpus)
model = g.Doc2Vec(docs, size=vector_size, window=window_size, min_count=min_count, sample=sampling_threshold, workers=worker_count, hs=0, dm=dm, negative=negative_size, pretrained_emb=pretrained_emb, dbow_words=1, dm_concat=1, iter=train_epoch)

#save model
model.save(saved_path)
```
And this is the error:
Traceback (most recent call last):
  File "C:/Users/12714818_Admin/Desktop/CMCRC/Boundlss_2017/May-Aug/Context_including/Conversation_clustering/src/train_model.py", line 31, in <module>
    model = g.Doc2Vec(docs, size=vector_size, window=window_size, min_count=min_count, sample=sampling_threshold, workers=worker_count, hs=0, dm=dm, negative=negative_size, pretrained_emb=pretrained_emb,dbow_words=1, dm_concat=1, iter=train_epoch)
  File "C:\ProgramData\Anaconda3\envs\py27\lib\site-packages\gensim\models\doc2vec.py", line 625, in __init__
    **kwargs)
TypeError: __init__() got an unexpected keyword argument 'pretrained_emb'

I am using Python 2.7.
As mentioned in the README:
> Gensim: Best to use my forked version of gensim; the latest gensim has changed its Doc2Vec methods a little and so would not load the pre-trained models.
https://github.com/jhlau/gensim
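
A quick way to tell which gensim an environment picks up is to check whether `Doc2Vec` names `pretrained_emb` as a constructor parameter before training. The sketch below is only a heuristic under stated assumptions: `init_names_parameter` is a hypothetical helper (not part of gensim), the `Stock*` and `Fork*` classes are stand-ins that imitate the two behaviours rather than real gensim code, and it assumes the fork declares `pretrained_emb` explicitly in some `__init__` of the class hierarchy. A bare `**kwargs` is deliberately not counted, because the traceback above shows `Doc2Vec.__init__` forwarding `**kwargs` to its parent, which then rejects the unknown name.

```python
import inspect


def _param_names(func):
    """List the parameter names of a function on Python 3 or 2.7."""
    if hasattr(inspect, "signature"):  # Python 3
        return list(inspect.signature(func).parameters)
    return inspect.getargspec(func).args  # Python 2.7


def init_names_parameter(cls, name):
    """Return True if any __init__ in the hierarchy of cls names `name` explicitly.

    Hypothetical helper for illustration; a catch-all **kwargs does not count.
    """
    for klass in inspect.getmro(cls):
        init = klass.__dict__.get("__init__")
        if init is None:
            continue  # this class does not define its own __init__
        try:
            params = _param_names(init)
        except (TypeError, ValueError):
            continue  # C-level __init__ (e.g. object) may not be inspectable
        if name in params:
            return True
    return False


# Stand-in classes (not gensim): the "stock" pair forwards **kwargs to a base
# class that does not know pretrained_emb; the "fork" pair declares it.
class StockBase(object):
    def __init__(self, size=100, window=5):
        self.size, self.window = size, window


class StockDoc(StockBase):
    def __init__(self, documents=None, **kwargs):
        StockBase.__init__(self, **kwargs)


class ForkBase(object):
    def __init__(self, size=100, window=5, pretrained_emb=None):
        self.size, self.window, self.pretrained_emb = size, window, pretrained_emb


class ForkDoc(ForkBase):
    def __init__(self, documents=None, **kwargs):
        ForkBase.__init__(self, **kwargs)


if __name__ == "__main__":
    try:
        import gensim.models as g
    except ImportError:
        print("gensim is not installed in this environment")
    else:
        print(init_names_parameter(g.Doc2Vec, "pretrained_emb"))
```

If the check prints `False` for the installed gensim, passing `pretrained_emb` will most likely fail with the same `TypeError`, and the environment is probably using stock gensim rather than the fork.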