samoturk / mol2vec

Mol2vec - an unsupervised machine learning approach to learn vector representations of molecular substructures
BSD 3-Clause "New" or "Revised" License

Error when loading model #3

Closed: sbhttcha closed this issue 6 years ago

sbhttcha commented 6 years ago

Hi, thanks for putting together the notebook to explore mol2vec. However, I get the following error when loading the model with model = KeyedVectors.load('model_300dim.pkl'):

AttributeError: 'Word2Vec' object has no attribute 'vocabulary'

I am using gensim v3.3.0.

Also, I noticed there are two versions of the 'model_300dim.pkl' file: one is around 25 MB and the other around 74 MB. Which one should be used? I have tried both and see the same error. Thanks for any help!

samoturk commented 6 years ago

Hi, I'm assuming it's a problem with the gensim version. The model was trained with version 3.0.

I updated the model because the first one actually had 100-dimensional embeddings.

samoturk commented 6 years ago

Sabrina pointed out that you are not loading the model correctly. I tested it and the model is still working with gensim 3.4 when you load it like this:

from gensim.models import word2vec

model = word2vec.Word2Vec.load('mol2vec/examples/models/model_300dim.pkl')
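
For anyone who wants a slightly more defensive version of the call above, here is a minimal sketch. The helper name load_mol2vec_model is hypothetical (it is not part of mol2vec or gensim), and the only gensim call it relies on is the Word2Vec.load shown above. It checks the path first, so a wrong path gives a clear error instead of a confusing unpickling failure.

```python
import os


def load_mol2vec_model(path):
    """Load a pickled mol2vec model as a full Word2Vec object.

    Hypothetical helper for illustration; not part of mol2vec or gensim.
    """
    # Fail early with a clear message if the path is wrong.
    if not os.path.isfile(path):
        raise FileNotFoundError(f"Model file not found: {path}")

    # Imported lazily so the path check runs even if gensim is missing.
    from gensim.models import word2vec

    # Word2Vec.load (rather than KeyedVectors.load) is the call
    # reported to work in this thread.
    return word2vec.Word2Vec.load(path)
```

Calling load_mol2vec_model('mol2vec/examples/models/model_300dim.pkl') and then reading the vector_size attribute of the result is one way to tell the two uploads apart: the updated file is expected to report 300, whereas the first upload had 100-dimensional embeddings.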

sbhttcha commented 6 years ago

Thanks, that works with the recent versions of gensim.

FYI - another error people may see with sklearn 0.19.0 involves the cosine similarity metric in TSNE. Updating sklearn to 0.19.1 fixes the problem.
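
A quick way to check whether an environment has the affected release is sketched below. The function names and the KNOWN_BAD_SKLEARN constant are hypothetical, and the only version treated as affected is the 0.19.0 mentioned above; other releases are not checked either way.

```python
from importlib.metadata import PackageNotFoundError, version

# The release reported in this thread as failing with the cosine metric in TSNE.
KNOWN_BAD_SKLEARN = "0.19.0"


def is_known_bad_sklearn(installed):
    """Return True if the version string matches the release reported as broken."""
    return installed.strip() == KNOWN_BAD_SKLEARN


def installed_sklearn_version():
    """Return the installed scikit-learn version string, or None if it is absent."""
    try:
        return version("scikit-learn")
    except PackageNotFoundError:
        return None
```

If installed_sklearn_version() returns a string for which is_known_bad_sklearn is true, upgrading to 0.19.1 is the fix reported here.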

samoturk commented 6 years ago

Thanks for the feedback!