SenticNet / personality-detection

Implementation of a hierarchical CNN based model to detect Big Five personality traits
http://sentic.net/deep-learning-based-personality-detection.pdf

TypeError: expected str, bytes or os.PathLike object, not Word2VecKeyedVectors while loading GoogleNews-vectors-negative300-SLIM.bin file #34

Closed: addy1997 closed this issue 3 years ago

addy1997 commented 3 years ago

I have modified your process_data.py code. I am getting the following error when trying to load 'GoogleNews-vectors-negative300-SLIM.bin' (my code is given below).

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-21-b567896a8aa6> in <module>
     14 print('vocab size: ' + str(len(vocab)))
     15 print('max sentence length: ' + str(max_l))
---> 16 w2v = load_bin_vec(wv_from_bin, vocab)
     17 print(w2v)
     18 print('word2vec loaded!')

<ipython-input-20-59822c213c28> in load_bin_vec(fname, vocab)
     49     """
     50     word_vecs = {}
---> 51     with open(fname, 'rb') as f:
     52         header = f.readline()
     53         vocab_size, layer1_size = map(int, header.split())

TypeError: expected str, bytes or os.PathLike object, not Word2VecKeyedVectors

The code is:

import numpy as np
import pandas as pd
import _pickle as cPickle  # plain `import cPickle` on Python 2

# build_data_train_test, load_bin_vec, add_unknown_words and get_W come from
# process_data.py; data_train is loaded earlier in the script.
w2v_file = 'GoogleNews-vectors-negative300-SLIM.bin'
revs, vocab = build_data_train_test(data_train, train_ratio=0.6, clean_string=True)
max_l = np.max(pd.DataFrame(revs)['num_words'])
print('data loaded!')
print('number of sentences: ' + str(len(revs)))
print('vocab size: ' + str(len(vocab)))
print('max sentence length: ' + str(max_l))
w2v = load_bin_vec(w2v_file, vocab)
print(w2v)
print('word2vec loaded!')
print('num words already in word2vec: ' + str(len(w2v)))

add_unknown_words(w2v, vocab)
W, word_idx_map = get_W(w2v)
cPickle.dump([revs, W, word_idx_map, vocab], open('imdb-train-val-testN.pickle', 'wb'))
print('dataset created successfully!')

Any help or guidance is highly appreciated.
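
For context, the traceback above shows load_bin_vec being called with wv_from_bin, an already loaded Word2VecKeyedVectors object, whereas the function opens the file itself with open(fname, 'rb') and therefore expects a path string such as w2v_file. Below is a minimal sketch, assuming the SLIM vectors are loaded with gensim, of building the same {word: vector} dictionary directly from the in-memory object; the load_vecs_from_keyedvectors helper is illustrative and not part of process_data.py.

from gensim.models import KeyedVectors

# Load the SLIM GoogleNews vectors into memory with gensim.
wv_from_bin = KeyedVectors.load_word2vec_format(
    'GoogleNews-vectors-negative300-SLIM.bin', binary=True)

def load_vecs_from_keyedvectors(kv, vocab):
    """Collect vectors for the words in vocab from a loaded KeyedVectors object."""
    word_vecs = {}
    for word in vocab:
        if word in kv:
            word_vecs[word] = kv[word]
    return word_vecs

w2v = load_vecs_from_keyedvectors(wv_from_bin, vocab)

Either way, load_bin_vec itself should receive the path string (w2v_file), not the loaded KeyedVectors object, since it reads the .bin header and vectors from the file directly.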