WENGSYX / Chinese-Word2vec-Medicine

Chinese Word2vec Medicine: Chinese medical word vectors
Using medical.txt directly causes a dimension error #12

Open Josoope opened 9 months ago

Josoope commented 9 months ago

model = KeyedVectors.load_word2vec_format('./dict/Medical.txt', binary=False)
sim = model.wv.most_similar('海马', topn = 10)
print(sim)

Error message:

    return _load_word2vec_format(
  File "/root/miniconda3/envs/bert-ch/lib/python3.8/site-packages/gensim/models/keyedvectors.py", line 2069, in _load_word2vec_format
    _word2vec_read_text(fin, kv, counts, vocab_size, vector_size, datatype, unicode_errors, encoding)
  File "/root/miniconda3/envs/bert-ch/lib/python3.8/site-packages/gensim/models/keyedvectors.py", line 1975, in _word2vec_read_text
    _add_word_to_kv(kv, counts, word, weights, vocab_size)
  File "/root/miniconda3/envs/bert-ch/lib/python3.8/site-packages/gensim/models/keyedvectors.py", line 1911, in _add_word_to_kv
    word_id = kv.add_vector(word, weights)
  File "/root/miniconda3/envs/bert-ch/lib/python3.8/site-packages/gensim/models/keyedvectors.py", line 562, in add_vector
    self.vectors[target_index] = vector
ValueError: could not broadcast input array from shape (330,) into shape (512,)

How should I fix this?

WENGSYX commented 9 months ago

model = KeyedVectors.load_word2vec_format('./dict/Medical.txt', binary=False, vector_size=330)

Josoope commented 9 months ago

After adding that, it still fails: TypeError: load_word2vec_format() got an unexpected keyword argument 'vector_size'
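
A possible way to narrow this down, offered as a sketch rather than a confirmed fix: the TypeError is consistent with load_word2vec_format() taking the vector size from the file's first line ("<vocab_size> <vector_size>") rather than from a keyword argument, and the ValueError suggests that the header declares 512 dimensions while at least one row splits into 330 values. The helper names below (scan_word2vec_text, write_clean_copy) are hypothetical and not part of gensim or this repository. They use only plain Python and assume the file is UTF-8 text with space-separated fields. Whether dropping the mismatched rows is acceptable depends on what those rows turn out to contain, which the thread does not show.

```python
def scan_word2vec_text(path, encoding="utf-8"):
    """Inspect a word2vec text file.

    Returns (declared_size, total_rows, bad_rows), where bad_rows lists
    (line_number, word, n_values) for each row whose number of values
    differs from the vector size declared in the header.
    """
    bad_rows = []
    total_rows = 0
    # newline="\n" keeps a stray "\r" from being treated as a line break.
    with open(path, encoding=encoding, errors="replace", newline="\n") as fin:
        header = fin.readline().split()
        declared_size = int(header[1])  # header: "<vocab_size> <vector_size>"
        for line_number, line in enumerate(fin, start=2):
            total_rows += 1
            parts = line.rstrip().split(" ")
            n_values = len(parts) - 1  # first field is the word itself
            if n_values != declared_size:
                bad_rows.append((line_number, parts[0], n_values))
    return declared_size, total_rows, bad_rows


def write_clean_copy(src, dst, encoding="utf-8"):
    """Copy src to dst without the mismatched rows and with a fixed header.

    Returns (rows_kept, rows_dropped).
    """
    declared_size, total_rows, bad_rows = scan_word2vec_text(src, encoding)
    skip = {line_number for line_number, _, _ in bad_rows}
    kept = total_rows - len(bad_rows)
    with open(src, encoding=encoding, errors="replace", newline="\n") as fin, \
            open(dst, "w", encoding=encoding, newline="\n") as fout:
        fin.readline()  # discard the old header
        fout.write(f"{kept} {declared_size}\n")  # header with corrected count
        for line_number, line in enumerate(fin, start=2):
            if line_number not in skip:
                fout.write(line.rstrip() + "\n")
    return kept, len(bad_rows)


# Example usage (the path is the one from the question; the output name is made up):
# size, total, bad = scan_word2vec_text('./dict/Medical.txt')
# print(size, total, bad[:10])
# write_clean_copy('./dict/Medical.txt', './dict/Medical.clean.txt')
```

If the scan reports only a handful of mismatched rows, the cleaned copy could then be passed to KeyedVectors.load_word2vec_format(..., binary=False) exactly as in the original snippet, without any vector_size argument. Note that most_similar is a method of the returned KeyedVectors object itself.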