BruceLee66 closed 5 years ago
Hi BruceLee, you need to install torchtext 0.1.1.
Are you Chinese? Thanks a lot. I am now working on a paraphrase identification problem, and I learned a lot from reading your paper.
Yes :)
@BruceLee66 Hi BruceLee! You can also add this function to util.py to load your pre-trained word embeddings.
import array

import torch
from tqdm import tqdm


def load_word_vecs(path):
    """Load word vectors from a local text file with one "word v1 v2 ..." entry per line."""
    itos, vectors, dim = [], array.array(str('d')), None
    # Assumes the vector file is UTF-8 encoded.
    with open(path, 'r', encoding='utf-8') as f:
        lines = f.readlines()
    for line in tqdm(lines, total=len(lines)):
        # Explicitly splitting on " " is important, so we don't
        # get rid of Unicode non-breaking spaces in the vectors.
        entries = line.rstrip().split(" ")
        word, entries = entries[0], entries[1:]
        if dim is None and len(entries) > 1:
            # The first real vector fixes the expected dimensionality.
            dim = len(entries)
        elif len(entries) == 1:
            # A single value is most likely a header line, so skip it.
            continue
        elif dim != len(entries):
            raise RuntimeError(
                "Vector for token {} has {} dimensions, but previously "
                "read vectors have {} dimensions. All vectors must have "
                "the same number of dimensions.".format(word, len(entries), dim))
        vectors.extend(float(x) for x in entries)
        itos.append(word)
    # Map each word to its row index in the vectors tensor.
    stoi = {word: i for i, word in enumerate(itos)}
    vectors = torch.Tensor(vectors).view(-1, dim)
    return stoi, vectors, dim
I wrote this because I use Python 3.6, and the torchtext 0.1.1 code behaves differently under Python 3.6 than under Python 2.7. If you use 3.6, you can use this function.
@caoxu915683474 Can this method load local word vectors?
@BruceLee66 Yes, I wrote it following the method in torchtext.
@caoxu915683474 Did you run into this problem when processing two Chinese sentences? model.py:317: invalid index of a 0-dim tensor. This will be an error in PyTorch 0.5. Use tensor.item() to convert a 0-dim tensor to a Python number pos1=torch.div(indix,simCube.size(2)).data[0]
@BruceLee66 I get this warning too.
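The warning itself suggests the fix: call tensor.item() instead of indexing a 0-dim tensor with .data[0]. Below is a minimal sketch of that change, not the actual model.py code. The names sim_cube and flat_index are hypothetical stand-ins for simCube and indix, and it assumes the original line intends integer division to recover a row position from a flattened index.

```python
import torch

# Hypothetical stand-ins for the simCube and indix values in model.py.
sim_cube = torch.zeros(1, 4, 5)
flat_index = torch.tensor(13)  # a 0-dim tensor holding a flattened index

# .item() converts the 0-dim tensor to a plain Python int first,
# so the arithmetic below is ordinary Python integer arithmetic.
index = flat_index.item()
pos1 = index // sim_cube.size(2)  # row position (assumed intent of torch.div)
pos2 = index % sim_cube.size(2)   # column position

print(pos1, pos2)
```

Converting to a Python number before dividing avoids depending on how torch.div rounds integer tensors, which may differ between PyTorch versions.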
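@caoxu915683474 Did you find the cause of the error? Add me on WeChat: 857243838
Can load_word_vectors load local word vectors?
Yes. Just make sure your word vectors are in the same format as GloVe: each line is word + vector.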
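To make the "word + vector" format concrete, here is a small sketch in plain Python that parses two sample lines in that layout. The helper name parse_glove_lines and the sample words and values are made up for illustration, and it assumes values are separated by single spaces.

```python
import io


def parse_glove_lines(handle):
    """Parse 'word v1 v2 ...' lines into (stoi, vectors, dim)."""
    stoi, vectors, dim = {}, [], None
    for line in handle:
        parts = line.rstrip().split(" ")
        word, values = parts[0], parts[1:]
        if len(values) <= 1:
            # Blank lines and header-like lines carry no usable vector.
            continue
        if dim is None:
            dim = len(values)
        elif len(values) != dim:
            raise ValueError(
                "token {!r} has {} values, expected {}".format(word, len(values), dim))
        stoi[word] = len(vectors)
        vectors.append([float(v) for v in values])
    return stoi, vectors, dim


# Two made-up entries in the expected layout: a word, then its vector.
sample = io.StringIO("你好 0.1 0.2 0.3\n世界 0.4 0.5 0.6\n")
stoi, vectors, dim = parse_glove_lines(sample)
print(stoi, dim)
```

Any local file whose lines follow this layout, with the same number of values on every line, should be readable the same way.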
Hi lanwuwei, thanks for your answer. When I call load_word_vectors(embedding_path, 'glove.840B', EMBEDDING_DIM), it automatically downloads the data from nlp.stanford.edu/data/. How should I set it up so that it loads local word vectors instead?
Many thanks!
Yes. Just make sure your word vectors are in the same format as GloVe: each line is word + vector.
Running main.py throws "cannot import name 'load_word_vectors'". Does this function not exist?
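Since the first reply in this thread says torchtext 0.1.1 is required, one thing worth checking is which version is actually installed. It is only a guess that a version mismatch causes this import error. The sketch below assumes Python 3.8 or newer (for importlib.metadata), and installed_version is a hypothetical helper name.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# The thread asks for torchtext 0.1.1; compare that against what is installed.
print("torchtext:", installed_version("torchtext"))
```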