bakszero closed 6 years ago
I had a similar problem earlier and that was caused by an incomplete file.
@uduse what do you mean by an incomplete file? I've just downloaded the glove zip and unzipped it. I've also verified that all vectors have 300 items in that file. What else could be missing?
@theSage21 somehow my file was truncated, so some of the vectors weren't complete. So you're having the same problem?
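Here's what I did to verify that all vectors are 300-dimensional.
with open('glove.txt', 'r') as fl:
    # Iterate lazily; the file is too large to load comfortably with readlines().
    for line in fl:
        # Split off the word, then count the remaining space-separated values.
        w, v = line.split(' ', 1)
        assert len(v.strip().split(' ')) == 300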
This ran without errors on my machine so I assume that all vectors have 300 dims. Despite this I have the same traceback.
I downloaded glove from https://nlp.stanford.edu/data/glove.840B.300d.zip
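A check that mirrors a strip-then-split parse can report which lines, if any, come up short, which also covers the truncated-file case mentioned earlier. This is a minimal sketch: `find_bad_lines` is a hypothetical helper, not part of BiMPM, and the sample uses 3-dimensional vectors instead of 300 to keep it short.

```python
import io

def find_bad_lines(lines, dim=300):
    """Return (line_number, field_count) for every line that does not
    split into exactly dim + 1 fields after stripping whitespace."""
    bad = []
    for lineno, line in enumerate(lines, start=1):
        # Strip the whole line first, then split on single spaces.
        parts = line.strip().split(' ')
        if len(parts) != dim + 1:
            bad.append((lineno, len(parts)))
    return bad

# Synthetic sample: the second line is missing one value.
sample = io.StringIO("the 0.1 0.2 0.3\ncat 0.4 0.5\n")
print(find_bad_lines(sample, dim=3))  # [(2, 3)]
```

For the real file, an open file handle can be passed in place of `sample`, with `dim=300`.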
@theSage21 try w, v = line.strip().split(' ', 1)
I doubt there's a weird char that got stripped away
It seems that one of the vectors in the GloVe file is for a space (line 142319). That token is removed when we strip() the line, so the parts list ends up with 300 items instead of 301. I've submitted a fix for this in https://github.com/zhiguowang/BiMPM/pull/26
@zhiguowang does this seem ok?
@theSage21 the project isn't maintained. I wouldn't expect a merge but it's nice to have for anyone who has the same problem.
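For anyone hitting the same problem, one way to sidestep the stripped-token issue is to remove only the line terminator and split from the right, so the last 300 fields are always the vector and whatever remains is the word. This is a sketch under that assumption and not necessarily what the pull request does; `parse_glove_line` is a hypothetical name, and the examples use 3 dimensions for brevity.

```python
def parse_glove_line(line, dim=300):
    """Split one line into (word, vector), treating the last `dim`
    space-separated fields as the vector."""
    # Remove only the line terminator so a whitespace token survives.
    line = line.rstrip('\r\n')
    # Split from the right at most `dim` times; the remainder is the word.
    parts = line.rsplit(' ', dim)
    if len(parts) != dim + 1:
        raise ValueError(f"expected {dim + 1} fields, got {len(parts)}")
    word = parts[0]
    vector = [float(x) for x in parts[1:]]
    return word, vector

print(parse_glove_line("the 0.1 0.2 0.3\n", dim=3))         # ('the', [0.1, 0.2, 0.3])
# A line whose word is a single space keeps that word intact.
print(repr(parse_glove_line("  0.1 0.2 0.3\n", dim=3)[0]))  # ' '
```

Splitting from the right also keeps a word that itself contains spaces in one piece, and a truncated vector raises a ValueError instead of passing silently.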
I'm using Glove pre-trained vectors (840B, 300D, Common Crawl). When I try to train the system, I get the following error:
The format of the GloVe txt file I'm using is: <word> <1st dim. value> <2nd dim. value> ... <300th dim. value> (without the angle brackets, of course)