artetxem / vecmap

A framework to learn cross-lingual word embedding mappings
GNU General Public License v3.0
642 stars 130 forks

A problem about broadcast #11

Closed lulu0-0 closed 6 years ago

lulu0-0 commented 6 years ago

Hello. I ran into the following problem when running normalize_embeddings.py:

Traceback (most recent call last):
  File "normalize_embeddings.py", line 52, in <module>
    main()
  File "normalize_embeddings.py", line 33, in main
    words, matrix = embeddings.read(f)
  File "/home/ali/vecmap/embeddings.py", line 31, in read
    matrix[i] = np.fromstring(vec, sep=' ', dtype=dtype)
ValueError: could not broadcast input array from shape (299) into shape (300)

I have reinstalled the requirements several times and searched a lot, but I still can't figure out what causes this problem. ToT I hope to get some help, thanks a lot!

artetxem commented 6 years ago

Looks like a problem when parsing the embedding file: either the word part is missing on some line or you are using the wrong encoding. Try to identify which line of your embedding file is causing the problem and check whether something looks wrong in it.
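One way to locate such a line is sketched below. It assumes the file is in the word2vec text format (a header line with the word count and the dimensionality, then one `word v1 v2 ...` row per line, separated by single spaces). `find_malformed_lines` is a hypothetical helper written for illustration, not part of vecmap, and it only counts whitespace-separated tokens rather than parsing them as floats.

```python
def find_malformed_lines(path, encoding="utf-8"):
    """Return (line_number, first_token, value_count) for every row whose
    number of vector values differs from the dimensionality in the header.

    Assumes the word2vec text format: a "count dim" header, then one
    "word v1 v2 ... v_dim" row per line.
    """
    bad = []
    with open(path, encoding=encoding, errors="replace") as f:
        header = f.readline().split()
        dim = int(header[1])  # second header field is the vector size
        # Data rows start on line 2 of the file.
        for line_number, line in enumerate(f, start=2):
            # Split off the word at the first space; the rest is the vector.
            parts = line.rstrip("\n").split(" ", 1)
            word = parts[0]
            values = parts[1].split() if len(parts) > 1 else []
            if len(values) != dim:
                bad.append((line_number, word, len(values)))
    return bad


if __name__ == "__main__":
    import sys

    for line_number, word, count in find_malformed_lines(sys.argv[1]):
        print(f"line {line_number}: first token {word!r}, {count} values")
```

If a reported row has a number as its first token and one value fewer than expected (299 instead of 300 in the error above), the word is probably missing on that line. If many rows are reported, a wrong `encoding` argument is a more likely cause.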

lulu0-0 commented 6 years ago

Thank you very much! :) I solved the problem by reducing the embedding size.