Planeshifter / node-word2vec

Node.js interface to the Google word2vec tool.
Apache License 2.0
348 stars 55 forks source link

Had to remove arr.pop() on line 373 in model.js to get this library working #8

Open josephrocca opened 8 years ago

josephrocca commented 8 years ago

This line here was removing the last item from each vector which made their lengths different which caused a whole bunch of chaos down the line (multiplying by undefined).

I have no idea why it's there, but I did notice that it works for the example vector.txt in this project. Maybe something to do with \r and \n?

Also I added this:

if(isNaN(words) || isNaN(size)) {
  throw new Error("First line of input text file should be <number of words> <length of vector>. See example data 'vectors.txt' in repo");
}

After this line since that caused me a lot of trouble (I don't think it's mentioned anywhere in the readme).

Thanks for the awesome library!

Planeshifter commented 8 years ago

Thanks for this report, I will look into it. If I recall correctly, I indeed put the .pop() call there to remove newline and carriage returns from the array. Regarding your isNaN checks, any call to word2vev should write a file where the first line consists of <number of words> <length of vector>. I guess you have been using the library differently?

josephrocca commented 8 years ago

Oh, right, yes I was using a pretrained file that I found on the internet. I just added the <number of words> <length of vector> manually.

Also, sorry to pester, but did you see this issue on your node-wordnet-magic project?

Planeshifter commented 8 years ago

Yeah, sorry for not coming back to that earlier. I will have a look today into what's wrong with the WordNet module.

josephrocca commented 8 years ago

Thanks :+1: