Binary word2vec (GoogleNews-vectors-negative300.bin) Decoding Error

I am getting an exception when I use GoogleNews-vectors-negative300.bin with torchtext v0.2.3 and Python 3.6.

vectors = Vectors(name='GoogleNews-vectors-negative300.bin', cache='/directory/to/word2vec')

The exception is ValueError: could not convert string to float. For each line of the utf-8 binary word2vec file, torchtext currently splits the line into the word and its vector like this:

entries = line.rstrip().split(b" " if binary_lines else " ")
word, entries = entries[0], entries[1:]

However, after this block of code executes, entries has a length of 3 for the first non-header line in GoogleNews-vectors-negative300.bin, which corresponds to </s>. A 300-dimensional vector should produce 300 entries, so the float conversion that follows fails.
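To illustrate why the split may produce the wrong number of entries, here is a minimal sketch using a tiny synthetic file. It assumes the layout written by the original word2vec C tool: a text header "vocab_size dim", then for each word the word bytes, a space, dim raw little-endian float32 values, and a newline. The helper names write_word2vec_binary and read_word2vec_binary and the sample data are hypothetical, and none of this is torchtext code.

```python
import io
import struct


def write_word2vec_binary(vectors):
    """Serialize {word: [floats]} using the assumed word2vec binary layout."""
    dim = len(next(iter(vectors.values())))
    buf = io.BytesIO()
    buf.write(f"{len(vectors)} {dim}\n".encode("utf-8"))  # text header
    for word, vec in vectors.items():
        buf.write(word.encode("utf-8") + b" ")
        buf.write(struct.pack(f"<{dim}f", *vec))  # raw float32 bytes, not text
        buf.write(b"\n")
    return buf.getvalue()


def read_word2vec_binary(data):
    """Read the assumed layout by byte counts instead of splitting lines."""
    stream = io.BytesIO(data)
    vocab_size, dim = map(int, stream.readline().split())
    result = {}
    for _ in range(vocab_size):
        chars = bytearray()
        while True:
            ch = stream.read(1)
            if ch == b"":
                raise EOFError("unexpected end of data while reading a word")
            if ch == b" ":
                break
            if ch != b"\n":  # skip the newline left over from the previous entry
                chars += ch
        raw = stream.read(4 * dim)  # each float32 occupies 4 bytes
        result[chars.decode("utf-8")] = list(struct.unpack(f"<{dim}f", raw))
    return result


# Hypothetical sample data; the values are exactly representable as float32.
sample = {"</s>": [1.0, 0.5, -2.0], "in": [0.25, 4.0, 8.0]}
blob = write_word2vec_binary(sample)

# The text-style split: the vector arrives as one chunk of raw bytes,
# which float() cannot parse.
first_line = io.BytesIO(blob).readlines()[1]
naive_entries = first_line.rstrip().split(b" ")

# Reading by byte counts recovers the original values.
parsed = read_word2vec_binary(blob)
```

If this layout assumption holds for GoogleNews-vectors-negative300.bin, the number of pieces a split returns depends on which bytes in the float data happen to equal a space or a newline, which would explain an unexpected count such as 3.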

I propose we first decode each line and then split by " ". What do you think? Thanks!
