stanfordnlp / GloVe

Software in C and data files for the popular GloVe model for distributed word representations, a.k.a. word vectors or embeddings
Apache License 2.0

What is the meaning of '.' in the GloVe pretrained vectors? #203

Closed BodyCSoulN closed 2 years ago

BodyCSoulN commented 2 years ago

Hi, thanks for your great work! I have downloaded glove.840B.300d.txt and found that there are some '.' in the vector. What do they mean? Zero? [image]

AngledLuffa commented 2 years ago

You mean in the text or in the numbers? What's an example?

BodyCSoulN commented 2 years ago

> You mean in the text or in the numbers? What's an example?

As you can see in the figure, there are two '.' in line 52344. What do they mean?

AngledLuffa commented 2 years ago

Ah... is it possible this is a tokenization issue? Each line should have 301 entries, I believe: 1 word and 300 numbers. In this case, do you now have multiple "word" entries followed by 300 numbers?
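The layout described here (one word followed by 300 numbers) can be checked with a minimal Python sketch. The function name `parse_glove_line` and its `dim` parameter are hypothetical, and the sketch assumes that fields are separated by single spaces and that the last `dim` fields are always the numbers.

```python
def parse_glove_line(line, dim=300):
    """Split one line of a GloVe text file into (token, vector).

    Hypothetical helper: it assumes the last `dim` space-separated
    fields are the numbers and everything before them is the token.
    """
    # Split from the right at most `dim` times, so any spaces inside
    # the token itself are left untouched in parts[0].
    parts = line.rstrip("\n").rsplit(" ", dim)
    if len(parts) != dim + 1:
        raise ValueError(f"expected {dim + 1} fields, got {len(parts)}")
    token, values = parts[0], parts[1:]
    return token, [float(v) for v in values]
```

Splitting from the right keeps a token that itself contains spaces in one piece, whereas a plain `line.split()` would scatter it across several fields and shift the numbers.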

BodyCSoulN commented 2 years ago

Oh, I am sorry. I left out the previous word and now one line represents a 300-dimensional vector. My question is: why does a '.' appear where a number should appear?

AngledLuffa commented 2 years ago

No idea, honestly. If it's now lined up correctly with 300 numbers at the end of each line, you're saying one of them is just "." instead of a number?

BodyCSoulN commented 2 years ago

I am so sorry, bro. It is my fault. I found that this line has 303 fields, and it starts with three '.'. [image]

Thanks a lot for your patient answer!

So what is the meaning of the three '.' in the picture?
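One reading that fits the numbers reported here (303 fields, three leading '.') is that the word on this line is itself `. . .`, an ellipsis written with spaces, followed by the usual 300 numbers. The thread does not confirm this. A rough Python sketch for listing such lines follows; the name `find_multi_field_tokens` is hypothetical, and it assumes single-space separators.

```python
def find_multi_field_tokens(lines, dim=300):
    """Yield (line_number, token) for lines with more than dim + 1 fields.

    Hypothetical helper: it assumes the last `dim` fields are numbers,
    so any extra leading fields are treated as parts of one token.
    """
    for number, line in enumerate(lines, start=1):
        fields = line.rstrip("\n").split(" ")
        if len(fields) > dim + 1:
            # Rejoin the leading fields to recover the original token.
            yield number, " ".join(fields[:-dim])
```

Passing an open file handle for the downloaded vectors as `lines` would print every line number and token of this kind, which shows whether line 52344 is the only such case.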

AngledLuffa commented 2 years ago

Thanks for following up!