Closed: andreiamatuni closed this issue 7 years ago
In the code snippet you have pasted, W is not being updated. Rather, vec_result
is being set to the sum of its word vectors. Would you agree?
vec_result is a reference to an element within W. When you update vec_result by summing the vectors in place, you're updating the first vector within W (the row at index vocab[term] for the first term), since that's what the reference points to. If vec_result were assigned as a copy, then what you said would be true, but as is, it's a basic reference assignment.
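To make the reference-versus-copy point concrete, here is a minimal sketch. The two-row W, the vocab dict, and both helper functions are made up for illustration and are not the code from the repository; the only thing carried over from the thread is the pattern of taking a row of W and then adding to it in place.

```python
import numpy as np

# Toy stand-ins for the real embedding matrix and vocabulary (hypothetical values).
W = np.array([[1.0, 0.0],
              [0.0, 1.0]])
vocab = {"car": 0, "stroller": 1}


def sum_vectors_aliasing(W, vocab, terms):
    """Accumulate into a view of W, so the first term's row gets overwritten."""
    vec_result = W[vocab[terms[0]], :]      # basic indexing returns a view, not a copy
    for term in terms[1:]:
        vec_result += W[vocab[term], :]     # in-place add writes through to W
    return vec_result


def sum_vectors_copying(W, vocab, terms):
    """Accumulate into an independent copy, leaving W untouched."""
    vec_result = W[vocab[terms[0]], :].copy()
    for term in terms[1:]:
        vec_result += W[vocab[term], :]
    return vec_result


# Run each variant on its own copy of W so the two can be compared afterwards.
W_safe = W.copy()
safe_sum = sum_vectors_copying(W_safe, vocab, ["car", "stroller"])

W_aliased = W.copy()
aliased_sum = sum_vectors_aliasing(W_aliased, vocab, ["car", "stroller"])

print("aliased result shares memory with W:", np.shares_memory(aliased_sum, W_aliased))
print("'car' row after copying variant: ", W_safe[vocab["car"]])
print("'car' row after aliasing variant:", W_aliased[vocab["car"]])
```

Both variants return the same sum, but only the aliasing one leaves the "car" row of its matrix changed.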
Ah, good catch! This was a user contribution, and unfortunately we didn't catch that bug. Fix should be here: https://github.com/stanfordnlp/GloVe/commit/c0d838f86c4d14c7ea9af647ca869291058ba8c0
Does that work?
yup, thanks!
Are the vectors in W supposed to be updated depending on inputs passed to the distance function?
For example, if I pass in "car", the cosine similarity with "stroller" is 0.1729. If I then pass in "car stroller", and then just "car" again, then the cosine similarity with "stroller" is now 0.765.
Running the Python code through a debugger, it looks like the vectors for "car" and "stroller" in W are updated during the distance function call when the input contains multiple words. Is this supposed to be happening?
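For anyone who wants to check for this kind of mutation without stepping through a debugger, here is a rough sketch. The toy W, the vocab dict, and query_vector are hypothetical stand-ins that only mimic an in-place accumulation over the input words; they are not the actual distance function. The sketch compares W against a snapshot and then marks the array read-only so that any in-place write raises an error.

```python
import numpy as np


def cosine(u, v):
    """Cosine similarity between two 1-D vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


# Toy embedding matrix and vocabulary (made-up values, for illustration only).
W = np.array([[1.0, 0.0],    # "car"
              [0.0, 1.0]])   # "stroller"
vocab = {"car": 0, "stroller": 1}


def query_vector(W, vocab, input_term):
    """Hypothetical stand-in: sums the word vectors of a multi-word query in place."""
    for idx, term in enumerate(input_term.split(' ')):
        if idx == 0:
            vec_result = W[vocab[term], :]   # view into W
        else:
            vec_result += W[vocab[term], :]  # modifies the first term's row of W
    return vec_result


# Check 1: compare W against a snapshot taken before the multi-word query.
snapshot = W.copy()
before = cosine(W[vocab["car"]], W[vocab["stroller"]])
query_vector(W, vocab, "car stroller")
after = cosine(W[vocab["car"]], W[vocab["stroller"]])
mutated = not np.array_equal(snapshot, W)
print("similarity before:", before, "after:", after, "W mutated:", mutated)

# Check 2: mark the matrix read-only so an in-place write fails loudly.
frozen = snapshot.copy()
frozen.setflags(write=False)
try:
    query_vector(frozen, vocab, "car stroller")
    raised = False
except ValueError:
    raised = True
print("in-place write on read-only W raised ValueError:", raised)
```

The read-only flag is inherited by views, so the second check fails at the exact line that writes into W, which can be easier to act on than a before/after comparison.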
from eval/python/distance.py:
When you initialize the first vec_result, and then add to it in the else: branch, you're updating the first vector in the W ndarray itself, since vec_result is a reference (not a copy).