artetxem / vecmap

A framework to learn cross-lingual word embedding mappings
GNU General Public License v3.0

Projecting into English space #1

Closed · ghost closed this issue 7 years ago

ghost commented 7 years ago

Hello, I am using this framework to test my model. Initially I projected into the "it" space and got 38.75% word translation accuracy, but when I project into the "en" space instead, the accuracy increases to 40.93%. So I just want to ask: did you try projecting into the "en" space? If yes, what was the accuracy, and if not, why?

artetxem commented 7 years ago

Under the orthogonality constraint, which was the focus of our work, mapping the target language into the source language is completely equivalent to mapping the source language into the target language. This does not apply to unconstrained mappings, for which we followed the standard practice in the literature and mapped the source language into the target language. We did not try inverting this direction, but your positive results suggest that it could be an interesting direction to explore.
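
For illustration only (this is not part of the vecmap code, just a toy NumPy sketch with random matrices standing in for the embeddings), the following shows why the direction can matter in the unconstrained case: the least-squares mapping learned from X to Z is not, in general, the inverse of the one learned from Z to X.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 50))  # toy "source" embeddings
Z = rng.standard_normal((1000, 50))  # toy "target" embeddings

# Unconstrained least-squares mappings learned in each direction
W_xz, *_ = np.linalg.lstsq(X, Z, rcond=None)  # maps X into Z's space
W_zx, *_ = np.linalg.lstsq(Z, X, rcond=None)  # maps Z into X's space

# In general W_zx is not the inverse of W_xz, so mapping source->target
# and target->source produce genuinely different bilingual spaces.
print(np.allclose(W_xz @ W_zx, np.eye(50)))  # False for generic data
```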

ghost commented 7 years ago

I think I need to think about the statement "mapping the target language into the source language is completely equivalent to mapping the source language into the target language" once again. Anyway, thanks for the answer. UPD: these are my exact results:

ORIGINAL EMBEDDINGS
  - EN AN  |  Coverage: 64.98%  Accuracy: 76.66% (sem: 79.66%, syn: 75.36%)
  - IT AN  |  Coverage: 49.69%  Accuracy: 58.23% (sem: 56.98%, syn: 58.76%)
--------------------------------------------------------------------------------
MY MODEL
  - EN-IT  |  Coverage: 100.00%  Accuracy: 40.93%
  - EN AN  |  Coverage: 64.98%  Accuracy: 76.59% (sem: 79.63%, syn: 75.27%)
  - IT AN  |  Coverage: 49.69%  Accuracy: 49.87% (sem: 48.93%, syn: 50.27%)

artetxem commented 7 years ago

Note that the statement in question only applies to orthogonal mappings. In that case, if W is the optimal mapping from X to Z, one can show that W^T must be the optimal mapping from Z to X, so the bilingual embeddings A = [XW; Z] and B = [X; ZW^T] are completely equivalent up to an orthogonal transformation (A = BW). Given that orthogonal transformations preserve the inner product and, therefore, the cosine similarity, the Euclidean norm and the Euclidean distance, A and B will behave in exactly the same manner for us, so we can conclude that the learning direction is irrelevant. Even if you don't follow the maths, this is something that you can easily verify empirically.
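
To make that empirical check concrete, here is a minimal NumPy sketch (again with toy random X and Z, not the actual vecmap pipeline): it computes an orthogonal mapping via the Procrustes solution and confirms that A = [XW; Z] and B = [X; ZW^T] coincide up to the orthogonal transformation W, with identical inner products.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 50))  # toy "source" embeddings
Z = rng.standard_normal((1000, 50))  # toy "target" embeddings

# Orthogonal Procrustes solution: W minimizes ||XW - Z|| with W orthogonal
U, _, Vt = np.linalg.svd(X.T @ Z)
W = U @ Vt

A = np.vstack([X @ W, Z])    # map the source language into the target space
B = np.vstack([X, Z @ W.T])  # map the target language into the source space

# A = BW, and since W is orthogonal, all inner products (hence cosine
# similarities, norms and distances) are identical in the two spaces.
print(np.allclose(A, B @ W))          # True
print(np.allclose(A @ A.T, B @ B.T))  # True
```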

ghost commented 7 years ago

Yeah got it, thanks.