plasticityai / magnitude

A fast, efficient universal vector embedding utility package.
MIT License

Running most_similar on concatenated model #35

Closed: prem6667 closed this issue 5 years ago

prem6667 commented 6 years ago

Is there any way to run most_similar() on two concatenated models?

AjayP13 commented 6 years ago

Hi, this is not currently possible, but you could, of course, run most_similar on each individual model and then combine the results somehow.
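One possible way to combine the per-model results is sketched below. It is not part of Magnitude: `combine_most_similar` is a hypothetical helper, and it assumes each model's most_similar call yields a list of (key, similarity) pairs. Averaging the two scores and keeping only keys found in both lists are assumptions made for illustration.

```python
def combine_most_similar(results_a, results_b, topn=10):
    """Merge two lists of (key, similarity) pairs.

    Keeps only keys present in both lists and averages their scores
    (an illustrative choice, not library behavior).
    """
    scores_a = dict(results_a)
    scores_b = dict(results_b)
    shared_keys = scores_a.keys() & scores_b.keys()
    merged = [(key, (scores_a[key] + scores_b[key]) / 2.0) for key in shared_keys]
    merged.sort(key=lambda pair: pair[1], reverse=True)
    return merged[:topn]


# Made-up result lists standing in for the output of each model's most_similar.
results_a = [("dog", 0.9), ("kitten", 0.8), ("car", 0.1)]
results_b = [("kitten", 0.7), ("dog", 0.5), ("tree", 0.2)]

combined = combine_most_similar(results_a, results_b)
```

A key that appears in only one model's top results is dropped here, so in practice you would probably request a larger result list from each model than the number of combined results you need.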

I've thought about adding this, but I'm not sure how the operation should work: running most_similar across two concatenated models is undefined, and the right behavior depends on what your models represent. For example, how should each concatenated model be weighted, or should they be weighted equally? When searching for most_similar, should it search only the vectors whose keys appear in both concatenated models (an inner join), or the vectors of the cartesian product of all the keys from both concatenated models?
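To make the weighting and key-matching questions concrete, here is a minimal sketch that uses plain dictionaries of key to vector as stand-ins for the two models. It does not use the Magnitude API. The function names, the toy vectors, and the choices to normalize each model's vector before weighting and to search only keys present in both models (the inner-join option) are all assumptions for illustration.

```python
import math


def unit(vector):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def weighted_concat(key, model_a, model_b, weight_a, weight_b):
    """Concatenate the key's unit vectors from both models, scaled by per-model weights."""
    part_a = [weight_a * x for x in unit(model_a[key])]
    part_b = [weight_b * x for x in unit(model_b[key])]
    return unit(part_a + part_b)


def most_similar_concat(query, model_a, model_b, weight_a=1.0, weight_b=1.0, topn=3):
    """Rank keys found in BOTH models by cosine similarity of the concatenated vectors."""
    shared_keys = (model_a.keys() & model_b.keys()) - {query}
    q = weighted_concat(query, model_a, model_b, weight_a, weight_b)
    scored = []
    for key in shared_keys:
        v = weighted_concat(key, model_a, model_b, weight_a, weight_b)
        scored.append((key, sum(x * y for x, y in zip(q, v))))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:topn]


# Toy models: "tree" exists only in model_b, so the inner join excludes it.
model_a = {"cat": [1.0, 0.0], "dog": [0.8, 0.6], "car": [0.0, 1.0]}
model_b = {"cat": [1.0, 0.0], "dog": [0.0, 1.0], "car": [1.0, 0.0], "tree": [0.0, 1.0]}

equal = most_similar_concat("cat", model_a, model_b)
weighted = most_similar_concat("cat", model_a, model_b, weight_a=2.0, weight_b=1.0)
```

With these toy vectors, equal weights rank "car" first, while doubling the weight of the first model ranks "dog" first, which is why the weighting has to be decided before the operation is well defined.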

Could you tell me a little bit more about your use case? That will inform how I implement this, if I do so in the future. What does each model you are concatenating represent, and why do you want most_similar across them?