facebookresearch / StarSpace

Learning embeddings for classification, retrieval and ranking.
MIT License

Option for using vector summation instead of averaging for pooling #123

Closed · matthias-samwald closed this issue 6 years ago

matthias-samwald commented 6 years ago

I wonder if it would be easy to add an option for using summation instead of averaging of token vectors in the pooling operation? This might be advantageous for some use cases.

(For example, I am thinking of knowledge graph / ontology embeddings, where more complex concepts/statements are built from simpler ones. With summation, the many complex concepts/statements would be projected outward in a tree-like structure growing away from the origin, while with averaging they would have to be wedged in between the simpler ones.)

jaseweston commented 6 years ago

Yes, we can do this by simply changing the normalization scheme. Also, we can now specify weights per feature; how do those interact with the normalization, @ledw?

ledw commented 6 years ago

@matthias-samwald @jaseweston Yes, we can simply add that. Per-feature weights work with either normalization scheme: we sum up weight * feature embedding, then normalize the result either by averaging or by L2 normalization (for cosine similarity).
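A minimal sketch of that weighted pooling, in illustrative Python rather than StarSpace's actual C++ (the function name `pool` and its arguments are made up for this example):

```python
import numpy as np

def pool(embeddings, weights, mode="average"):
    # Weighted sum: sum_i weight_i * embedding_i
    summed = sum(w * e for w, e in zip(weights, embeddings))
    if mode == "average":
        return summed / len(embeddings)         # divide by the number of features
    if mode == "l2":
        return summed / np.linalg.norm(summed)  # unit length, for cosine similarity
    raise ValueError(f"unknown mode: {mode}")

embs = [np.array([1.0, 2.0]), np.array([3.0, 4.0])]
print(pool(embs, weights=[1.0, 0.5], mode="l2"))
```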

ledw commented 6 years ago

@matthias-samwald you can use summation for pooling by specifying -p 0. The parameter p controls the normalization as follows: sum(embeddings) / size(embeddings)^p

When p=0, this is equivalent to the plain sum of the embeddings (and p=1 recovers the average). Closing the task for now; please let me know if that works for you.
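To make the effect of p concrete, here is a small illustrative Python sketch (the helper name `pool_p` is made up; StarSpace itself is C++):

```python
import numpy as np

def pool_p(embeddings, p):
    # Pool token embeddings as sum(embeddings) / size(embeddings)^p.
    summed = np.sum(embeddings, axis=0)
    return summed / (len(embeddings) ** p)

tokens = np.random.rand(5, 10)  # 5 token vectors of dimension 10
assert np.allclose(pool_p(tokens, 0), tokens.sum(axis=0))   # p=0: plain summation
assert np.allclose(pool_p(tokens, 1), tokens.mean(axis=0))  # p=1: averaging
```

In practice, switching pooling to summation should then just be a matter of adding -p 0 to the training command, e.g. ./starspace train -trainFile input.txt -model mymodel -p 0 (assuming the usual train arguments; file and model names here are placeholders).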