oborchers / Fast_Sentence_Embeddings

Compute Sentence Embeddings Fast!
GNU General Public License v3.0

Hierarchical pooling #39

Closed (lambdaofgod closed this issue 2 years ago)

lambdaofgod commented 3 years ago

Could you say something more about hierarchical pooling? I am interested in this feature, but I'm not sure what you mean. I can try to implement this if given some guidance.

oborchers commented 3 years ago

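Hi @lambdaofgod! Thanks for your interest! Hierarchical pooling means averaging over overlapping subspans (sliding windows) of the original tokens and then combining those window averages into the sentence vector. For example, with a window of three tokens: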

Hello - how - are - you - doing
avg1: (hello + how + are) / 3
avg2: (how + are + you) / 3
vec: (avg1 + avg2) / 2
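The windowed averaging described above can be sketched in a few lines of NumPy. This is a minimal illustration only, not the code from the develop branch: the function name `hierarchical_average`, the `window` parameter, the stride of one token, and the toy 2-dimensional vectors are all assumptions made for this example. Note that sliding a window of three over all five tokens yields a third window (are, you, doing) that the example above does not list.

```python
import numpy as np


def hierarchical_average(vectors: np.ndarray, window: int = 3) -> np.ndarray:
    """Average each sliding window of token vectors, then average the window means.

    Hypothetical helper for illustration; not part of the fse API.
    """
    n_tokens = vectors.shape[0]
    if n_tokens == 0:
        raise ValueError("need at least one token vector")
    # Assumption: a sentence shorter than the window is treated as one window.
    window = min(window, n_tokens)
    # One mean vector per window, moving one token at a time.
    window_means = np.stack([
        vectors[start:start + window].mean(axis=0)
        for start in range(n_tokens - window + 1)
    ])
    # Combine the window means into a single sentence vector.
    return window_means.mean(axis=0)


# Toy vectors standing in for: hello, how, are, you, doing
vectors = np.array([
    [1.0, 0.0],
    [0.0, 1.0],
    [1.0, 1.0],
    [2.0, 0.0],
    [0.0, 2.0],
])

# All five tokens: three windows of size three.
sentence_vec = hierarchical_average(vectors, window=3)

# First four tokens only: exactly the two windows avg1 and avg2 from the example.
four_token_vec = hierarchical_average(vectors[:4], window=3)
```

Restricting the input to the first four tokens reproduces the two-window calculation from the example: `avg1` covers (hello, how, are), `avg2` covers (how, are, you), and the result is their mean.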

I added this feature a while ago on the develop branch, but I have to admit that the branch is somewhat abandoned. See the tests and the implementation:
https://github.com/oborchers/Fast_Sentence_Embeddings/blob/develop/fse/test/test_pooling.py
https://github.com/oborchers/Fast_Sentence_Embeddings/blob/develop/fse/models/pooling.py

The biggest problem was getting a good inheritance structure for the Cython files, so that there is only one core routine and each model can swap in its own internal composition method.
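One way to picture that structure is a base class that owns the single core loop, with subclasses overriding only the composition step. The sketch below is plain Python rather than Cython and does not reflect how the fse sources are organized; the class names and the `_compose` hook are hypothetical and exist only for illustration.

```python
import numpy as np


class BasePooling:
    """Hypothetical base class that owns the one core routine."""

    def embed(self, sentences):
        # Core routine: loop over sentences once and delegate the
        # per-sentence composition to the subclass hook.
        return np.stack([
            self._compose(np.asarray(sentence, dtype=float))
            for sentence in sentences
        ])

    def _compose(self, vectors):
        raise NotImplementedError("subclasses define the composition method")


class AveragePooling(BasePooling):
    def _compose(self, vectors):
        # Plain average over all token vectors.
        return vectors.mean(axis=0)


class MaxPooling(BasePooling):
    def _compose(self, vectors):
        # Element-wise maximum over all token vectors.
        return vectors.max(axis=0)


# Two toy sentences of different lengths, with 2-dimensional token vectors.
sentences = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[2.0, 2.0]],
]
avg_embeddings = AveragePooling().embed(sentences)
max_embeddings = MaxPooling().embed(sentences)
```

In Cython the equivalent is harder to arrange, because the composition step would typically need to be a typed, compiled call rather than a Python method override.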

oborchers commented 2 years ago

Please see discussion #49