Right now, the pooling method pools the embedding tokens to a fixed ratio.
We should allow users to set a threshold on the cosine similarity up to which the tokens can be merged.
This should allow us to adapt more to the data by compressing more sequences with a lot of redundant information and fewer the ones that do not.
Right now, the pooling method pools the embedding tokens to a fixed ratio. We should allow users to set a threshold on the cosine similarity up to which the tokens can be merged. This should allow us to adapt more to the data by compressing more sequences with a lot of redundant information and fewer the ones that do not.