finalfusion / finalfrontier

Context-sensitive word embeddings with subwords. In Rust.
https://finalfusion.github.io/finalfrontier

Use discounted frequencies for negative sampling #178

Closed danieldk closed 2 years ago

danieldk commented 2 years ago

So far, we have used a Zipf distribution for negative sampling. However, sampling from the empirical distribution of (discounted) word frequencies gives better results in practice. This also brings the implementation closer to how word2vec and fastText sample negatives.

I have never found the approach of sampling from an item table very elegant (it takes memory and is a source of cache misses). We had a more elegant approach in `WeightedRangeGenerator`; however, it turned out to be slow in practice due to its use of binary search.
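To make the trade-off concrete, here is a minimal sketch of both strategies. It is not the code from this PR or from finalfrontier: all function names are hypothetical, and it assumes that "discounted" means raising each count to a power below 1 (word2vec uses 0.75). The uniform value `u` in `[0, 1)` is passed in by the caller so that the sketch does not depend on a particular random number generator.

```rust
/// Discount raw counts by raising them to `power` (assumed: 0 < power <= 1).
fn discounted_weights(counts: &[u64], power: f64) -> Vec<f64> {
    counts.iter().map(|&c| (c as f64).powf(power)).collect()
}

/// Table approach: each word index fills a share of the table that is
/// proportional to its discounted weight. Costs O(table_size) memory.
fn build_table(counts: &[u64], power: f64, table_size: usize) -> Vec<usize> {
    assert!(!counts.is_empty(), "need at least one word");
    let weights = discounted_weights(counts, power);
    let total: f64 = weights.iter().sum();
    let mut table = Vec::with_capacity(table_size);
    let mut cumulative = 0.0;
    for (idx, w) in weights.iter().enumerate() {
        cumulative += w / total;
        // Fill slots until this word's cumulative share is used up.
        let upto = ((cumulative * table_size as f64).round() as usize).min(table_size);
        while table.len() < upto {
            table.push(idx);
        }
    }
    // Guard against floating-point rounding leaving the table short.
    table.resize(table_size, counts.len() - 1);
    table
}

/// Sampling from the table is a single (cache-unfriendly) random lookup.
fn sample_table(table: &[usize], u: f64) -> usize {
    let slot = ((u * table.len() as f64) as usize).min(table.len() - 1);
    table[slot]
}

/// Cumulative approach: O(vocabulary) memory, but each sample needs a
/// binary search over the running sums.
fn build_cumulative(counts: &[u64], power: f64) -> Vec<f64> {
    let mut acc = 0.0;
    discounted_weights(counts, power)
        .into_iter()
        .map(|w| {
            acc += w;
            acc
        })
        .collect()
}

/// Return the first index whose running sum exceeds `u * total`.
fn sample_cumulative(cumulative: &[f64], u: f64) -> usize {
    let total = *cumulative.last().expect("need at least one word");
    let target = u * total;
    cumulative
        .partition_point(|&c| c <= target)
        .min(cumulative.len() - 1)
}
```

Both samplers draw from the same distribution; the table spends memory to get constant-time lookups, while the cumulative version stays compact but pays a logarithmic search per sample.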

sebpuetz commented 2 years ago

I'll try to take a look tomorrow!

danieldk commented 2 years ago

> We're keeping the zipf generator around, just not available in config?

I was thinking of removing it in a separate PR. We do have another unused generator, but the zipf generator adds dependencies, so perhaps we should just remove it?