I want to optimise the performance of a PageSpace model. I've tried tuning StarSpace to sample more negatives (doubled negSearchLimit and maxNegSamples) and observed an improvement in hits@k, as expected. However, the mean rank worsened considerably.
What is the intuition behind this? Given a reasonably large dataset, shouldn't sampling more negatives produce a more informative set of embeddings and improve the mean rank as well?
Baseline params:
-negSearchLimit 50 -maxNegSamples 10
Tuned params:
-negSearchLimit 100 -maxNegSamples 20
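For intuition on how the two metrics can move in opposite directions, here is a minimal Python sketch with hypothetical ranks (made up for illustration, not from my runs): hits@k only asks whether the true item lands in the top k, while mean rank averages over every query, so a handful of catastrophically ranked items can drag the mean far up even while more queries make the top k.

```python
# Minimal sketch with hypothetical rank lists: hits@k and mean rank
# can diverge. Ranks are the 1-based position of the true item.

def hits_at_k(ranks, k=10):
    """Fraction of queries whose true item lands in the top k."""
    return sum(r <= k for r in ranks) / len(ranks)

def mean_rank(ranks):
    """Average 1-based rank of the true item over all queries."""
    return sum(ranks) / len(ranks)

# Baseline-style ranks: nothing great, but no query ranks terribly.
baseline = [3, 12, 7, 15, 4, 9, 11, 6, 14, 8]

# Tuned-style ranks: more queries reach the top 10, but a couple of
# queries now rank catastrophically, dominating the mean.
tuned = [1, 2, 4, 3, 2, 5, 6, 4, 950, 880]

for name, ranks in [("baseline", baseline), ("tuned", tuned)]:
    print(f"{name}: hits@10 = {hits_at_k(ranks):.2f}, "
          f"mean rank = {mean_rank(ranks):.1f}")

# baseline: hits@10 = 0.60, mean rank = 8.9
# tuned:    hits@10 = 0.80, mean rank = 185.7
```

So a result like mine (better hits@k, worse mean rank) is at least arithmetically consistent: it would only take a small fraction of queries being pushed far down the ranking to produce it.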
Related issue: https://github.com/facebookresearch/StarSpace/issues/204