I want to optimise the performance of a PageSpace model. I've tried tuning StarSpace to sample more negatives (doubled negSearchLimit and maxNegSamples) and observed an improvement in hits@k, as expected. However, the mean rank worsened considerably.
What is the intuition behind this? Given a reasonably large dataset, shouldn't sampling more negatives produce a more informative set of embeddings and improve the mean rank as well?
Baseline params:
-negSearchLimit 50 -maxNegSamples 10
Tuned params:
-negSearchLimit 100 -maxNegSamples 20
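For intuition on how the two metrics can move in opposite directions, here is a minimal Python sketch with hypothetical ranks (made up for illustration, not from my runs): hits@k only asks whether the true item lands in the top k, while mean rank averages over every query, so a handful of catastrophically ranked items can drag the mean far up even while more queries make the top k.

```python
# Minimal sketch with hypothetical rank lists: hits@k and mean rank
# can diverge. Ranks are the 1-based position of the true item.

def hits_at_k(ranks, k=10):
    """Fraction of queries whose true item lands in the top k."""
    return sum(r <= k for r in ranks) / len(ranks)

def mean_rank(ranks):
    """Average 1-based rank of the true item over all queries."""
    return sum(ranks) / len(ranks)

# Baseline-style ranks: nothing great, but no query ranks terribly.
baseline = [3, 12, 7, 15, 4, 9, 11, 6, 14, 8]

# Tuned-style ranks: more queries reach the top 10, but a couple of
# queries now rank catastrophically, dominating the mean.
tuned = [1, 2, 4, 3, 2, 5, 6, 4, 950, 880]

for name, ranks in [("baseline", baseline), ("tuned", tuned)]:
    print(f"{name}: hits@10 = {hits_at_k(ranks):.2f}, "
          f"mean rank = {mean_rank(ranks):.1f}")

# baseline: hits@10 = 0.60, mean rank = 8.9
# tuned:    hits@10 = 0.80, mean rank = 185.7
```

So a result like mine (better hits@k, worse mean rank) is at least arithmetically consistent: it would only take a small fraction of queries being pushed far down the ranking to produce it.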
Related issue: https://github.com/facebookresearch/StarSpace/issues/204