facebookresearch / StarSpace

Learning embeddings for classification, retrieval and ranking.
MIT License

question on interpretation of negSearchLimit / maxNegSamples #204

Closed jwijffels closed 5 years ago

jwijffels commented 5 years ago

Hello, I'm writing some documentation for the R wrapper in a vignette before I try to upload it to CRAN in January. I'd like to write something about the arguments negSearchLimit and maxNegSamples. The docs indicate the following:

[image: knipsel]

If I look at the paper, do I understand correctly that in a multi-class classification setting, where I have a bunch of texts (each of which can have several labels), the positive entities for each text are the labels it was given, and the default is to sample 50 negatives from the remaining labels and keep only 10 of those 50? Or is there another interpretation of this L^batch (what is in this batch: one text or more than one?) and of maxNegSamples? Does maxNegSamples correspond to k in the screenshot of the paper, or is negSearchLimit the k in the screenshot of the paper?

ledw commented 5 years ago

Hi @jwijffels, thanks for working on the R wrapper. As for your question, this is an implementation detail that we did not include in the paper. negSearchLimit is the number of negatives we sample during each batch. Among the sampled candidates, some are 'real' negatives, meaning they make the loss greater than 0. maxNegSamples is a limit on the 'real' negatives: we update at most maxNegSamples 'real' negatives in each batch. In the screenshot, k corresponds to negSearchLimit.
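The selection described in this reply can be sketched in Python. This is only an illustration, not StarSpace's actual C++ code. The function names, the margin value of 0.05, sampling with replacement, and stopping as soon as maxNegSamples 'real' negatives are found are all assumptions made for the example. The defaults of 50 and 10 are the ones quoted earlier in the thread.

```python
import random


def sample_candidates(label_ids, positive_ids, neg_search_limit=50, rng=random):
    """Draw neg_search_limit candidate negatives from the non-positive labels.

    Sampling with replacement is an assumption of this sketch.
    """
    pool = [label for label in label_ids if label not in positive_ids]
    return [rng.choice(pool) for _ in range(neg_search_limit)]


def select_real_negatives(pos_sim, sampled_neg_sims, max_neg_samples=10, margin=0.05):
    """Keep at most max_neg_samples candidates whose hinge loss is > 0.

    pos_sim: similarity between the input and its positive label.
    sampled_neg_sims: similarities between the input and each sampled candidate.
    Returns a list of (candidate_index, loss) pairs for the 'real' negatives.
    """
    real = []
    for i, neg_sim in enumerate(sampled_neg_sims):
        if len(real) >= max_neg_samples:
            break  # assumed early stop once the cap is reached
        loss = max(0.0, margin - pos_sim + neg_sim)
        if loss > 0:
            real.append((i, loss))
    return real


if __name__ == "__main__":
    # Five sampled candidates; only those scoring close to or above the
    # positive similarity produce a loss and count as 'real' negatives.
    kept = select_real_negatives(0.5, [0.1, 0.6, 0.7, 0.2, 0.9], max_neg_samples=2)
    print([index for index, _ in kept])
```

In this reading, negSearchLimit bounds how many candidates are looked at, while maxNegSamples bounds how many of them actually contribute to the update.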

jwijffels commented 5 years ago

Thanks for the clarification that negSearchLimit is k in the paper. To be sure on my interpretation of the answer on negSearchLimit can you

thisisandreeeee commented 5 years ago

@ledw Related to the interpretation of negSearchLimit and maxNegSamples, with regards to tuning these parameters.

Could you provide some intuition on how one might go about tuning these parameters, and the expected effects from tuning?

Based on my understanding of negSearchLimit, you're basically considering a larger number of negative samples when deciding how to optimise within a batch. With too few negative samples, it will be difficult for StarSpace to differentiate between the positive and negative samples. With too many negative samples, there's too much noise. Is this understanding accurate?

I don't really understand what maxNegSamples does though, nor do I understand what makes a negative "real".

jwijffels commented 5 years ago

Ok I did some inspection of the code myself

tolliam commented 1 year ago

Just following up on this: could I ask whether the negSearchLimit parameter is the number of negative samples per document, or for all documents being trained on? I.e., will there be 50 negative docs with wrong labels, or 50 x training set size negative docs in total? I'm using ruimtehol and don't understand the above and the references to "batches" in this thread. Many thanks.

jwijffels commented 1 year ago

It's neither per document nor for all documents; it is per mini-batch in StarSpace STARSPACE-2018-2. A mini-batch is a sample of text (words, documents or labels, depending on the training mode) for which the negatives will always be the same. In STARSPACE-2017-2 (which the R wrapper ruimtehol is using), the concept of a mini-batch is not implemented, and hence it is per sample of text, which can be words / documents / labels depending on the training mode.
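To put rough numbers on this answer, here is a small Python sketch that counts how many candidate negatives would be drawn in one pass over the data. It assumes one draw of negSearchLimit candidates per mini-batch, shared by all examples in that batch, and treats the version without mini-batches as a batch size of 1. The function name and the example sizes are hypothetical, and the counts are upper bounds if sampling stops early.

```python
import math


def candidate_draws_per_epoch(n_examples, neg_search_limit=50, batch_size=1):
    """Upper bound on candidate negatives drawn in one pass over the data.

    Assumes neg_search_limit candidates are drawn once per mini-batch.
    batch_size=1 models the case where mini-batches are not implemented.
    """
    n_batches = math.ceil(n_examples / batch_size)
    return n_batches * neg_search_limit


if __name__ == "__main__":
    # 1000 training examples, no mini-batches: one draw per example.
    print(candidate_draws_per_epoch(1000))
    # Same data with mini-batches of 5: one shared draw per batch.
    print(candidate_draws_per_epoch(1000, batch_size=5))
```

Under these assumptions the total is neither 50 nor fixed by the training set size alone: it scales with the number of samples or batches processed.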

tolliam commented 1 year ago

Ok, many thanks. I'm using embed_tagspace, so to check I'm understanding correctly: the corpus of documents and labels gets split into samples of batchSize? And then negSearchLimit negatives will be taken, but maxNegSamples will in practice cap that?

jwijffels commented 1 year ago

If this is a question related to ruimtehol, please put the question there.