facebookresearch / StarSpace

Learning embeddings for classification, retrieval and ranking.
MIT License

Understanding the output #200

Closed. helgasvala closed this issue 5 years ago

helgasvala commented 5 years ago

Hey! I have a really basic question that I can't seem to find an answer to. In the prediction files, I want to understand what the output means, like this pagespace (recomm_user_artists.sh) example from the README.

As an example, I get:

Example 27:
LHS: A53 A56 A64 A65 A70 A80 A88 A154 A199 A203 A207 A210 A220 A228 A424 A432 A511 A709 A735 A969 A982 A1073 A1099 A1106 A1394 A1396 A1397 A1398 A1408 A1410$
RHS: A84
Predictions:
(--) [0.602615] A1396
(--) [0.586171] A1408
(--) [0.566579] A1397
(--) [0.549347] A1398
(--) [0.505462] A70
(--) [0.504954] A1394
(--) [0.503935] A80
(--) [0.488138] A1409
(--) [0.483712] A56
(--) [0.480524] A15079

Sometimes (in other examples) you get (++) in front. My question is, what does the --/++ mean and what do the numbers mean? Percentage? Which kind?

Thank you so much.

ledw commented 5 years ago

@helgasvala Hi, (++) means that the prediction is a correct prediction of the RHS, and (--) means that it is not. The numbers are the predicted scores for each item (either the cosine similarity or the dot product between LHS and RHS, depending on which similarity measure you use). We'll update that in the README.
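To make the scoring concrete, here is a minimal Python sketch of how such a ranked list could be produced. It is not StarSpace's actual code: the function names, the toy 2-dimensional embeddings, and the assumption that the LHS has already been reduced to a single vector are all hypothetical and chosen only for illustration.

```python
import math

def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    norm = math.sqrt(dot(u, u)) * math.sqrt(dot(v, v))
    return dot(u, v) / norm if norm else 0.0

def rank_candidates(lhs_vec, candidates, true_rhs, similarity=cosine, k=10):
    """Score every candidate against the LHS vector and keep the top k.

    Each result is (marker, score, label), where marker is "++" when the
    label equals the true RHS and "--" otherwise.
    """
    scored = sorted(
        ((similarity(lhs_vec, vec), label) for label, vec in candidates.items()),
        reverse=True,
    )
    return [("++" if label == true_rhs else "--", score, label)
            for score, label in scored[:k]]

# Hypothetical toy embeddings, not taken from any trained model.
lhs = [1.0, 0.0]
candidates = {"A84": [0.0, 1.0], "A1396": [1.0, 0.0], "A70": [1.0, 1.0]}

top2 = rank_candidates(lhs, candidates, true_rhs="A84", k=2)
for marker, score, label in top2:
    print(f"({marker}) [{score:.6f}] {label}")
```

In this toy setup the true RHS (A84) scores lowest, so with k=2 every printed line carries (--), which is the same situation as in Example 27. The scores are similarities, not percentages: cosine values lie in [-1, 1], and dot products are unbounded.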

helgasvala commented 5 years ago

Thanks! I had figured that out with the (++) and the (--) while doing topic classification, like here: https://towardsdatascience.com/learning-note-starspace-for-multi-label-text-classification-81de0e8fca53. But then, like in the sample above, all of the predictions (as far as I can see) have (--). So none of them are correct?

ledw commented 5 years ago

@helgasvala Yes, that means none of them are correct. It is meant to be an example script that demonstrates how to use PageSpace, so the parameters are not necessarily optimized.
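For checking this across a whole prediction file instead of by eye, a small parser can help. The sketch below assumes the line format shown in Example 27 and single-token labels; the regular expression and function name are hypothetical helpers, not part of StarSpace.

```python
import re

# Matches entries such as "(--) [0.602615] A1396" from the output above.
PRED = re.compile(r"\((\+\+|--)\)\s*\[([-+]?\d*\.?\d+(?:[eE][-+]?\d+)?)\]\s*(\S+)")

def parse_predictions(text):
    """Return a list of (is_correct, score, label) tuples found in text."""
    return [(marker == "++", float(score), label)
            for marker, score, label in PRED.findall(text)]

sample = ("Predictions: (--) [0.602615] A1396 (--) [0.586171] A1408 "
          "(--) [0.566579] A1397")

preds = parse_predictions(sample)
hit = any(is_correct for is_correct, _, _ in preds)
print(f"{len(preds)} predictions parsed, correct RHS among them: {hit}")
```

Averaging `hit` over all examples in the file gives the fraction of examples whose true RHS appears among the printed predictions.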