How does the docspace model predict the nth document based on clicked history? #238

nickyongzhang commented 5 years ago

Refer to #75 and #207

According to the definition of trainMode=1, it seems the docspace model does not use the first (n-1) documents to predict the nth document. Instead, it predicts a document at any position from the rest of the documents.

trainMode = 1: Each example contains a collection of labels. At training time, one label from the collection is randomly picked as the RHS, and the rest of the labels in the collection become the LHS.
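
The definition quoted above can be illustrated with a minimal Python sketch. This is not StarSpace's actual code: the function name `split_train_mode_1` is hypothetical, and the example history is made up from the artist IDs that appear later in this thread.

```python
import random

def split_train_mode_1(labels, rng=random):
    """Hypothetical helper: pick one label at random as the RHS,
    and use the remaining labels as the LHS (trainMode=1 as quoted)."""
    if len(labels) < 2:
        raise ValueError("need at least two labels to form an LHS and an RHS")
    idx = rng.randrange(len(labels))       # random position, not necessarily the last
    rhs = labels[idx]
    lhs = labels[:idx] + labels[idx + 1:]  # everything except the picked label
    return lhs, rhs

# Illustrative click history (assumed data).
history = ["A190", "A199", "A207", "A424"]
lhs, rhs = split_train_mode_1(history, random.Random(0))
print(lhs, rhs)
```

Because the picked position is random, the last document in the history has no special role during training under this mode.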

I also found, from training the pagespace model, that the order of the documents does not seem to matter when trainMode=1:

$ ./query_predict /tmp/starspace/models/lastfm/user_artists 4
.....
Enter some text: A190 A199 A207
0[0.920611]: A199
1[0.56326]: A207
2[0.429776]: A7417
3[0.425536]: A424

Enter some text: A190 A207 A199
0[0.920611]: A199
1[0.56326]: A207
2[0.429776]: A7417
3[0.425536]: A424
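
One likely explanation for the identical scores is that the input is encoded as a bag of features, i.e. the embeddings of the input tokens are summed, and a sum does not depend on order. The sketch below illustrates that property under this assumption. The embedding values and the `embed_bag` helper are made up for illustration and are not taken from StarSpace.

```python
import math

# Toy embedding table; the vectors are invented for illustration only.
TABLE = {
    "A190": [0.5, 0.25],
    "A199": [0.125, 1.0],
    "A207": [0.75, 0.5],
}

def embed_bag(tokens, table):
    """Hypothetical helper: sum the token embeddings (bag-of-features)."""
    dim = len(next(iter(table.values())))
    total = [0.0] * dim
    for tok in tokens:
        for i, value in enumerate(table[tok]):
            total[i] += value
    return total

a = embed_bag(["A190", "A199", "A207"], TABLE)
b = embed_bag(["A190", "A207", "A199"], TABLE)

# Both orderings give the same input vector, so any score computed
# from it (e.g. a dot product with a candidate) is the same too.
same = all(math.isclose(x, y) for x, y in zip(a, b))
print(same)
```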

I am wondering how I can actually get the nth recommendation based on the previously clicked (n-1) documents, which is the experiment described in the paper.

ledw commented 5 years ago

@nickyongzhang Hi, thanks for the comment. In training with trainMode=1, we do predict a document at any position from the rest of the documents. At test time, you can force the model to predict the n-th recommendation based on the previously clicked (n-1) documents by changing the evaluation function. We did not add that function, but it would be similar to this: https://github.com/facebookresearch/StarSpace/blob/de71fb61bbc7871a98bde69828e34c794fa8b800/src/data.cpp#L112-L122 Instead of using a random doc as the label, you can change it to always pick the last one as the label.
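
The suggested change can be sketched in Python as follows. This is only an illustration of the idea, not the StarSpace C++ code: `split_last_as_label`, `hits_at_k`, and the toy `rank_fn` are hypothetical names, and the histories are made-up data.

```python
def split_last_as_label(labels):
    """Hypothetical helper: the first n-1 documents form the LHS,
    the last (n-th) document is the label."""
    return labels[:-1], labels[-1]

def hits_at_k(histories, rank_fn, k):
    """Fraction of histories whose last document appears in the top-k
    predictions made from the preceding documents.
    rank_fn(lhs) is assumed to return candidates ordered best first."""
    hits = 0
    total = 0
    for labels in histories:
        if len(labels) < 2:
            continue  # no LHS left once the label is removed
        lhs, rhs = split_last_as_label(labels)
        hits += rhs in rank_fn(lhs)[:k]
        total += 1
    return hits / total if total else 0.0

# Toy ranker standing in for a trained model (assumption): it ignores
# the input and always returns the same ranking.
def rank_fn(lhs):
    return ["A207", "A424", "A7417"]

histories = [
    ["A190", "A199", "A207"],   # label A207 is ranked first
    ["A190", "A424", "A7417"],  # label A7417 is ranked third
]
score_at_1 = hits_at_k(histories, rank_fn, k=1)
score_at_3 = hits_at_k(histories, rank_fn, k=3)
print(score_at_1, score_at_3)
```

Only the evaluation split changes here; training can still pick the label at a random position as described above.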

nickyongzhang commented 5 years ago

Thank you for your response, @ledw.

Do you mean that the Content-based Document Recommendation experiment in the StarSpace paper also predicts a document at any position from the rest of the documents during training?

ledw commented 5 years ago

@nickyongzhang Yes, that is the case.
