facebookresearch / StarSpace

Learning embeddings for classification, retrieval and ranking.
MIT License

PageSpace recommendation example #177

Closed loretoparisi closed 5 years ago

loretoparisi commented 6 years ago

Looking at the current example examples/recomm_user_artists.sh, which uses trainMode 1 and a pagespace model, I'm not sure how to interpret the results. The predictions file looks like this:

Example 0:
LHS:
A190 A199 A207 A217 A227 A228 A229 A293 A298 A300 A316 A333 A344 A403 A420 A464 A498 A538 A542 A546 A548 A679 A681 A683 A687 A691 A696 A704 A1047 A1048 A1090 A1106 A1246 A1400 A1470 A1492 A1502 A1632 A1976 A2121 A2277 A2407 A2977 A3373 A3400 A3461 A3730 A5978 A9079 
RHS: 
A291 
Predictions: 
(--) [0.474973] A229 
(--) [0.450735] A1492 
(--) [0.4379]   A683 
(--) [0.425732] A2277 
(--) [0.414445] A228 
(--) [0.413803] A65 
(--) [0.407297] A1400 
(--) [0.388807] A154 
(--) [0.381875] A689 
(--) [0.374245] A1043 

Here the label values should be the artist identifiers for a certain user, "LHS" stands for left-hand side (the input artists) and "RHS" stands for right-hand side (the label). The train and test files, in turn, look like this:

$ head -n3 /tmp/starspace/data/lastfm/user_artists.train 
AartistID
A51 A52 A53 A54 A55 A56 A57 A58 A59 A60 A61 A62 A63 A64 A65 A66 A67 A68 A69 A70 A71 A72 A73 A74 A75 A76 A77 A78 A79 A80 A81 A82 A83 A84 A85 A86 A87 A88 A89 A90 A91 A92 A93 A94 A95 A96 A97 A98 A99 A100

This is a mapping created by the conversion script in the example, which writes out a list of artist IDs from the source dataset. The source dataset has a format like

userID  artistID    weight
2   51  13883
2   52  11690
...
3   101 13176
3   102 662
...

so this will create

A51 A52
...
A101 A102
...

where each row/example implicitly defines a user and contains the labels of the artists that user is a fan of.
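To make the mapping above concrete, here is a minimal Python sketch of that kind of conversion. It is not the script shipped with the example; the function name to_starspace_rows and the choice to skip the header line are assumptions made for illustration. It groups the artistID values by userID, prefixes each with "A", and emits one row per user in order of first appearance.

```python
from collections import OrderedDict


def to_starspace_rows(lines):
    """Group artist IDs by user and return one 'A<id> A<id> ...' row per user.

    Hypothetical helper: assumes whitespace-separated columns
    (userID, artistID, weight) and a header line starting with 'userID'.
    """
    artists_by_user = OrderedDict()
    for line in lines:
        fields = line.split()
        # Skip blank lines, the header, and anything without three columns.
        if len(fields) < 3 or fields[0] == "userID":
            continue
        user_id, artist_id = fields[0], fields[1]
        artists_by_user.setdefault(user_id, []).append("A" + artist_id)
    # The user ID itself is dropped: only the row position identifies the user.
    return [" ".join(artists) for artists in artists_by_user.values()]


source = [
    "userID  artistID    weight",
    "2   51  13883",
    "2   52  11690",
    "3   101 13176",
    "3   102 662",
]
rows = to_starspace_rows(source)
print("\n".join(rows))
```

Because the user ID is discarded, the only link back to a user is the row position, which is what the question below is about.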

Back to the predictions file: where is the user ID in the output? Is it implied by the example number/row, so that Example #5 stands for user 4 (since numbering starts from 0), i.e. line 4 in the training set?

ledw commented 6 years ago

@loretoparisi Hi, yes, your interpretation is correct: the user ID is not present in the test file. We plan to add an explicit user ID with its own embedding for this train mode to make this clearer.
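Since the user ID has to be recovered from the example position, one way to do it is sketched below in Python. This is only an illustration under stated assumptions: the parser is written against the predictions layout quoted earlier in this thread, parse_predictions is a hypothetical helper, and user_ids is a hypothetical list that must be kept in the same order as the rows of the input file.

```python
import re

EXAMPLE_RE = re.compile(r"^Example (\d+):")
# Matches lines such as "(--) [0.474973] A229".
PRED_RE = re.compile(r"^\(.*?\)\s*\[([-+0-9.eE]+)\]\s*(\S+)")


def parse_predictions(text):
    """Return {example_index: [(label, score), ...]} from a predictions dump."""
    results = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        match = EXAMPLE_RE.match(line)
        if match:
            current = int(match.group(1))
            results[current] = []
            continue
        match = PRED_RE.match(line)
        if match and current is not None:
            results[current].append((match.group(2), float(match.group(1))))
    return results


SAMPLE = """Example 0:
LHS:
A190 A199 A207
RHS:
A291
Predictions:
(--) [0.474973] A229
(--) [0.450735] A1492
"""

# Hypothetical: user IDs listed in the same order as the rows of the input file.
user_ids = ["2", "3"]

parsed = parse_predictions(SAMPLE)
by_user = {user_ids[index]: preds for index, preds in parsed.items()}
print(by_user)
```

The mapping is only valid if user_ids is built from the same file, in the same row order, that was passed for testing.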

loretoparisi commented 6 years ago

@ledw thanks. So, assuming I'm using this version, I would like to infer the predictions on a single item. When working with FastText I do this with fasttext predict-prob my_model - 3 to get the 3 most likely predictions from my supervised model. Does StarSpace support this stdin mode?

ledw commented 6 years ago

@loretoparisi Hi, the query_predict utility provides functionality similar to the stdin mode you mentioned. Check it out and let us know if that works for your case.