pmixer / SASRec.pytorch

PyTorch(1.6+) implementation of https://github.com/kang205/SASRec
Apache License 2.0

Doubt about the evaluation process #14

Open leehommlee opened 3 years ago

leehommlee commented 3 years ago

First of all, many thanks to the author for providing the PyTorch version of the code. Although the author says that some parts are the same as in the original TensorFlow implementation, I still have doubts about the evaluation process. The number of sampled negatives is hard-coded to 100 at line 135 of SASRec.pytorch/utils.py:

    for _ in range(100):
        t = np.random.randint(1, itemnum + 1)
        while t in rated: t = np.random.randint(1, itemnum + 1)
        item_idx.append(t)

I think this code should generate a candidate set whose length is the number of POIs, but it does not. Because the value is set to 100, the reported recommendation performance is too high. If you also have doubts, you can leave me a message.
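One possible reading of the suggestion above is that the target item should be ranked against every item the user has not rated, instead of against 100 sampled negatives. The sketch below contrasts the two candidate sets under that assumption. The function names `sampled_candidates` and `full_candidates` are hypothetical and do not exist in the repository, the standard-library `random.randint(1, itemnum)` stands in for `np.random.randint(1, itemnum + 1)` (both draw from 1..itemnum inclusive), and excluding the target from the negatives is an added assumption.

```python
import random


def sampled_candidates(target, rated, itemnum, num_neg=100, rng=None):
    """Target item followed by num_neg random items the user has not rated.

    Mirrors the sampling loop quoted above. Excluding the target itself
    from the negatives is an assumption of this sketch.
    """
    rng = rng or random.Random()
    item_idx = [target]
    for _ in range(num_neg):
        t = rng.randint(1, itemnum)  # inclusive on both ends
        while t in rated or t == target:
            t = rng.randint(1, itemnum)
        item_idx.append(t)
    return item_idx


def full_candidates(target, rated, itemnum):
    """Target item followed by every other item the user has not rated."""
    negatives = [i for i in range(1, itemnum + 1)
                 if i not in rated and i != target]
    return [target] + negatives
```

With the full candidate set, the length grows with `itemnum` (minus the rated items), so evaluation costs more per user but no longer depends on which negatives happen to be drawn.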

pmixer commented 3 years ago

the number of POIs

Please check

https://github.com/kang205/SASRec/blob/e3738967fddab206d6eeb4fda433e7a7034dd8b1/util.py#L111

and

https://github.com/pmixer/SASRec.pytorch/issues

for previous discussions. It is not my job to fix the "performance is too high" issue; please consider checking https://github.com/pmixer/TiSASRec.debug if you need to remove negative sampling from the evaluation code.

We all agree that the negative sampling approach currently used has potential problems and should not be used in industry. However, this is just an academic paper that introduces multi-head attention (MHA) into sequential recommendation; it does not focus on evaluation metrics, unlike https://www.kdd.org/kdd2020/accepted-papers/view/on-sampled-metrics-for-item-recommendation .
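To make the concern about sampled metrics concrete, here is a minimal sketch of why ranking against a sampled subset can only look as good as or better than ranking against all items. It assumes the target's score sits at index 0 and that HR@k and NDCG@k are derived from the target's 0-based rank. The helper names and the random scores are purely illustrative and are not taken from the repository.

```python
import math
import random


def rank_of_target(scores):
    """0-based rank of the target, assumed to be scores[0] (higher is better)."""
    target = scores[0]
    return sum(1 for s in scores[1:] if s > target)


def hit_and_ndcg_at_k(rank, k=10):
    """HR@k and NDCG@k for a single target with the given 0-based rank."""
    if rank < k:
        return 1.0, 1.0 / math.log2(rank + 2)
    return 0.0, 0.0


# Illustrative scores only: one target and 1000 other items.
rng = random.Random(0)
target_score = 0.9
other_scores = [rng.random() for _ in range(1000)]

# Rank against every item versus against 100 sampled negatives.
full_rank = rank_of_target([target_score] + other_scores)
sampled_rank = rank_of_target([target_score] + rng.sample(other_scores, 100))

print("full:", full_rank, hit_and_ndcg_at_k(full_rank))
print("sampled:", sampled_rank, hit_and_ndcg_at_k(sampled_rank))
```

Because the sampled negatives are a subset of all items, the number of items scoring above the target can only shrink, so the sampled rank is never worse than the full rank and the resulting HR@10 and NDCG@10 are never lower.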

BTW, could you please elaborate on what you mean by "the number of POIs in length"? It sounds interesting.