The training process does not use BPTT

gds123 commented 6 years ago

The training process does not use BPTT. I implemented a version that uses BPTT, but recall dropped to 0.43.

hidasib commented 6 years ago

Yes, the public version doesn't use BPTT. The reason is that it doesn't suit the data, in which the majority of the sessions are very short (2-3 clicks) and the lengths are quite varied. I have two BPTT versions in my private repo that overcome the varying session length problem by (1) filtering out short sessions and (2) padding, respectively.

The version that uses filtering steadily decreases recommendation accuracy as I increase the window size. It seems that having more data is more important than learning a better sequence model based on long sessions.

The padding version performs similarly to the original algorithm - that is uploaded here - but trains slower due to the implementation of BPTT.
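For readers who want to picture the two strategies, here is a minimal sketch of filtering and padding. It is not the code from the private repo: `PAD`, `filter_short`, `pad_sessions`, and the choice of left-padding with a loss mask are all assumptions made for illustration.

```python
# Hypothetical padding item ID; real item IDs are assumed to start at 1.
PAD = 0


def filter_short(sessions, window):
    """Strategy (1): keep only sessions that can fill a whole BPTT window.

    A window of `window` input steps needs `window + 1` events
    (the last event is only used as a target).
    """
    return [s for s in sessions if len(s) >= window + 1]


def pad_sessions(sessions, window):
    """Strategy (2): left-pad every session to `window + 1` events.

    Sessions longer than that are truncated to their most recent events.
    Returns (inputs, targets, mask); mask is 0 where the input is padding,
    so those steps can be excluded from the loss.
    """
    inputs, targets, mask = [], [], []
    for s in sessions:
        if len(s) < 2:
            continue  # a single event has no next item to predict
        s = s[-(window + 1):]
        padded = [PAD] * (window + 1 - len(s)) + list(s)
        inputs.append(padded[:-1])
        targets.append(padded[1:])
        mask.append([0 if item == PAD else 1 for item in padded[:-1]])
    return inputs, targets, mask


sessions = [[1, 2], [3, 4, 5, 6], [7, 8, 9]]
print(filter_short(sessions, window=3))
print(pad_sessions(sessions, window=3))
```

With mostly 2-3 click sessions, the filtering variant discards most of the data as `window` grows, while the padding variant keeps every session but spends most of each window on masked steps.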

I might upload the padding version to the public repo later, but I haven't decided yet.

gds123 commented 6 years ago

@hidasib Thanks a lot for your explanation! What about sorting the sessions by length first? That way, the problem caused by padding is relieved. Does it cause another problem, namely that the distribution differs a lot between batches?
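The idea in this question can be sketched as follows. This is only an illustration: `length_bucketed_batches` and `padding_needed` are hypothetical helpers, and shuffling the order of the batches is an assumption added here, not something proposed in this thread. Shuffling mixes short and long batches over the course of an epoch, but each individual batch still contains sessions of similar length, so the concern about batch distributions is not fully addressed.

```python
import random


def length_bucketed_batches(sessions, batch_size, seed=0):
    """Sort sessions by length, cut them into batches, then shuffle batch order."""
    ordered = sorted(sessions, key=len)
    batches = [ordered[i:i + batch_size]
               for i in range(0, len(ordered), batch_size)]
    # Assumption: shuffle whole batches so training does not see
    # all short sessions first and all long sessions last.
    random.Random(seed).shuffle(batches)
    return batches


def padding_needed(batch):
    """Number of padding slots needed to bring a batch to its longest session."""
    longest = max(len(s) for s in batch)
    return sum(longest - len(s) for s in batch)


sessions = [[1, 2], [1, 2, 3, 4, 5], [1, 2, 3], [1, 2], [1, 2, 3, 4], [1, 2, 3]]

# Batches taken in the original order vs. after sorting by length.
unsorted_batches = [sessions[i:i + 2] for i in range(0, len(sessions), 2)]
bucketed = length_bucketed_batches(sessions, batch_size=2)

print(sum(padding_needed(b) for b in unsorted_batches))
print(sum(padding_needed(b) for b in bucketed))
```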

hidasib commented 4 years ago

Some additional info about BPTT and the difference between session-based and sequential personalized recommendations was added to the readme. I will probably add the BPTT version (which has also been sped up significantly) to the public repo in the future for folks who use longer sequences, such as user histories. For now, I'm closing this issue.

hidasib / GRU4Rec

The training process does not use BPTT #22