Making DIEN dataset - GitHub Issues

Describe the question: I am preparing data for the DIEN model, and I believe it needs a different format from DeepFM because of the user behavior sequence, if I am correct. Since each user has a behavior sequence (the history of items the user clicked), I guess the dataset should have one row per user? For example, something like this:

user_id    item_sequence    target_ad
0          [20, 30, 22]     3
1          [11, 45, 2]      10
2          [77, 35, 64]     4
3          [20, 30, 22]     7
4          [20, 30, 22]     16
5          [20, 30, 22]     1

In the DeepFM case, however, we do not use the user behavior sequence, so I guess many rows can share the same user ID? For example (users 1 and 5 have multiple rows):

user_id    clicked_item_id    target_ad
1          5                  3
1          6                  10
1          5                  4
5          8                  7
5          11                 16
9          2                  1

So in general, a DIEN dataset would have number of rows = number of users in this case, whereas a DeepFM dataset can have an arbitrary number of rows, as long as the data exists?
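To make the difference between the two layouts concrete, here is a minimal sketch of how I imagine going from per-event rows to one sequence per user. The `interactions` list and the `build_sequences` helper are hypothetical names for illustration, not part of DeepCTR-Torch, and the sketch assumes the rows are already in chronological order. It uses only the user_id and clicked_item_id columns from the second table.

```python
from collections import defaultdict

# Hypothetical interaction log in the DeepFM-style layout: one row per event,
# as (user_id, clicked_item_id). Assumed to be sorted chronologically.
interactions = [
    (1, 5), (1, 6), (1, 5),
    (5, 8), (5, 11),
    (9, 2),
]


def build_sequences(rows):
    """Group per-event rows into one click sequence per user, keeping row order."""
    sequences = defaultdict(list)
    for user_id, item_id in rows:
        sequences[user_id].append(item_id)
    return dict(sequences)


user_sequences = build_sequences(interactions)
print(user_sequences)  # {1: [5, 6, 5], 5: [8, 11], 9: [2]}
```

Six event rows collapse into three user rows here, which is the "number of rows = number of users" situation I am asking about.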

Also, since the DIEN paper requires a target ad, can I take the last item out of the original item_sequence and use it as the target ad? Given the sequence history, the last item is what would be predicted if this were framed as a classification problem.
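A minimal sketch of the split I have in mind, assuming each user's full click sequence is available as a Python list. The `split_history_and_target` helper and the sample sequences are hypothetical, and whether this is the right way to build DIEN training rows is exactly what I am asking.

```python
def split_history_and_target(item_sequence):
    """Return (history, target), where target is the last item of the sequence."""
    if len(item_sequence) < 2:
        # With fewer than two items there is no history left after the split.
        raise ValueError("need at least two items: one for history, one as target")
    return item_sequence[:-1], item_sequence[-1]


# Hypothetical full sequences; the last element of each becomes the target ad.
full_sequences = {0: [20, 30, 22, 3], 1: [11, 45, 2, 10]}

rows = []
for user_id, seq in full_sequences.items():
    history, target = split_history_and_target(seq)
    rows.append({"user_id": user_id, "item_sequence": history, "target_ad": target})

for row in rows:
    print(row)
```

With these inputs, the resulting rows match the first two rows of the one-row-per-user table above (history [20, 30, 22] with target 3, and history [11, 45, 2] with target 10).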

shenweichen / DeepCTR-Torch

Making DIEN dataset #237