maciejkula / spotlight

Deep recommender models using PyTorch.
MIT License

userid and itemid start from 1 #164

Open KylinA1 opened 5 years ago

KylinA1 commented 5 years ago

Hello,

I just noticed that you start counting ids from 1, which leaves one extra, unused dimension for both users and items. For example, MovieLens 1M has 6040 users and 3706 items, but your final processed dataset, including the SciPy matrix, has shape 6041 × 3707.

This might be a tiny problem.

snemistry commented 5 years ago

For sequence models, item id 0 is reserved as padding. That's why.
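
To illustrate why a reserved padding id matters, here is a minimal pure-Python sketch, not Spotlight's actual code. The names `PADDING_ID`, `pad_left` and `mask` are hypothetical, and left-padding is an assumption made for the example. It shows that once sequences are padded to a fixed length with 0, a real item with id 0 can no longer be told apart from padding.

```python
PADDING_ID = 0  # assumed padding value, following the comment above


def pad_left(sequence, max_length, padding_id=PADDING_ID):
    """Keep the most recent max_length items and left-pad the rest."""
    if max_length <= 0:
        raise ValueError('max_length must be positive.')
    recent = list(sequence)[-max_length:]
    return [padding_id] * (max_length - len(recent)) + recent


def mask(padded, padding_id=PADDING_ID):
    """True where a position holds a real item, False where it is padding."""
    return [item != padding_id for item in padded]


# Ids starting from 1: padding and real items stay distinguishable.
padded = pad_left([3, 7], max_length=4)
print(padded, mask(padded))

# If 0 were a real item id, it would be masked out like padding.
ambiguous = pad_left([0, 7], max_length=4)
print(ambiguous, mask(ambiguous))
```

In the second call the real item 0 is masked as if it were padding, which is the situation that starting ids from 1 avoids.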

KylinA1 commented 5 years ago

Thanks for your kind reply, that makes sense.

BBiering commented 4 years ago

It seems a bit odd that user ids are implicitly assumed to run from 1 to N, with N = number of users, since 0 is reserved for padding, yet an error is raised if num_users = user_ids.max(). The same goes for item ids. Or have I missed something?

See spotlight/interactions.py at line 129:

    if self.user_ids.max() >= self.num_users:
        raise ValueError('Maximum user id greater '
                         'than declared number of users.')
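
One possible reading of the quoted check, sketched below in plain Python: if the declared number is treated as the size of the index space (rows of the matrix) rather than the count of distinct users, then valid ids are 0 to size - 1, and ids running from 1 to N need a size of N + 1. The helper names `check_ids` and `infer_size` are hypothetical, and the "max id + 1" default is an assumption for illustration, not something confirmed in this thread.

```python
def check_ids(ids, declared_size):
    """Same comparison as the quoted check, applied to a plain list of ids."""
    if max(ids) >= declared_size:
        raise ValueError('Maximum id greater than declared size.')


def infer_size(ids):
    """Assumed default: the smallest size that makes every id a valid index."""
    return max(ids) + 1


user_ids = [1, 2, 3]  # N = 3 users, ids start from 1

# Size 4 (N + 1): passes, and index 0 stays unused.
check_ids(user_ids, infer_size(user_ids))

# Size 3 (N): id 3 is out of range, so the check raises.
try:
    check_ids(user_ids, len(set(user_ids)))
except ValueError as error:
    print(error)
```

Under that reading, the 6041 × 3707 shape reported at the top of the thread and the error for num_users = user_ids.max() both follow from the same convention.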