gsbDBI / torch-choice

Choice modeling with PyTorch: logit model and nested logit model
MIT License

More items than purchase records (i.e. unchosen items) #35

Closed giheungkim closed 1 year ago

giheungkim commented 1 year ago

Thank you so much for the package.

In my setting there are more items on the market than purchase records, which means that some items are never chosen.

I think this variation is important for estimating the choice model, but I do not see how it gets passed into the ChoiceDataset object.

It seems to limit the set of items to those that appear in the purchase records.

Is there a fix for this?

Thank you!

TianyuDu commented 1 year ago

Hi Gi,

Thanks for reaching out.

As I understand it, your data contain, say, 500 items indexed 0, 1, ..., 499, but some of them, say 50, are never chosen (i.e., they never appear in the item_index array). In this case, you can pass num_items=500 when initializing ChoiceDataset so that it knows there are 500 items.
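To make the situation concrete, here is a minimal sketch in plain Python with made-up toy data (6 items, 8 purchase records; none of these numbers come from the thread). It only illustrates why counting the distinct values in item_index undercounts the items on the market. The commented-out ChoiceDataset call at the end is an assumption about how the suggestion would be written, based on the num_items argument mentioned in this reply.

```python
# Hypothetical toy data: 6 items on the market (indices 0..5), 8 purchase records.
item_index = [0, 2, 2, 5, 0, 2, 5, 0]  # items 1, 3 and 4 never appear
num_items = 6                          # known market size, supplied by hand

# Items that can be inferred from the purchase records alone.
observed_items = sorted(set(item_index))

# Items that exist on the market but are never chosen.
chosen = set(item_index)
never_chosen = [i for i in range(num_items) if i not in chosen]

print(len(observed_items))  # 3, fewer than the 6 items on the market
print(never_chosen)         # [1, 3, 4]

# With torch and torch-choice installed, the suggestion would look roughly like
# this (assumed usage, not verified here):
#   import torch
#   from torch_choice.data import ChoiceDataset
#   dataset = ChoiceDataset(item_index=torch.LongTensor(item_index),
#                           num_items=num_items)
```

The point of passing the count explicitly is that the purchase records by themselves cannot reveal items that nobody bought.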

In addition, if you are using item-specific, item-session-specific, or user-item-session-specific observable tensors, you still need to provide observables for the items that were never chosen. For instance, if you have 500 items in total and 50 of them were never chosen, your item observable tensors should have shape (500, *) rather than (450, *).
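The shape requirement can be sketched the same way. The example below uses plain Python lists and invented feature values (two hypothetical features per item) purely for illustration. The idea it assumes is that row i of an item observable describes item i, so the table needs one row per item on the market, whether or not the item was ever chosen.

```python
# Hypothetical toy setup: 6 items, of which only items 0, 2 and 5 are ever chosen.
num_items = 6
chosen_items = {0, 2, 5}

# Hypothetical per-item features keyed by item index.
# Features are listed for ALL items, including the never-chosen 1, 3 and 4.
features_by_item = {
    0: [1.0, 0.3],
    1: [2.5, 0.1],
    2: [0.8, 0.9],
    3: [3.1, 0.4],
    4: [1.7, 0.6],
    5: [2.2, 0.2],
}

# Row i holds the features of item i, so there is one row per item.
item_obs = [features_by_item[i] for i in range(num_items)]
shape = (len(item_obs), len(item_obs[0]))
print(shape)  # (6, 2)

# Building rows only for the chosen items gives too few rows.
too_short = [features_by_item[i] for i in sorted(chosen_items)]
print(len(too_short))  # 3
```

Converting item_obs with torch.tensor(item_obs) would then give a tensor of shape (6, 2), the toy analogue of (500, *) in the example above.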

Hope this helps and please let me know if you have further questions.

giheungkim commented 1 year ago

This totally works, thank you!

TianyuDu commented 1 year ago

Glad that it helps!