As part of a series of experiments conducted for my work, I have been dealing with the following problem:
When making predictions, I would like to restrict the globally trained model so that it excludes certain item ids, either because they are not available to a particular user or because he or she has already watched them.
The simplest idea I had was to simply not pass those ids to the predict method: the numerical values of the scores may vary, but the relative order of the results should not depend on which ids are passed in. The same should also be achievable in a postprocessing step, at the cost of some extra computation: score everything, then remove the offending items from the list containing predictions for all items.
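To make the two approaches concrete, here is a minimal sketch. It assumes the model exposes a `predict(user_id, item_ids)` method returning one score per item; `model`, `user_id`, `all_item_ids`, and `excluded_ids` are illustrative placeholders, not names from any particular library.

```python
import numpy as np

def recommend_by_id_removal(model, user_id, all_item_ids, excluded_ids, n=30):
    """Approach 1: drop the excluded ids before calling predict."""
    allowed_ids = np.array([i for i in all_item_ids if i not in excluded_ids])
    scores = model.predict(user_id, allowed_ids)
    order = np.argsort(-scores)            # highest score first
    return allowed_ids[order][:n]

def recommend_by_postprocessing(model, user_id, all_item_ids, excluded_ids, n=30):
    """Approach 2: score all items, then filter the excluded ids afterwards."""
    all_item_ids = np.asarray(all_item_ids)
    scores = model.predict(user_id, all_item_ids)
    order = np.argsort(-scores)
    ranked = all_item_ids[order]
    return np.array([i for i in ranked if i not in set(excluded_ids)])[:n]
```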
But here is the catch: when doing so, the two sets of results are vastly different from each other. Not only does the order of the top N items differ, but some items that in the unrestricted setting are deemed quite unlikely to be picked (being placed far from the top N) now occupy the most recommended spots. This behavior can be seen here:
Unrestricted predictions (for a particular user, all movie ids)
Restriction imposed at the postprocessing step:
Restriction by id removal:
Ratings of the top N (30) movies after id removal in the unrestricted setting:
To clarify: as can be seen, not many of the "restricted" movie ids were in the original top 30, so those lists should be quite similar.
Expected behavior: there should be no difference between the two ways of removing movies from the prediction other than the numerical values of the predicted "ratings". In TL;DR form: the relative order of items should not differ between the two approaches when the same set of ids is present.
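Reusing the illustrative helpers from the sketch above, this is the sanity check I expected to pass:

```python
# On the same candidate set, both approaches should return identical rankings.
top_removed = recommend_by_id_removal(model, user_id, all_item_ids, excluded_ids, n=30)
top_filtered = recommend_by_postprocessing(model, user_id, all_item_ids, excluded_ids, n=30)
print(np.array_equal(top_removed, top_filtered))  # expected: True, observed: False
```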
Or am I missing something here? I would be truly grateful for any explanation of this puzzling behavior. Thanks in advance!