google-research / recsim

A Configurable Recommender Systems Simulation Platform
https://github.com/google-research/recsim
Apache License 2.0
732 stars, 127 forks

Clarification of satisfaction/engagement tradeoff in choc-kale example #18

Closed by rmitsch 3 years ago

rmitsch commented 3 years ago

When running the chocolate/kale example, I select a slate using a deterministic kaleness-first selection policy, i.e. I sort the document observations in descending order of kaleness and then define the action as the first slate_size items:

action = tuple(np.argsort([do[0] for do in document_observations]))[:slate_size]

document_observations is observation["doc"], so e.g. (array([0.57019677]), array([0.43860151]), ..., array([0.46631077])).

For comparison I run the environment with the reverse policy, which is to select the action by picking the documents with the lowest kaleness:

action = tuple(np.argsort([do[0] for do in document_observations]))[::-1][:slate_size]
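As a quick sanity check on which documents each of the two expressions picks, here is a minimal sketch on made-up kaleness values (the four observations and slate_size = 2 are assumptions for illustration, not values from the environment). np.argsort returns indices in ascending order, so the un-reversed slice holds the lowest values and the [::-1] slice the highest, which is worth comparing against the intended policy.

```python
import numpy as np

# Made-up kaleness observations, shaped like the tuple shown above.
document_observations = (
    np.array([0.57]),
    np.array([0.43]),
    np.array([0.91]),
    np.array([0.12]),
)
slate_size = 2  # hypothetical slate size

kaleness = [do[0] for do in document_observations]
order = np.argsort(kaleness)  # indices sorted by ascending kaleness

# Same slicing as the two action expressions in the post.
first_k = tuple(order)[:slate_size]
reversed_first_k = tuple(order)[::-1][:slate_size]

selected_first = [float(kaleness[i]) for i in first_k]
selected_reversed = [float(kaleness[i]) for i in reversed_first_k]

print("un-reversed slice picks kaleness:", selected_first)
print("reversed slice picks kaleness:", selected_reversed)
```

With these values the un-reversed slice selects the two lowest-kaleness documents and the reversed slice the two highest.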

If I compare both policies after running them for a couple hundred steps, the kaleness-first policy yields higher engagement and lower user satisfaction than the chocolateness-first policy. I would expect exactly the opposite, since kaleness is supposed to boost user satisfaction at the cost of lower engagement.

Why does selecting items with the highest kaleness yield a lower user satisfaction and higher engagement than selecting items with the lowest kaleness?

cwhsu-google commented 3 years ago

Thank you for using RecSim, and sorry for the bug in the chocolate/kale example. The following is the correct version of generate_response(). As you can see, we mixed up kale_mean and choc_mean in the example, but the code in https://github.com/google-research/recsim/blob/master/recsim/environments/long_term_satisfaction.py is correct.

def generate_response(self, doc, response):
    response.clicked = True
    # linear interpolation between choc and kale.
    engagement_loc = (doc.kaleness * self._user_state.kale_mean
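The snippet above is cut off after its first term. What follows is a minimal sketch of the linear interpolation its comment describes, assuming the missing second term is (1 - kaleness) * choc_mean. The standalone function, its name, and the mean values 4.0 and 5.0 are hypothetical and not taken from the thread or the repository.

```python
def interpolated_engagement_loc(kaleness, kale_mean, choc_mean):
    """Linearly interpolate the engagement mean between kale and chocolate.

    Assumption: kaleness is in [0, 1]; 1.0 means pure kale, 0.0 pure chocolate.
    """
    return kaleness * kale_mean + (1.0 - kaleness) * choc_mean


# Hypothetical parameters: chocolate is assumed to be more engaging than kale.
KALE_MEAN = 4.0
CHOC_MEAN = 5.0

pure_kale = interpolated_engagement_loc(1.0, KALE_MEAN, CHOC_MEAN)
pure_choc = interpolated_engagement_loc(0.0, KALE_MEAN, CHOC_MEAN)
half_half = interpolated_engagement_loc(0.5, KALE_MEAN, CHOC_MEAN)

# Swapping the two means reproduces the kind of mix-up described above:
# a pure-kale document then receives the chocolate engagement mean.
swapped_pure_kale = interpolated_engagement_loc(1.0, CHOC_MEAN, KALE_MEAN)

print(pure_kale, pure_choc, half_half, swapped_pure_kale)
```

Under these assumptions a high-kaleness document gets the lower engagement mean, whereas the swapped arguments give it the higher one, which matches the reversed behavior reported in the question.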

rmitsch commented 3 years ago

No, that's it. Thanks for your response!