david-cortes / contextualbandits

Python implementations of contextual bandits algorithms
http://contextual-bandits.readthedocs.io
BSD 2-Clause "Simplified" License
745 stars 146 forks

Context Understanding of the API #10

Closed wenruij closed 5 years ago

wenruij commented 5 years ago

The n_features in parameter X (array (n_samples, n_features)) of predict(X, exploit=False, gradient_calc='weighted') should refer to the context, which summarizes information about both the user u and the arm a. It should have a form like CONCAT<user_vec, arm_vec>.

If so,

may give different action predictions for the same user1.

That is confusing me. How should I understand the n_features in parameter X (array (n_samples, n_features))?

david-cortes commented 5 years ago

The convention in other packages such as scikit-learn is to use array(a, b, c, ...) in the documentation to specify that it takes as parameter an array with dimensionality for each axis given by what’s inside the parentheses.

In this case, array(n_samples, n_features) means that it expects a 2-dimensional array with the first dimension being the number of samples/cases/observations (i.e. number of rows or number of cases you want to predict, as it can predict for more than 1 vector at a time), and the second being the number of features that the model uses (in your case I guess that’d be n_features = n_user_features + n_arm_features, but that’s not the only way to build a recommender system).
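As a minimal sketch of that shape convention (the sizes and the names user_vecs and arm_vecs are made up for illustration and are not part of the package), a CONCAT<user_vec, arm_vec> matrix could be assembled with NumPy like this:

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration
n_samples, n_user_features, n_arm_features = 4, 3, 2

rng = np.random.default_rng(0)
user_vecs = rng.normal(size=(n_samples, n_user_features))
arm_vecs = rng.normal(size=(n_samples, n_arm_features))

# One row per observation, one column per feature: CONCAT<user_vec, arm_vec>
X = np.hstack([user_vecs, arm_vecs])
print(X.shape)  # (4, 5), i.e. (n_samples, n_user_features + n_arm_features)

# A single observation is still passed as a 2-dimensional array with one row
x_single = X[0].reshape(1, -1)
print(x_single.shape)  # (1, 5)
```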

If you’re using more complex models like neural networks, some of them might also work with higher-dimensional arrays (e.g. array(n_samples, n_user_features, n_arm_features)).

While this might seem obvious if you're used to scikit-learn-like APIs, there are functions in other packages that don't work exactly like that. For example, numpy's cov expects the first array dimension to match the number of features and the second one to match the number of observations, while others might expect the data as triplet representations rather than as full matrices.
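The difference with numpy's cov can be checked directly. This is a small sketch on random data (the array sizes are arbitrary): with its default rowvar=True, np.cov treats each row as a variable, so a scikit-learn-style matrix has to be transposed or passed with rowvar=False.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))  # (n_samples, n_features), scikit-learn style

# Default rowvar=True: every ROW is treated as a variable (feature),
# so this computes a covariance between the 100 observations instead
cov_default = np.cov(X)
print(cov_default.shape)  # (100, 100)

# Either tell np.cov that columns are the variables ...
cov_features = np.cov(X, rowvar=False)
print(cov_features.shape)  # (3, 3)

# ... or transpose so that features come first
cov_transposed = np.cov(X.T)
print(np.allclose(cov_features, cov_transposed))  # True
```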

david-cortes commented 5 years ago

By the way, be aware that this package is built around the idea of having one model per arm rather than having arm features. So if there are no further divisions by which you can group arms, you might want to use a different package (unfortunately, I don't know of any that handles that kind of model with arm features) or reimplement the algorithms yourself.
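To illustrate what "one model per arm" means, here is a minimal NumPy sketch. It is not the package's implementation: the helpers fit_per_arm and predict_greedy are hypothetical, use a plain ridge regression per arm, and only pick the highest-scoring arm without any exploration. X holds user features only, a holds the chosen arm per row, and r holds the observed reward.

```python
import numpy as np

def fit_per_arm(X, a, r, n_arms, reg=1.0):
    """Fit one ridge-regression weight vector per arm (hypothetical helper).

    Each arm's model sees only the rows where that arm was chosen.
    """
    n_features = X.shape[1]
    weights = np.zeros((n_arms, n_features))
    for arm in range(n_arms):
        rows = a == arm
        X_arm, r_arm = X[rows], r[rows]
        # Regularized normal equations: (X'X + reg * I) w = X'r
        A = X_arm.T @ X_arm + reg * np.eye(n_features)
        weights[arm] = np.linalg.solve(A, X_arm.T @ r_arm)
    return weights

def predict_greedy(weights, X):
    """Score every arm for every row and return the best arm per row."""
    scores = X @ weights.T  # shape (n_samples, n_arms)
    return scores.argmax(axis=1)

# Toy data: X contains user features only, with no arm features
X = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
a = np.array([0, 1, 0, 1])          # arm chosen for each row
r = np.array([1.0, 0.0, 0.0, 1.0])  # reward observed for each row

weights = fit_per_arm(X, a, r, n_arms=2)
preds = predict_greedy(weights, np.array([[1.0, 0.0], [0.0, 1.0]]))
print(preds)  # [0 1]
```

Because each arm has its own weight vector, the arm identity never needs to appear as a column of X.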

wenruij commented 5 years ago

@david-cortes Thanks for your comment.

Let me confirm again: we can pass only user features as the n_features in X (array (n_samples, n_features)) when calling fit(X, a, r), and that will train a model for each arm.

david-cortes commented 5 years ago

Yes, that’s correct.

There is one exception to that logic, though: if your arms are naturally divided into groups, and each group is constrained to present one pre-determined arm per round, you might still add arm features (e.g. if you are modeling users who can go to different cinemas to watch movies, but each cinema plays only one movie at a given time, and you know beforehand which one).

wenruij commented 5 years ago

Much appreciated. I get your point.