fix evaluateFullyLabeled

First of all, thank you for creating this library! I ran into a few issues while trying to use the evaluateFullyLabeled function, which I've fixed in this PR. Let me know if you'd like me to make any changes to this PR.

I tried to use code similar to

from sklearn.datasets import fetch_covtype
from sklearn.preprocessing import LabelBinarizer
from sklearn.utils import shuffle

from contextualbandits.evaluation import evaluateFullyLabeled
from contextualbandits.online import PartitionedTS

dataset = fetch_covtype(shuffle=True, random_state=0)
lb = LabelBinarizer()
labels = lb.fit_transform(dataset.target)
X, y = shuffle(dataset.data, labels, n_samples=10_000, random_state=0)

policy = PartitionedTS(nchoices=y.shape[1], min_samples_leaf=100, criterion="entropy")
mean_rewards = evaluateFullyLabeled(policy, X, y, online=False, shuffle=True, update_freq=500, random_state=1)
print(mean_rewards)

but ran into a few issues with the evaluateFullyLabeled function.

The first thing I ran into was that it expects update_freq to be a boolean instead of an integer:

Traceback (most recent call last):
  File "/Users/bartl/contextualbandits/test/test.py", line 14, in <module>
    evaluateFullyLabeled(policy, X, labels, online=False, shuffle=True, update_freq=500, random_state=0)
  File "/Users/bartl/contextualbandits/contextualbandits/lib/python3.12/site-packages/contextualbandits/evaluation.py", line 307, in evaluateFullyLabeled
    assert isinstance(update_freq, bool)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError

I fixed this by changing assert isinstance(update_freq, bool) to assert isinstance(update_freq, int)
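One side note on this check: in Python, bool is a subclass of int, so isinstance(True, int) also passes. If a stricter check is wanted, it could look roughly like the sketch below. The helper name check_update_freq is hypothetical and is not part of contextualbandits; it only illustrates the idea.

```python
import numbers


def check_update_freq(update_freq):
    """Hypothetical helper (not part of contextualbandits): validate a batch size."""
    # bool is a subclass of int, so it has to be rejected explicitly
    if isinstance(update_freq, bool) or not isinstance(update_freq, numbers.Integral):
        raise TypeError("update_freq must be an integer")
    if update_freq < 1:
        raise ValueError("update_freq must be positive")
    return int(update_freq)
```

With this, check_update_freq(500) returns 500, while check_update_freq(True) raises a TypeError.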

The next error I ran into was:

Traceback (most recent call last):
  File "/Users/bartl/contextualbandits/test/test.py", line 14, in <module>
    evaluateFullyLabeled(policy, X, y, online=False, shuffle=True, update_freq=500, random_state=1)
  File "/Users/bartl/contextualbandits/lib/python3.12/site-packages/contextualbandits/evaluation.py", line 335, in evaluateFullyLabeled
    for i in range(int(np.floor(features.shape[0]/batch_size))):
                                ^^^^^^^^
NameError: name 'features' is not defined

I fixed this by renaming features to X

Then I ran into this error:

Traceback (most recent call last):
  File "/Users/bartl/contextualbandits/test/test.py", line 14, in <module>
    evaluateFullyLabeled(policy, X, y, online=False, shuffle=True, update_freq=500, random_state=1)
  File "/Users/bartl/contextualbandits/lib/python3.12/site-packages/contextualbandits/evaluation.py", line 335, in evaluateFullyLabeled
    for i in range(int(np.floor(X.shape[0]/batch_size))):
                                           ^^^^^^^^^^
NameError: name 'batch_size' is not defined

I fixed this by renaming batch_size to update_freq

The next issue I ran into was:

Traceback (most recent call last):
  File "/Users/bartl/contextualbandits/test/test.py", line 14, in <module>
    evaluateFullyLabeled(policy, X, y, online=False, shuffle=True, update_freq=500, random_state=1)
  File "/Users/bartl/contextualbandits/lib/python3.12/site-packages/contextualbandits/evaluation.py", line 344, in evaluateFullyLabeled
    rewards_per_turn.append(rewards_per_turn.sum())
                            ^^^^^^^^^^^^^^^^^^^^
AttributeError: 'list' object has no attribute 'sum'

I fixed this by changing rewards_per_turn.append(rewards_per_turn.sum()) to: rewards_per_turn.append(batch_rewards.sum())

This led to the following error:

Traceback (most recent call last):
  File "/Users/bartl/contextualbandits/test/test.py", line 14, in <module>
    evaluateFullyLabeled(policy, X, y, online=False, shuffle=True, update_freq=500, random_state=1)
  File "/Users/bartl/contextualbandits/lib/python3.12/site-packages/contextualbandits/evaluation.py", line 350, in evaluateFullyLabeled
    policy.fit(X[:end,:], history_actions, y_onehot[np.arange(end), history_actions])
                                           ~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
IndexError: arrays used as indices must be of integer (or boolean) type

I fixed this by changing history_actions = np.array([]) to history_actions = np.array([], dtype=int)

I also added history_actions = np.append(history_actions, batch_actions) after fitting the initial model, so that the actions chosen for the first batch are included in the history.
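For context, np.array([]) defaults to float64, and np.append keeps that float dtype when integer actions are appended to it, which is why the array could not be used as an index. A minimal illustration with made-up data (the names mirror the ones above, but the values are invented for this sketch):

```python
import numpy as np

y_onehot = np.eye(3, dtype=int)          # toy one-hot labels: row i has class i

float_actions = np.array([])             # dtype defaults to float64
int_actions = np.array([], dtype=int)    # empty, but with an integer dtype

batch_actions = np.array([0, 2, 1])      # toy actions chosen for one batch

# np.append promotes to a common dtype: float + int -> float, int + int -> int
float_history = np.append(float_actions, batch_actions)
int_history = np.append(int_actions, batch_actions)

try:
    y_onehot[np.arange(3), float_history]
except IndexError as err:
    print("float index fails:", err)

# With integer indices, this picks the reward of the chosen action in each row
rewards = y_onehot[np.arange(3), int_history]
print(rewards)
```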

The final error I ran into was:

Traceback (most recent call last):
  File "/Users/bartl/contextualbandits/lib/python3.12/site-packages/joblib/_utils.py", line 72, in __call__
    return self.func(**kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/Users/bartl/contextualbandits/lib/python3.12/site-packages/joblib/parallel.py", line 598, in __call__
    return [func(*args, **kwargs)
            ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/bartl/contextualbandits/lib/python3.12/site-packages/contextualbandits/utils.py", line 971, in _decision_function_single
    preds[:, choice] = self.algos[choice].predict(X)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/bartl/contextualbandits/lib/python3.12/site-packages/contextualbandits/utils.py", line 1371, in predict
    pred_node = self.model.apply(X)
                ^^^^^^^^^^^^^^^^^^^
  File "/Users/bartl/contextualbandits/lib/python3.12/site-packages/sklearn/tree/_classes.py", line 581, in apply
    X = self._validate_X_predict(X, check_input)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/bartl/contextualbandits/lib/python3.12/site-packages/sklearn/tree/_classes.py", line 489, in _validate_X_predict
    X = self._validate_data(
        ^^^^^^^^^^^^^^^^^^^^
  File "/Users/bartl/contextualbandits/lib/python3.12/site-packages/sklearn/base.py", line 633, in _validate_data
    out = check_array(X, input_name="X", **check_params)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/bartl/contextualbandits/lib/python3.12/site-packages/sklearn/utils/validation.py", line 1087, in check_array
    raise ValueError(
ValueError: Found array with 0 sample(s) (shape=(0, 54)) while a minimum of 1 is required by DecisionTreeClassifier.

I fixed this by changing the offsets in

for i in range(int(np.floor(X.shape[0]/update_freq))):
     st=(i+1)*update_freq
     end=(i+2)*update_freq

to:

for i in range(1, int(np.floor(X.shape[0]/update_freq))):
     st=(i)*update_freq
     end=(i+1)*update_freq
     end=np.min([end, X.shape[0]])
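To see why the original offsets fail, it helps to list the slice bounds each version of the loop produces. The sketch below uses two hypothetical helpers, original_bounds and fixed_bounds, which only reproduce the st/end arithmetic from the two loops above; they are not functions in the library.

```python
def original_bounds(n_rows, update_freq):
    """Slice bounds (st, end) as computed by the loop before the change."""
    return [((i + 1) * update_freq, (i + 2) * update_freq)
            for i in range(n_rows // update_freq)]


def fixed_bounds(n_rows, update_freq):
    """Slice bounds (st, end) as computed by the loop after the change."""
    return [(i * update_freq, min((i + 1) * update_freq, n_rows))
            for i in range(1, n_rows // update_freq)]


n_rows, update_freq = 10_000, 500   # same sizes as in the example at the top

# Last slice of the old loop starts at n_rows, so X[st:end] is empty
print(original_bounds(n_rows, update_freq)[-1])
# Last slice of the new loop ends exactly at n_rows
print(fixed_bounds(n_rows, update_freq)[-1])
```

With 10,000 rows and update_freq=500, the last slice of the original loop starts at row 10,000, one past the final row, which matches the "0 sample(s)" error above. Note that in both versions any leftover rows beyond the last full multiple of update_freq are not visited.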

david-cortes / contextualbandits

fix evaluateFullyLabeled #71