Yes, the BanditRecommender has a predict_expectations method that can be used to return the expected rewards for any of the supported bandits.
The scores returned by the score pipeline function are transformed to be between 0 and 1 using the sigmoid function.
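For reference, a minimal sketch of how this could look. The item ids, responses, and contexts below are made up for illustration, and the fit / predict_expectations signatures are assumed to mirror MABWiser's (one arm-to-expectation dict per context):

import numpy as np
from mab2rec import BanditRecommender, LearningPolicy

# Hypothetical toy data: item ids, binary responses, and binarized user features
decisions = ["item_1", "item_2", "item_1", "item_3"]
rewards = [1, 0, 0, 1]
contexts = np.random.randint(0, 2, size=(4, 5))

# Fit a contextual bandit and query the expected reward of every arm
rec = BanditRecommender(learning_policy=LearningPolicy.LinUCB(alpha=1.25))
rec.fit(decisions, rewards, contexts)

# Expected reward per arm for two new contexts
expectations = rec.predict_expectations(np.random.randint(0, 2, size=(2, 5)))
print(expectations)  # assumed output: one {arm: expected reward} dict per context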
Got it... Is it possible to speed up the predictions? It takes quite a while to get predictions for even 10 contexts... Please suggest.
Also, why are some of the expected reward values coming back as NaN for some of the arms?
@bkleyn ?
Hi @ayush488, can you provide us with a minimal subset of your data that causes this issue? If there is an np.nan somewhere in your data, it might be causing this behavior.
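A quick way to check would be something like the following (plain pandas/numpy, nothing mab2rec-specific; the file names are those attached later in the thread):

import numpy as np
import pandas as pd

# Scan each input file for missing or infinite values
for name in ["trainng_interactions.csv", "user_ftrs.csv", "item_ftrs.csv"]:
    df = pd.read_csv(name)
    numeric = df.select_dtypes(include=[np.number]).to_numpy()
    print(name,
          "NaNs:", int(df.isna().sum().sum()),
          "Infs:", int(np.isinf(numeric).sum()))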
The data has no NaNs... it is all binarized features... will send the data...
Also, could you suggest how to speed things up?
Here is the data that I am using: item_ftrs.csv, trainng_interactions.csv, user_ftrs.csv
from mab2rec import BanditRecommender, LearningPolicy, NeighborhoodPolicy
recommenders = {
    "Random": BanditRecommender(learning_policy=LearningPolicy.Random()),
    "Popularity": BanditRecommender(learning_policy=LearningPolicy.Popularity()),
    "LinGreedy": BanditRecommender(learning_policy=LearningPolicy.LinGreedy(epsilon=0.1)),
    "LinUCB": BanditRecommender(learning_policy=LearningPolicy.LinUCB(alpha=10)),
    "LinTS": BanditRecommender(learning_policy=LearningPolicy.LinTS()),
    "ClustersTS": BanditRecommender(learning_policy=LearningPolicy.ThompsonSampling(),
                                    neighborhood_policy=NeighborhoodPolicy.Clusters(n_clusters=10))
}
from jurity.recommenders import BinaryRecoMetrics, RankingRecoMetrics
# Column names for the response, user, and item id columns
metric_params = {'click_column': 'score', 'user_id_column': 'ID', 'item_id_column': 'MailerID'}
# Evaluate performance at different k-recommendations
top_k_list = [5, 10, 15]
# List of metrics to benchmark
metrics = []
for k in top_k_list:
    metrics.append(BinaryRecoMetrics.AUC(**metric_params, k=k))
    metrics.append(BinaryRecoMetrics.CTR(**metric_params, k=k))
    metrics.append(RankingRecoMetrics.Precision(**metric_params, k=k))
    metrics.append(RankingRecoMetrics.Recall(**metric_params, k=k))
    metrics.append(RankingRecoMetrics.NDCG(**metric_params, k=k))
    metrics.append(RankingRecoMetrics.MAP(**metric_params, k=k))
from mab2rec.pipeline import benchmark
# Benchmark the set of recommenders for the list of metrics
# using training data and user features scored on test data
reco_to_results, reco_to_metrics = benchmark(recommenders,
                                             metrics=metrics,
                                             train_data=df_train,
                                             cv=5,
                                             user_features=df_users_X,
                                             item_features=df_mailers_X,
                                             user_id_col='ID',
                                             item_id_col='MailerID',
                                             response_col='response',
                                             batch_size=10000,
                                             verbose=True)
I am running the above code on the data I provided, but it errors at LinGreedy:
LinGreedy
Running...
Traceback (most recent call last):
File "C:\Users\ayush\Desktop\Rl\mabtest.py", line 393, in <module>
verbose=True)
File "C:\ProgramData\Anaconda3\envs\test_env\lib\site-packages\mab2rec\pipeline.py", line 443, in benchmark
recommendations, metrics = _bench(**args)
File "C:\ProgramData\Anaconda3\envs\test_env\lib\site-packages\mab2rec\pipeline.py", line 531, in _bench
recommendations[name])
File "C:\ProgramData\Anaconda3\envs\test_env\lib\site-packages\jurity\recommenders\combined.py", line 121, in get_score
return_extended_results)
File "C:\ProgramData\Anaconda3\envs\test_env\lib\site-packages\jurity\recommenders\auc.py", line 140, in get_score
return self._accumulate_and_return(results, batch_accumulate, return_extended_results)
File "C:\ProgramData\Anaconda3\envs\test_env\lib\site-packages\jurity\recommenders\base.py", line 121, in _accumulate_and_return
cur_result = self._get_results([results])
File "C:\ProgramData\Anaconda3\envs\test_env\lib\site-packages\jurity\recommenders\auc.py", line 146, in _get_results
return roc_auc_score(results[:, 0], results[:, 1])
File "C:\ProgramData\Anaconda3\envs\test_env\lib\site-packages\sklearn\metrics\_ranking.py", line 546, in roc_auc_score
y_score = check_array(y_score, ensure_2d=False)
File "C:\ProgramData\Anaconda3\envs\test_env\lib\site-packages\sklearn\utils\validation.py", line 800, in check_array
_assert_all_finite(array, allow_nan=force_all_finite == "allow-nan")
File "C:\ProgramData\Anaconda3\envs\test_env\lib\site-packages\sklearn\utils\validation.py", line 116, in _assert_all_finite
type_err, msg_dtype if msg_dtype is not None else X.dtype
ValueError: Input contains NaN, infinity or a value too large for dtype('float64').
I checked; none of my 3 inputs contain any NaN or Inf.
My responses are binary.
Unfortunately, I am not able to reproduce the error you are getting.
I did get another error since the training data you provided (trainng_interactions.csv) includes user IDs that do not occur in the user features (user_ftrs.csv). After subsetting the training data to only include users for which features are available, I was able to run the code below.
I did also notice that LinTS is quite slow as you mentioned elsewhere. Since LinTS requires the entire feature matrix to be inverted, I would not suggest using this algorithm when using hundreds of features.
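As a rough aside (plain numpy, not part of mab2rec): the cost of inverting the d x d context matrix grows roughly cubically in the number of features d, which is why LinTS slows down sharply once d reaches the hundreds:

import time
import numpy as np

# Time a single d x d matrix inversion for growing feature dimension d
for d in (50, 200, 800):
    A = np.random.rand(d, d) + d * np.eye(d)  # keep the matrix well conditioned
    start = time.perf_counter()
    np.linalg.inv(A)
    print(f"d={d}: {time.perf_counter() - start:.4f} seconds")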
See code and output below:
import pandas as pd
# Read data
df_train = pd.read_csv("trainng_interactions.csv")
df_users = pd.read_csv("user_ftrs.csv")
df_items = pd.read_csv("item_ftrs.csv")
# Subset train data to include only users with features
mask = df_train['ID'].isin(df_users['ID'])
df_train = df_train[mask]
from mab2rec import BanditRecommender, LearningPolicy, NeighborhoodPolicy
recommenders = {
    "Random": BanditRecommender(learning_policy=LearningPolicy.Random()),
    "Popularity": BanditRecommender(learning_policy=LearningPolicy.Popularity()),
    "LinGreedy": BanditRecommender(learning_policy=LearningPolicy.LinGreedy(epsilon=0.1)),
    "LinUCB": BanditRecommender(learning_policy=LearningPolicy.LinUCB(alpha=10)),
    # "LinTS": BanditRecommender(learning_policy=LearningPolicy.LinTS()),
    "ClustersTS": BanditRecommender(learning_policy=LearningPolicy.ThompsonSampling(),
                                    neighborhood_policy=NeighborhoodPolicy.Clusters(n_clusters=10))
}
from jurity.recommenders import BinaryRecoMetrics, RankingRecoMetrics
# Column names for the response, user, and item id columns
metric_params = {'click_column': 'score', 'user_id_column': 'ID', 'item_id_column': 'MailerID'}
# Evaluate performance at different k-recommendations
top_k_list = [5, 10, 15]
# List of metrics to benchmark
metrics = []
for k in top_k_list:
    metrics.append(BinaryRecoMetrics.AUC(**metric_params, k=k))
    metrics.append(BinaryRecoMetrics.CTR(**metric_params, k=k))
    metrics.append(RankingRecoMetrics.Precision(**metric_params, k=k))
    metrics.append(RankingRecoMetrics.Recall(**metric_params, k=k))
    metrics.append(RankingRecoMetrics.NDCG(**metric_params, k=k))
    metrics.append(RankingRecoMetrics.MAP(**metric_params, k=k))
from mab2rec.pipeline import benchmark
# Benchmark the set of recommenders for the list of metrics
# using training data and user features scored on test data
reco_to_results, reco_to_metrics = benchmark(recommenders,
                                             metrics=metrics,
                                             train_data=df_train,
                                             cv=5,
                                             user_features=df_users,
                                             item_features=df_items,
                                             user_id_col='ID',
                                             item_id_col='MailerID',
                                             response_col='response',
                                             batch_size=10000,
                                             verbose=True)
Output:
CV Fold = 1
>>> Random
Running...
Done: 0.01 minutes
>>> Popularity
Running...
Done: 0.01 minutes
>>> LinGreedy
Running...
Done: 0.05 minutes
>>> LinUCB
Running...
Done: 0.12 minutes
>>> ClustersTS
Running...
Done: 0.01 minutes
CV Fold = 2
>>> Random
Running...
Done: 0.00 minutes
>>> Popularity
Running...
Done: 0.01 minutes
>>> LinGreedy
Running...
Done: 0.05 minutes
>>> LinUCB
Running...
Done: 0.12 minutes
>>> ClustersTS
Running...
Done: 0.01 minutes
CV Fold = 3
>>> Random
Running...
Done: 0.01 minutes
>>> Popularity
Running...
Done: 0.01 minutes
>>> LinGreedy
Running...
Done: 0.06 minutes
>>> LinUCB
Running...
Done: 0.14 minutes
>>> ClustersTS
Running...
Done: 0.01 minutes
CV Fold = 4
>>> Random
Running...
Done: 0.01 minutes
>>> Popularity
Running...
Done: 0.01 minutes
>>> LinGreedy
Running...
Done: 0.05 minutes
>>> LinUCB
Running...
Done: 0.13 minutes
>>> ClustersTS
Running...
Done: 0.02 minutes
CV Fold = 5
>>> Random
Running...
Done: 0.01 minutes
>>> Popularity
Running...
Done: 0.00 minutes
>>> LinGreedy
Running...
Done: 0.05 minutes
>>> LinUCB
Running...
Done: 0.11 minutes
>>> ClustersTS
Running...
Done: 0.01 minutes
I was not able to upload the full data here... but with the full data, it is giving an error after running for a very long time on the LinGreedy (lin-epsilon) policy.
Is there a way you can try slicing your data in different ways (ex: dividing it into 5 parts and seeing if any one of those parts is causing the error)? It would be helpful to know whether there's a specific subset of the data that's causing you issues, or whether it's only happening with the full data. If there is a specific subset you can find, we can take a look at it and see whether we can reproduce the error on our side.
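For example, something along these lines might work (a rough sketch, not an official mab2rec utility, reusing the recommenders, metrics, and feature DataFrames from the snippets above):

import numpy as np
import pandas as pd

# Split the interactions into 5 chunks and benchmark each chunk separately
df_train = pd.read_csv("trainng_interactions.csv")
for i, idx in enumerate(np.array_split(df_train.index, 5)):
    part = df_train.loc[idx]
    try:
        benchmark(recommenders, metrics=metrics, train_data=part, cv=5,
                  user_features=df_users, item_features=df_items,
                  user_id_col='ID', item_id_col='MailerID',
                  response_col='response', batch_size=10000, verbose=False)
        print(f"chunk {i}: ran without error")
    except ValueError as err:
        print(f"chunk {i}: failed with: {err}")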
Will try the slicing approach now...
There were some user IDs in my interaction data that were not present in the user features, so I removed them and am running the code now...
What is the way to speed up benchmark()? Even the Popularity bandit is taking a lot of time... I have about 400k interactions, 370k users, and 894 items.
For Popularity, the number of items (arms) is likely the main culprit. Generally, run-time for bandit algorithms will scale linearly based on the number of items.
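One rough way to confirm (and mitigate) this on your side is to restrict the benchmark to the most frequent items and compare run-times; a hypothetical sketch, not an official mab2rec utility:

import pandas as pd

df_train = pd.read_csv("trainng_interactions.csv")
df_items = pd.read_csv("item_ftrs.csv")

# Keep only the 100 most frequent items (arms) before benchmarking
top_items = df_train['MailerID'].value_counts().head(100).index
df_train_small = df_train[df_train['MailerID'].isin(top_items)]
df_items_small = df_items[df_items['MailerID'].isin(top_items)]
# ... then pass df_train_small / df_items_small to benchmark() as before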
What is the way I can speed it up? Regards, Ayush
I have uploaded my full data here: https://www.dropbox.com/s/n4ruxzf82my1at9/test_interactions.zip?dl=0
I am still getting the NaN error in LinUCB; LinGreedy ran fine. If possible, can you take a look at your end?
Also, why is the output (predicted reward) coming out outside (0, 1) when all my training data has only 0/1 responses? Many values are negative and many are well over 1. Can you please explain?
The range of predicted expectations will depend on the selected learning policy. For example, LinUCB uses Ridge regression to estimate rewards as a linear combination of the user contexts, meaning expectations can fall outside of the [0, 1] range. Thompson Sampling, on the other hand, samples expectations from a beta distribution, which means expectations will all be between 0 and 1.
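If scores bounded in (0, 1) are needed for a linear policy, the raw expectations can be squashed with a sigmoid, which is what the score pipeline does. A small sketch with made-up expectation values:

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical raw LinUCB expectations per arm, some negative and some above 1
raw = {"item_1": -2.3, "item_2": 0.4, "item_3": 5.1}
bounded = {arm: sigmoid(value) for arm, value in raw.items()}
print(bounded)  # every value now lies strictly between 0 and 1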
This thread contains discussion unrelated to the original issue, so I am closing it.
Is it possible to get the estimated reward for a user_id and item_id from any of the bandits?