Having the same issue on certain models (e.g. bilinear neural network model).
The problem is that `model.predict(user_id)` is called separately for each user, which is time-consuming; see the source code.
@maciejkula Is there any way to get all item predictions for all users faster?
I mean, instead of iterating in a loop and getting predictions for each user, do it for all users in one step.
The current API doesn't support this type of prediction: if `user_ids` is an array, it has to be matched element-wise with `item_ids`.
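For concreteness, here is a minimal sketch of that pairing behaviour (assuming this is Spotlight's factorization API; the MovieLens helper is just example setup):

```python
import numpy as np
from spotlight.datasets.movielens import get_movielens_dataset
from spotlight.factorization.implicit import ImplicitFactorizationModel

# Fit a small model purely to illustrate predict()'s pairing semantics.
dataset = get_movielens_dataset(variant='100K')
model = ImplicitFactorizationModel(n_iter=1)
model.fit(dataset)

# When user_ids is an array, position i scores the pair
# (user_ids[i], item_ids[i]) -- one score per pair, not per user.
user_ids = np.array([1, 1, 2])
item_ids = np.array([10, 11, 10])
scores = model.predict(user_ids, item_ids)  # shape: (3,)
```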
This issue is more pronounced for the bilinear models because they do very little computation per user. For this and many other reasons I strongly recommend using the sequence models.
@amirj the easiest solution for you is to write your own batched predict implementation.
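For example, something along these lines (a rough sketch, assuming the fitted network exposes its embedding and bias layers the way Spotlight's `BilinearNet` does; the attribute names may differ across versions):

```python
import numpy as np
import torch


def batched_predict(model, user_ids, batch_size=1024):
    """Score all items for each user with one matrix product per batch.

    A sketch only: it assumes a fitted Spotlight-style bilinear model whose
    network (model._net) exposes user_embeddings/item_embeddings and
    user_biases/item_biases. Adapt the attribute names to your model.
    """
    net = model._net
    item_emb = net.item_embeddings.weight.detach()          # (n_items, dim)
    item_bias = net.item_biases.weight.detach().squeeze(1)  # (n_items,)
    device = item_emb.device

    chunks = []
    with torch.no_grad():
        for start in range(0, len(user_ids), batch_size):
            batch = torch.as_tensor(user_ids[start:start + batch_size],
                                    dtype=torch.long, device=device)
            user_emb = net.user_embeddings(batch)            # (b, dim)
            user_bias = net.user_biases(batch).squeeze(1)    # (b,)
            # One matrix product scores every item for the whole batch.
            scores = user_emb @ item_emb.t() + item_bias + user_bias[:, None]
            chunks.append(scores.cpu().numpy())

    return np.concatenate(chunks)  # (len(user_ids), n_items)
```

The point of the batching is to replace per-user dot products with a few large GEMMs, which is where the GPU actually gets used.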
I have a dataset containing 5,115,123 training samples and 1,278,781 test samples (the default train/test split ratio of 0.2). Training each model on the GPU (at 100% utilization) takes only a few minutes, but when I run the following command:
```python
mrr_baseline = mrr_score(model_baseline, dataset_test).mean()
```
it takes hours, and GPU utilization sits at only about 20%. Why? Do you have any suggestions for making prediction faster?
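Building on the batched predict sketch above, a batched MRR might look something like this (again a sketch, assuming a Spotlight-style `Interactions` test set with a `.tocsr()` method):

```python
import numpy as np
import scipy.stats as st


def batched_mrr(model, test, batch_size=1024):
    """Mean reciprocal rank from batched score matrices instead of one
    predict() call per user. Assumes batched_predict() from the sketch
    above and a Spotlight-style Interactions object with .tocsr().
    """
    test_csr = test.tocsr()
    mrrs = []

    for start in range(0, test_csr.shape[0], batch_size):
        users = np.arange(start, min(start + batch_size, test_csr.shape[0]))
        scores = batched_predict(model, users, batch_size=len(users))
        for i, user_id in enumerate(users):
            row = test_csr[user_id]
            if row.getnnz() == 0:
                continue  # no test interactions for this user
            # Rank of each held-out item among all items (rank 1 = top score).
            ranks = st.rankdata(-scores[i])[row.indices]
            mrrs.append((1.0 / ranks).mean())

    return np.array(mrrs)
```

This could then be called as `batched_mrr(model_baseline, dataset_test).mean()` in place of the per-user evaluation loop.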