massquantity / LibRecommender

Versatile End-to-End Recommender System
https://librecommender.readthedocs.io/
MIT License

Problem installation #30

Open jselma opened 3 years ago

jselma commented 3 years ago

Hi @massquantity great library!!!

Please help: I get the following error (D8021) when installing version 0.2.0 with pip install LibRecommender==0.2.0:

    building 'libreco.algorithms._bpr' extension
    creating build\temp.win-amd64-3.7
    creating build\temp.win-amd64-3.7\Release
    creating build\temp.win-amd64-3.7\Release\libreco
    creating build\temp.win-amd64-3.7\Release\libreco\algorithms
    C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.28.29333\bin\HostX86\x64\cl.exe
        /c /nologo /Ox /W3 /GL /DNDEBUG /MD
        -Ic:\users\jselma\anaconda3\envs\inteldistribution\lib\site-packages\numpy\core\include
        -Ic:\users\jselma\anaconda3\envs\inteldistribution\include
        -Ic:\users\jselma\anaconda3\envs\inteldistribution\include
        "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.28.29333\include"
        "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um"
        "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\ucrt"
        "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\shared"
        "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\um"
        "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\winrt"
        "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\cppwinrt"
        -IC:\Users\jselma\anaconda3\envs\intelDistribution\Library\include
        /EHsc /Tplibreco\algorithms\_bpr.cpp /Fobuild\temp.win-amd64-3.7\Release\libreco\algorithms\_bpr.obj
        -Wno-unused-function -Wno-maybe-uninitialized -O3 -ffast-math -fopenmp -std=c++11
    cl : Command line error D8021 : invalid numeric argument '/Wno-unused-function'
    error: command 'C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.28.29333\bin\HostX86\x64\cl.exe' failed with exit status 2

    ERROR: Failed building wheel for LibRecommender
    Running setup.py clean for LibRecommender
    Failed to build LibRecommender
    Installing collected packages: LibRecommender
      Running setup.py install for LibRecommender ... error

massquantity commented 3 years ago

See this issue. The likely cause is that your system is Windows... You can clone all the code and cd into the LibRecommender folder; then you may be able to use the library directly.
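A minimal sketch of that approach (the clone path below is just a placeholder for wherever you put the repository; only the pure-Python parts will work this way, as discussed further down):

    # Minimal sketch: make a cloned LibRecommender source tree importable
    # without installing it. Replace the path with your actual clone location.
    import sys
    sys.path.insert(0, r"C:\path\to\LibRecommender")

    from libreco.data import DatasetPure  # pure-Python parts import fine this way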

jselma commented 3 years ago

> See this issue. The likely cause is that your system is Windows... You can clone all the code and cd into the LibRecommender folder; then you may be able to use the library directly.

Hi @massquantity, thanks for your quick reply. I have two questions and hope you can help me: which folder should I copy, and into which folder should I paste it?


jselma commented 3 years ago

Could you correct the "setup.py" file to make it compatible with Windows?


massquantity commented 3 years ago

Alright, I've released a new version, so you can give it a try.

pip install LibRecommender==0.2.2

jselma commented 3 years ago

Hi @massquantity, I was able to use the library by copying it into the Python environment folder; however, the Cython-based ALS does not load. Other algorithms such as SVD++ work without problems.

How can I use ALS?


massquantity commented 3 years ago

ALS is implemented in Cython, and SVD++ is implemented in TensorFlow. When you just copy the library, the ALS Cython extension doesn't get built. To use ALS, you have to install from pip.

jselma commented 3 years ago

> Alright, I've released a new version, so you can give it a try.
>
> pip install LibRecommender==0.2.2

The installation works perfectly!! Thank you very much!!

jselma commented 3 years ago

> ALS is implemented in Cython, and SVD++ is implemented in TensorFlow. When you just copy the library, the ALS Cython extension doesn't get built. To use ALS, you have to install from pip.

Hi @massquantity... It works and trains perfectly with ALS. Great work, thanks.

How do you order users and items for predictions after training? I imagine the function split_by_ratio or DatasetPure.build_trainset generates its own indexes.

User 100444 exists in my dataset, but cannot be found.

User 12923 works, but I don't know which user it corresponds to in my dataset.

jselma commented 3 years ago

Hi @massquantity. The value of precision@k seems very strange to me, given that I am using 300 factors and an alpha of 40. It indicates a precision of less than 1 relevant item out of 10 recommended items, in contrast to the map value (0.2192).

My dataset has around 1,500,000 observations. In the "label" column there are values between 1 and 20 that represent the number of reproductions per item, and the data has a sparsity of 8%.

There are 40,000 users and 500 movies.

What am I doing wrong?


massquantity commented 3 years ago

Someone has encountered the same problem... See this issue

> How do you order users and items for predictions after training? I imagine the function split_by_ratio or DatasetPure.build_trainset generates its own indexes.
>
> User 100444 exists in my dataset, but cannot be found.
>
> User 12923 works, but I don't know which user it corresponds to in my dataset.
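To illustrate the kind of id-mapping involved, here is a sketch of how raw ids typically get re-indexed to consecutive internal indexes. The dictionary names below are hypothetical illustrations, not actual LibRecommender attributes:

    # Illustration only: raw ids -> consecutive internal indexes, as done
    # internally when the train set is built. Names here are hypothetical.
    raw_users = [12923, 100444, 7]                  # ids as they appear in your data
    user2id = {u: i for i, u in enumerate(dict.fromkeys(raw_users))}
    id2user = {i: u for u, i in user2id.items()}

    print(user2id[100444])   # internal index the model actually uses
    print(id2user[1])        # map an internal index back to the raw id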

massquantity commented 3 years ago

> My dataset has around 1,500,000 observations. In the "label" column there are values between 1 and 20 that represent the number of reproductions per item, and the data has a sparsity of 8%.

Sorry, I don't understand what you mean by "number of reproductions per item". Could you show a snapshot of your data? Since you are using task="ranking", I assume your data is implicit; in that case the label should be either 0 or 1.

jselma commented 3 years ago

> My dataset has around 1,500,000 observations. In the "label" column there are values between 1 and 20 that represent the number of reproductions per item, and the data has a sparsity of 8%.
>
> Sorry, I don't understand what you mean by "number of reproductions per item". Could you show a snapshot of your data? Since you are using task="ranking", I assume your data is implicit; in that case the label should be either 0 or 1.

Hi @massquantity, thank you.

It is implicit information (the data looks like the attached screenshot), but I need to use the number of movie plays for the confidence parameter (C_ui = 1 + alpha * R_ui).


jselma commented 3 years ago

> Someone has encountered the same problem... See this issue
>
> How do you order users and items for predictions after training? I imagine the function split_by_ratio or DatasetPure.build_trainset generates its own indexes. User 100444 exists in my dataset, but cannot be found. User 12923 works, but I don't know which user it corresponds to in my dataset.

Thank you!! It works for me with the training data, but not with the evaluation data, because I don't have that information for it. This line is not supported: eval_data, data_info_eval = DatasetPure.build_evalset(eval_data)

massquantity commented 3 years ago

> The value of precision@k seems very strange to me, given that I am using 300 factors and an alpha of 40. It indicates a precision of less than 1 relevant item out of 10 recommended items, in contrast to the map value (0.2192).
>
> My dataset has around 1,500,000 observations. In the "label" column there are values between 1 and 20 that represent the number of reproductions per item, and the data has a sparsity of 8%.
>
> There are 40,000 users and 500 movies.
>
> What am I doing wrong?

I see. Well, you are doing nothing wrong here, since it is possible for the map value to be much higher than precision. Imagine this scenario: I get a recommendation list of 10, and only one item is predicted correctly:

0 0 0 1 0 0 0 0 0 0

Then the precision@10 will be 1/10 = 0.1, and the map@10 will be 1/4 = 0.25. That's why map can be thought of as a ranking metric: different positions result in different map values, even if the length of the recommendation list is the same.
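A small self-contained sketch of that calculation (not the library's evaluation code; average-precision definitions vary slightly, and the one below matches the numbers above):

    # Illustration: precision@k vs. average precision (the per-user term in MAP)
    # for a binary relevance list whose single hit sits at position 4.
    def precision_at_k(rels, k):
        return sum(rels[:k]) / k

    def average_precision_at_k(rels, k):
        hits, score = 0, 0.0
        for pos, rel in enumerate(rels[:k], start=1):
            if rel:
                hits += 1
                score += hits / pos      # precision at each hit position
        return score / hits if hits else 0.0

    rels = [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
    print(precision_at_k(rels, 10))          # 0.1
    print(average_precision_at_k(rels, 10))  # 0.25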

massquantity commented 3 years ago

> Thank you!! It works for me with the training data, but not with the evaluation data, because I don't have that information for it. This line is not supported: eval_data, data_info_eval = DatasetPure.build_evalset(eval_data)

The id-mapping is the same for both train and eval data. If you can't get a user's id from the id-mapping for the eval data, that's because some users only appear in the eval data and not in the train data. This is basically a cold-start problem, and this library can't deal with it right now.

jselma commented 3 years ago

> The value of precision@k seems very strange to me, given that I am using 300 factors and an alpha of 40. It indicates a precision of less than 1 relevant item out of 10 recommended items, in contrast to the map value (0.2192). My dataset has around 1,500,000 observations. In the "label" column there are values between 1 and 20 that represent the number of reproductions per item, and the data has a sparsity of 8%. There are 40,000 users and 500 movies. What am I doing wrong?

> I see. Well, you are doing nothing wrong here, since it is possible for the map value to be much higher than precision. Imagine this scenario: I get a recommendation list of 10, and only one item is predicted correctly:
>
> 0 0 0 1 0 0 0 0 0 0
>
> Then the precision@10 will be 1/10 = 0.1, and the map@10 will be 1/4 = 0.25. That's why map can be thought of as a ranking metric: different positions result in different map values, even if the length of the recommendation list is the same.

OK, thank you.


Thanks. The strange thing is that if I modify the parameters, precision@k does not increase above 0.20. At some point it should keep increasing until the model overfits.

Should I normalize the "plays" column and define above which value an interaction counts as relevant (1 or 0 for precision)? In this case, relevant means plays greater than 1. Should I set this in the line train_data.build_negative_samples(data_info, item_gen_mode="random", num_neg=1, seed=2020)? Or should I set the values that are not greater than 1 to 0 directly in the initial dataset?

Thank you.

massquantity commented 3 years ago

I'm gonna tell you, man, you've touched a very subtle point in this library :) As you may already know, ALS is quite different from other algorithms in the library such as SVD, because it leverages all possible user-item pairs to train the model. This becomes problematic when we evaluate the model.

Due to computational constraints we can't use all user-item pairs for evaluation, so I use the same procedure as the other algorithms, i.e. treat all the samples in the original data as positive samples labeled 1, then sample some data as negative samples labeled 0. The advantage is that it makes comparing different algorithms easy. But the problem is that it can't reflect the frequency information (the "plays" column in your data), whereas during training we do take it into account. This makes the goals of training and evaluation diverge.

Maybe this explains the weird situation you've encountered with precision@k. If all you want is to increase precision, then try decreasing alpha. When alpha reaches 0, all positive samples have a confidence of 1, which makes the task more or less a binary classification problem, for which precision is a suitable metric.
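A tiny illustration of the confidence weighting C_ui = 1 + alpha * R_ui mentioned in this thread (not library code): with alpha = 0, every observed interaction collapses to the same confidence of 1.

    # Illustration only: effect of alpha on the confidence C_ui = 1 + alpha * R_ui.
    import numpy as np

    plays = np.array([1, 3, 20])          # R_ui values ("plays" column)
    for alpha in (40, 10, 0):
        print(alpha, 1 + alpha * plays)   # alpha = 0 -> [1 1 1]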

jselma commented 3 years ago

> I'm gonna tell you, man, you've touched a very subtle point in this library :) As you may already know, ALS is quite different from other algorithms in the library such as SVD, because it leverages all possible user-item pairs to train the model. This becomes problematic when we evaluate the model.
>
> Due to computational constraints we can't use all user-item pairs for evaluation, so I use the same procedure as the other algorithms, i.e. treat all the samples in the original data as positive samples labeled 1, then sample some data as negative samples labeled 0. The advantage is that it makes comparing different algorithms easy. But the problem is that it can't reflect the frequency information (the "plays" column in your data), whereas during training we do take it into account. This makes the goals of training and evaluation diverge.
>
> Maybe this explains the weird situation you've encountered with precision@k. If all you want is to increase precision, then try decreasing alpha. When alpha reaches 0, all positive samples have a confidence of 1, which makes the task more or less a binary classification problem, for which precision is a suitable metric.

Hi @massquantity, so to indicate that values less than or equal to 1 are not relevant (if R_ui <= 1 then p = 0, and if R_ui > 1 then p = 1), should I delete from the original dataset the rows where the value of "plays" is not greater than 1?

Another question: how should I interpret assigning the value 1 or 5 to the num_neg parameter? What is the difference between train_data.build_negative_samples(data_info, item_gen_mode="random", num_neg=1, seed=2020) and train_data.build_negative_samples(data_info, item_gen_mode="random", num_neg=5, seed=2020)?

Thanks!!!!!

massquantity commented 3 years ago

You are right. You can delete these rows if you want to treat them as irrelevant instances.

The num_neg parameter controls how many negative instances are sampled for each positive instance. Suppose you have 1 million instances in the original data and num_neg = 5; then you'll get 5 million negative instances.
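For instance, a sketch of those two steps, using the calls already mentioned in this thread and a toy DataFrame standing in for your real data (the user/item/label column convention and the values are placeholders):

    # Sketch under assumptions: "label" holds the play counts, as in your data.
    import pandas as pd
    from libreco.data import DatasetPure

    df = pd.DataFrame({"user": [1, 1, 2, 3],
                       "item": [10, 11, 10, 12],
                       "label": [5, 1, 3, 2]})   # play counts

    df = df[df["label"] > 1]   # drop the rows you consider irrelevant (plays <= 1)

    train_data, data_info = DatasetPure.build_trainset(df)
    # one sampled negative (label 0) per positive; num_neg=5 would add five per positive
    train_data.build_negative_samples(data_info, item_gen_mode="random",
                                      num_neg=1, seed=2020)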

jselma commented 3 years ago

> You are right. You can delete these rows if you want to treat them as irrelevant instances.
>
> The num_neg parameter controls how many negative instances are sampled for each positive instance. Suppose you have 1 million instances in the original data and num_neg = 5; then you'll get 5 million negative instances.

Thanks!!! @massquantity. I cannot increase the precision. I have modified the hyperparameters (alpha, factors and regularization), even with alpha = 0 for binary classification, but the metric does not improve. What am I doing wrong? (This is my code: https://github.com/jselma/test/blob/main/libreco_test.py)

Is the test always run on 52 cases? (See the attached screenshot.)

(screenshot of the training/evaluation output attached)

massquantity commented 3 years ago

Oh sorry... How could I forget this! The evaluation process typically needs to compute recommendations for every user, which becomes very slow when the total number of users is large, especially with the more sophisticated deep learning models. So I adopted a small trick in evaluation, i.e. sampling only a portion of the users and computing recommendations for them.

The number 52 in your picture is the number of eval batches. One batch contains 8192 samples, so the total eval data size is about 425 thousand. The number 2048 below is the sampled eval user size I mentioned above, and I think you can get a better result if you use more users to evaluate.

This problem didn't occur to me when I was writing the library, so I didn't provide an argument to change the sampled user size. But you can change the source code directly; it's in libreco/evaluate/evaluate.py, line 101:

    def print_metrics(self, train_data=None, eval_data=None, metrics=None,
                      eval_batch_size=8192, k=10, sample_user_num=2048,
                      **kwargs):

You can change the sample_user_num argument to the number you want, but don't exceed the number of users in the original data. The k argument is the k in precision@k, which is also tunable.

Also, if you just want to evaluate on the test data, you can call evaluate directly:

print(als.evaluate(test_data, metrics=["precision"], sample_user_num=1000))
jselma commented 3 years ago

> Oh sorry... How could I forget this! The evaluation process typically needs to compute recommendations for every user, which becomes very slow when the total number of users is large, especially with the more sophisticated deep learning models. So I adopted a small trick in evaluation, i.e. sampling only a portion of the users and computing recommendations for them.
>
> The number 52 in your picture is the number of eval batches. One batch contains 8192 samples, so the total eval data size is about 425 thousand. The number 2048 below is the sampled eval user size I mentioned above, and I think you can get a better result if you use more users to evaluate.
>
> This problem didn't occur to me when I was writing the library, so I didn't provide an argument to change the sampled user size. But you can change the source code directly; it's in libreco/evaluate/evaluate.py, line 101:
>
>     def print_metrics(self, train_data=None, eval_data=None, metrics=None,
>                       eval_batch_size=8192, k=10, sample_user_num=2048,
>                       **kwargs):
>
> You can change the sample_user_num argument to the number you want, but don't exceed the number of users in the original data. The k argument is the k in precision@k, which is also tunable.
>
> Also, if you just want to evaluate on the test data, you can call evaluate directly:
>
> print(als.evaluate(test_data, metrics=["precision"], sample_user_num=1000))

@massquantity Thank you very much for your help and patience!!

massquantity commented 3 years ago

You're welcome. In fact your questions helped me improve the library too:)

jselma commented 3 years ago

It is an excellent library, and it has been very useful to me!!

Do you have a method to retrieve similar items and users?

massquantity commented 3 years ago

I don't have such a method. But you can directly get all user and item embeddings (which are in numpy array format) and compute a similarity metric such as the dot product.


>>> user_embed = als.user_embed
>>> item_embed = als.item_embed
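A short sketch of that idea (not part of LibRecommender), assuming a trained model named als; depending on the version, the embedding matrices may include an extra bias column, so treat this purely as an illustration:

    # Sketch: find the items most similar to a given internal item id
    # via cosine similarity over the learned item embeddings.
    import numpy as np

    item_embed = als.item_embed                               # shape: (n_items, embed_dim)
    norms = np.linalg.norm(item_embed, axis=1, keepdims=True)
    normed = item_embed / np.maximum(norms, 1e-12)            # row-normalize

    def most_similar_items(item_id, k=10):
        sims = normed @ normed[item_id]                       # cosine similarity to all items
        top = np.argsort(-sims)[1:k + 1]                      # skip the item itself
        return list(zip(top, sims[top]))

    print(most_similar_items(0, k=5))
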
jselma commented 3 years ago

> I don't have such a method. But you can directly get all user and item embeddings (which are in numpy array format) and compute a similarity metric such as the dot product.
>
> >>> user_embed = als.user_embed
> >>> item_embed = als.item_embed

Thanks!!! I have a new question: which confidence formulation does the algorithm use, 1 or 2? (See the attached screenshot.)

My best regularization is 150, which seems very high to me. Should I have scaled the rating data beforehand, or does the algorithm use method 2 and this is simply the best regularization value?

Thanks!

(screenshot of the two candidate confidence formulas attached)

massquantity commented 3 years ago

The algorithm uses the first one. I don't know whether that is the best choice. Maybe your data is very sparse and every user only has a few interactions, so the regularization needs to be high to avoid overfitting.

jselma commented 3 years ago

> The algorithm uses the first one. I don't know whether that is the best choice. Maybe your data is very sparse and every user only has a few interactions, so the regularization needs to be high to avoid overfitting.

Thanks!!!!!

jselma commented 3 years ago

Hi, is there a method available that explains the recommendations?


massquantity commented 3 years ago

No, this library doesn't deal with the explanation problem...