takuti / flurs

FluRS: A Python library for streaming recommendation algorithms
https://flurs.readthedocs.io/
MIT License

TypeError: no supported conversion for types: (dtype('<U32'),) #10

Closed: msramalho closed this 2 years ago

msramalho commented 5 years ago

I have a dataset of (user, item, context) records that I have mapped to a list of Event objects with:

Event(User(e.user), Item(e.item), context=np.array([e.context]))

I am using an FM (factorization machine) recommender, but I don't know what p means here, so I tried the number of possible values in the context column:

recommender = FMRecommender(p=count_context_column_values)

Then I call:

recommender.initialize()

and then try to use it in an Evaluator:

evaluator = Evaluator(recommender, repeat=False)
for x in evaluator.evaluate(events_list):
    print(x)

But I always get the error:

TypeError: no supported conversion for types: (dtype('<U32'),)

From line 126 of the score method: u_mat = sp.csr_matrix(np.repeat(u_vec, n_target, axis=1))

Questions

  1. Why is this happening, and how can I fix it?
  2. What should I use for p?
  3. Does repeat=False in the evaluator mean that repeated entries in the dataset are not considered?

Thank you for your help and time.

P.S.: I've also tried without the Evaluator, calling recommender.recommend myself. It either complained about an index out of range in i_mat or, when I passed candidates that were empty or contained a single 0 value, gave the same error as described above.

msramalho commented 5 years ago

I've found that the problem was that I was passing strings instead of integers.

However, a new error now occurs:

IndexError: index (1) out of range

This occurs in i_mat and is the same error I mentioned previously.
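The string-versus-number problem described above can be spotted before any events are built by checking the dtype of the arrays involved. The following is a minimal sketch that uses only NumPy; `to_numeric_context` is a hypothetical helper written for illustration and is not part of FluRS.

```python
import numpy as np


def to_numeric_context(raw_values):
    """Hypothetical helper: turn raw (possibly string) values into a 1-D float array."""
    return np.asarray(raw_values).astype(float).ravel()


raw = ["3", "7"]  # e.g. values read from a CSV file without type conversion

as_strings = np.array(raw)             # NumPy infers a unicode dtype such as '<U1'
as_numbers = to_numeric_context(raw)   # explicit conversion to float

print(as_strings.dtype.kind)  # 'U' means unicode strings
print(as_numbers.dtype.kind)  # 'f' means floating point
```

A dtype kind of 'U' is the same family as the `<U32` in the error message, so a check like this on the raw data is a quick way to confirm that a conversion is needed.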

takuti commented 5 years ago

p is the number of dimensions of an input vector. That is, p equals len(context).

repeat tells the evaluator whether a user may interact with the same item repeatedly. If it is False, the recommender in the evaluator does NOT recommend an item to a user who has already interacted with it. While repeat=True fits many realistic scenarios such as e-commerce, there are exceptions such as the MovieLens data, which does not contain multiple ratings for a user-item pair.
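The effect of repeat can be pictured as a filter on the candidate items. The following is only a conceptual sketch in plain Python; `filter_candidates` and `known_items` are hypothetical names, and this is not FluRS's actual implementation.

```python
def filter_candidates(candidates, known_items, repeat):
    """Hypothetical illustration of the repeat flag.

    candidates:  item indices that could be recommended
    known_items: items the user has already interacted with
    repeat:      if False, already-seen items are dropped
    """
    if repeat:
        return list(candidates)
    return [item for item in candidates if item not in known_items]


candidates = [0, 1, 2, 3]
known_items = {1, 3}

no_repeat = filter_candidates(candidates, known_items, repeat=False)
with_repeat = filter_candidates(candidates, known_items, repeat=True)

print(no_repeat)    # items 1 and 3 are excluded
print(with_repeat)  # all candidates are kept
```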

For the out-of-range error, can you check whether Event.context is a 1-D array? Looking at the code below, I guess e.context is already an array, so Event.context ends up as a 2-D array because of np.array([e.context]).

Event(User(e.user), Item(e.item), context=np.array([e.context]))

If this is the case, you can do this instead: context=np.array(e.context)
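The shape difference described above, and the relation between p and the context length, can be checked with NumPy alone. This is a minimal sketch; `raw_context` stands in for e.context and its values are made up for illustration.

```python
import numpy as np

# Stand-in for e.context, assumed to already be a 1-D array of 3 features.
raw_context = np.array([0.2, 0.5, 0.1])

wrapped = np.array([raw_context])  # extra brackets add a leading axis
flat = np.array(raw_context)       # keeps the original 1-D shape

print(wrapped.shape)  # 2-D: one row, three columns
print(flat.shape)     # 1-D: three elements

# p is the number of dimensions of the input vector, i.e. len(context).
p = len(flat)
print(p)
```

Note that len(wrapped) would be 1 rather than 3, so a wrapped context also gives a misleading value for p.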

msramalho commented 5 years ago

Hey @takuti, many thanks for the help. I've managed to get this working by using user_features instead of context, and by setting use_index=True when initializing the recommender:

recommender.initialize(use_index=True)

However, I still don't understand what use_index does.

takuti commented 2 years ago

@msramalho Sorry for not responding after you came up with a solution. Just a heads-up: #12 fixes the issue you reported here and adds a proper docstring that answers your question.