yusanshi / news-recommendation

Implementations of some methods in news recommendation.
MIT License

Error while Evaluating #29

Closed Izorar closed 2 years ago

Izorar commented 2 years ago

Hello @yusanshi, there is an indexing issue in the evaluate file, specifically at line 262:

  File "src/evaluate.py", line 262, in <listcomp>
    int(news[0].split('-')[1]) for news in minibatch['impressions']
IndexError: list index out of range

This happens on the MIND large dataset. Thanks.

yusanshi commented 2 years ago

That's because the test set of the MIND large dataset has no labels, so the test file is not in the format the code expects (the impression entries carry no label for split('-')[1] to read). See https://github.com/yusanshi/news-recommendation/issues/11 and https://github.com/msnews/MIND/issues/8.

... and it seems that there is nothing we can do :cry:
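To make the failure concrete, here is a minimal sketch of how a labeled and an unlabeled impression entry behave under the expression from the traceback. The news IDs are illustrative, and split_impression is a hypothetical helper, not a function from this repo.

```python
from typing import Optional, Tuple


def split_impression(item: str) -> Tuple[str, Optional[int]]:
    """Split one impression entry into (news_id, label).

    Labeled entries look like "N55689-1". Unlabeled test entries look like
    "N55689", in which case the label is returned as None.
    """
    news_id, sep, label = item.partition('-')
    return news_id, (int(label) if sep else None)


labeled = "N55689-1 N35729-0".split()   # train/dev style (illustrative IDs)
unlabeled = "N55689 N35729".split()     # MIND large test style, no labels

print([split_impression(x) for x in labeled])
print([split_impression(x) for x in unlabeled])

# The expression from evaluate.py fails on unlabeled entries, because
# "N55689".split('-') has only one element:
try:
    [int(x.split('-')[1]) for x in unlabeled]
except IndexError as error:
    print("IndexError:", error)
```

With labels absent, there is nothing to compare the scores against locally, which is why metrics cannot be computed on this test set.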

Izorar commented 2 years ago

@yusanshi. Does it mean the labels were not released by the authors of dataset/task?

Izorar commented 2 years ago

And if the case is that it was not released? How do we evaluate? Thanks

yusanshi commented 2 years ago

> @yusanshi. Does it mean the labels were not released by the authors of dataset/task?

Exactly.

> And if the case is that it was not released? How do we evaluate? Thanks

Please see https://msnews.github.io/ and https://competitions.codalab.org/competitions/24122#participate. Basically, you need to upload the inference results to the online evaluation platform. To generate the results in the required file format, you will have to make some changes to the code. An earlier version of the repo has some code that may be helpful: https://github.com/yusanshi/news-recommendation/blob/fe11cef92682c1030eed35489e25cce10cffa5f3/src/evaluate.py#L259-L262
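As a rough sketch of what writing the inference results could look like: this assumes each output line holds an impression ID followed by the 1-based rank of every candidate in that impression (highest score gets rank 1), e.g. "1 [3,1,2]". Please check the competition page for the exact requirements. All function names here are hypothetical and not taken from the repo.

```python
from typing import Dict, List


def scores_to_ranks(scores: List[float]) -> List[int]:
    """Return the 1-based rank of each candidate; the highest score gets rank 1."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranks = [0] * len(scores)
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    return ranks


def format_prediction_line(impression_id: int, scores: List[float]) -> str:
    """Format one impression as '<id> [r1,r2,...]' (assumed submission format)."""
    ranks = scores_to_ranks(scores)
    return f"{impression_id} [{','.join(str(r) for r in ranks)}]"


def write_predictions(predictions: Dict[int, List[float]], path: str) -> None:
    """Write one line per impression, ordered by impression ID."""
    with open(path, 'w') as f:
        for impression_id in sorted(predictions):
            line = format_prediction_line(impression_id, predictions[impression_id])
            f.write(line + '\n')


# Candidate scores in the order they appear in the impression.
print(format_prediction_line(1, [0.1, 0.9, 0.5]))  # 1 [3,1,2]
```

In evaluate.py the scores would come from the model's click predictions for each impression; write_predictions(results, './data/test/prediction.txt') would then produce the file to upload.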

Izorar commented 2 years ago

Thanks. I am going to try that out

Izorar commented 2 years ago

There is still an issue with the model

Traceback (most recent call last):
  File "src/evaluate.py", line 289, in <module>
    './data/test/prediction.txt')
  File "/home/izorar/.local/lib/python3.7/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "src/evaluate.py", line 185, in evaluate
    news_vector = model.get_news_vector(minibatch)
  File "/home/izorar/news-recommendation/src/model/TANR/__init__.py", line 82, in get_news_vector
    return self.news_encoder(news)
  File "/home/izorar/.local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/izorar/news-recommendation/src/model/TANR/news_encoder.py", line 40, in forward
    title_vector = F.dropout(self.word_embedding(news['title'].to(device)),
AttributeError: 'list' object has no attribute 'to'

yusanshi commented 2 years ago

It looks like you're using an old version of the code. Please use the latest code, i.e., the newest commit on the master branch. Otherwise I don't know which code you're running, so I can't give any suggestions.

yusanshi commented 2 years ago

The error is AttributeError: 'list' object has no attribute 'to': news_encoder.py assumes that news['title'] is a torch tensor rather than a Python list. The conversion from list to tensor should already have been done elsewhere (dataset.py), so one possible cause is a bug in the old version of dataset.py. Please git pull to apply all the changes. If this happens again, we can investigate further.
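If it does happen again, a quick check like the following might help narrow it down. It is only a sketch: it assumes the minibatch is a dict (as news['title'] suggests) and reports which fields lack a .to method. fields_without_to and FakeTensor are hypothetical names; FakeTensor stands in for torch.Tensor so the snippet runs without PyTorch.

```python
from typing import Any, Dict, List


def fields_without_to(batch: Dict[str, Any]) -> List[str]:
    """Return the keys whose values have no callable `.to` (not tensor-like)."""
    return [
        key for key, value in batch.items()
        if not callable(getattr(value, 'to', None))
    ]


class FakeTensor:
    """Stand-in for torch.Tensor, used only to keep this sketch dependency-free."""

    def to(self, device: str) -> "FakeTensor":
        return self


# 'title' is still a plain list here, as in the traceback above.
batch = {'title': [[1, 2, 3]], 'category': FakeTensor()}
print(fields_without_to(batch))  # ['title']
```

Any field this reports was never converted to a tensor, which would point back to the dataset or collate step rather than to the news encoder.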

Izorar commented 2 years ago

Thanks @yusanshi. I have done that already. I am working on writing the results to a file, as you suggested and in the format specified by the organizers, but I seem to be lost. The problem is still the labels, specifically this code:

            y_list = [
                int(news[0].split('-')[1]) for news in minibatch['impressions']
            ]

in the evaluate file

yusanshi commented 2 years ago

Since we have no labels, the following code makes no sense: https://github.com/yusanshi/news-recommendation/blob/fe11cef92682c1030eed35489e25cce10cffa5f3/src/evaluate.py#L244-L257

Simply removing it should be OK.