Closed: Izorar closed this issue 2 years ago
That's because the test set of the MIND large dataset has no labels, so the format of the test file is not correct. See https://github.com/yusanshi/news-recommendation/issues/11 and https://github.com/msnews/MIND/issues/8.
... and it seems that there is nothing we can do :cry:
@yusanshi, does this mean the labels were not released by the authors of the dataset/task?
If they were not released, how do we evaluate? Thanks.
> @yusanshi, does this mean the labels were not released by the authors of the dataset/task?
Exactly.
> If they were not released, how do we evaluate? Thanks.
Please see https://msnews.github.io/ and https://competitions.codalab.org/competitions/24122#participate. Basically, you need to upload the inference results to the online evaluation platform. To generate the results in the required file format, you will need to make some changes to the code. An earlier version of the repo has some code that may be helpful: https://github.com/yusanshi/news-recommendation/blob/fe11cef92682c1030eed35489e25cce10cffa5f3/src/evaluate.py#L259-L262
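For reference, here is a minimal sketch of producing one line of the prediction file. It assumes that each line is the impression ID followed by a bracketed, comma-separated list giving the rank of every candidate in its original order (rank 1 for the highest score). That is my reading of the competition instructions, so please double-check it against the CodaLab page. The function names `scores_to_ranks` and `format_prediction_line` are hypothetical and are not taken from the repo.

```python
def scores_to_ranks(scores):
    """Return the 1-based rank of each score (rank 1 = highest score).

    Ties keep their original order because Python's sort is stable.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranks = [0] * len(scores)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks


def format_prediction_line(impression_id, scores):
    """Format one impression as 'ID [r1,r2,...]' (assumed submission format)."""
    ranks = scores_to_ranks(scores)
    return f"{impression_id} [{','.join(str(r) for r in ranks)}]"


# Hypothetical click scores for the three candidates of impression 7.
print(format_prediction_line(7, [0.1, 0.9, 0.5]))
```

In the evaluation loop you would write one such line per impression to `prediction.txt` instead of computing metrics.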
Thanks. I am going to try that out
There is still an issue with the model
Traceback (most recent call last):
  File "src/evaluate.py", line 289, in <module>
    './data/test/prediction.txt')
  File "/home/izorar/.local/lib/python3.7/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "src/evaluate.py", line 185, in evaluate
    news_vector = model.get_news_vector(minibatch)
  File "/home/izorar/news-recommendation/src/model/TANR/__init__.py", line 82, in get_news_vector
    return self.news_encoder(news)
  File "/home/izorar/.local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/izorar/news-recommendation/src/model/TANR/news_encoder.py", line 40, in forward
    title_vector = F.dropout(self.word_embedding(news['title'].to(device)),
AttributeError: 'list' object has no attribute 'to'
It looks like you're using an old version of the code. Please use the latest code, i.e., the newest commit on the master branch. Otherwise I can't tell which code you're using, so I can't give any suggestions.
`AttributeError: 'list' object has no attribute 'to'`, so `news_encoder.py` is assuming that `news['title']` is a torch tensor instead of a Python list. The conversion from list to tensor should have been done in another file (`dataset.py`). So one possible reason is a bug in the old version of `dataset.py`. Please `git pull` to apply all the changes. If this happens again, we can investigate further.
Thanks @yusanshi. I have done that already. I am working on writing the results to a file as you suggested, following the format specified by the organizers, but I seem to be lost. The problem is still the missing labels, specifically these lines:
y_list = [
    int(news[0].split('-')[1]) for news in minibatch['impressions']
]
in the evaluate file
Since we have no labels, the following code makes no sense: https://github.com/yusanshi/news-recommendation/blob/fe11cef92682c1030eed35489e25cce10cffa5f3/src/evaluate.py#L244-L257
Simply removing them should be OK.
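If the same script should still run on the labeled dev set, another option is to parse the label only when it is present. A rough sketch, assuming labeled candidates look like `N55689-1` and unlabeled test candidates look like `N55689` (the helper name `split_candidate` is hypothetical, not from the repo):

```python
def split_candidate(candidate):
    """Split 'N55689-1' into ('N55689', 1); return label None if absent."""
    news_id, sep, label = candidate.partition('-')
    return news_id, (int(label) if sep else None)


# Labeled (train/dev style) and unlabeled (test style) candidates.
print(split_candidate('N55689-1'))
print(split_candidate('N55689'))
```

The metric code can then be skipped whenever any label is `None`, so indexing `split('-')[1]` never runs on a candidate without a label.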
Hello @yusanshi, there is an indexing issue in the evaluate file, precisely at line 262:
This issue happens on the MIND large dataset. Thanks.