yusanshi / news-recommendation

Implementations of some methods in news recommendation.
MIT License

Runtime error #42

Closed shrgupta1 closed 2 years ago

shrgupta1 commented 2 years ago

Hi,

I'm using NAML with my own data instead of MIND dataset. I'm getting the following error:

candidate_news_vector = torch.stack(
RuntimeError: stack expects a non-empty TensorList

yusanshi commented 2 years ago

Does the code run well on the MIND dataset? Please check your data format and make sure it is the same as MIND's.

shrgupta1 commented 2 years ago

I didn't try the MIND dataset. I tried to keep the data format the same as MIND, but had to remove the subcategories, title entities, and abstract entities because I didn't have that data.

shrgupta1 commented 2 years ago

Could the error be because I don't have GPU set up on my system? Upon googling the issue, I found the following solution: https://github.com/janvainer/speedyspeech/issues/18

but it wasn't very helpful since the use case is different.

Thanks for helping! :)

yusanshi commented 2 years ago

The message "stack expects a non-empty TensorList" means the input to torch.stack is empty, so you should check the data pipeline.

Have you modified this? https://github.com/yusanshi/news-recommendation/blob/master/src/config.py#L50
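One way to check the pipeline is to scan the parsed samples for empty lists before they ever reach torch.stack. The following is only a minimal sketch: the sample layout and the field names `candidate_news` and `clicked_news` are assumptions for illustration, and `find_empty_fields` is a hypothetical helper, not part of the repository.

```python
def find_empty_fields(samples, fields):
    """Return (sample_index, field) pairs whose value is missing or empty."""
    problems = []
    for i, sample in enumerate(samples):
        for field in fields:
            value = sample.get(field)
            if value is None or len(value) == 0:
                problems.append((i, field))
    return problems


# Hypothetical samples: the second one has no candidate news, which would
# leave torch.stack with nothing to stack further down the pipeline.
samples = [
    {"candidate_news": [{"title": [1, 2]}], "clicked_news": [{"title": [3, 4]}]},
    {"candidate_news": [], "clicked_news": [{"title": [5, 6]}]},
]
empty = find_empty_fields(samples, ["candidate_news", "clicked_news"])
print(empty)
```

If the returned list is not empty, the reported sample indices are a good place to start comparing your data against the MIND format.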

shrgupta1 commented 2 years ago

Yes, I did. Here's the edit:

```python
class NAMLConfig(BaseConfig):
    dataset_attributes = {
        "news": ['category', 'title', 'abstract'],
        "record": []
    }
    # For CNN
    num_filters = 300
    window_size = 3
```

yusanshi commented 2 years ago

I think:

shrgupta1 commented 2 years ago

I checked, and the error was coming from the 'id' column: it didn't have an int datatype. I removed the 'id' column and that error is gone.
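For anyone hitting the same thing, a quick check like the one below can show which rows have an 'id' that does not parse as an integer. It is only a sketch: the tab-separated layout and the sample rows are made up for illustration, and `non_integer_ids` is a hypothetical helper. Only the column name 'id' comes from my data.

```python
import csv
import io


def non_integer_ids(tsv_text, column="id"):
    """Return (row_number, value) pairs where `column` cannot be parsed as an int."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    bad = []
    for row_number, row in enumerate(reader, start=1):
        try:
            int(row[column])
        except (TypeError, ValueError):
            bad.append((row_number, row[column]))
    return bad


# Hypothetical news table with a header row; the second data row has a non-integer id.
sample = (
    "id\tcategory\ttitle\n"
    "1\tsports\tMatch report\n"
    "N2\tnews\tElection update\n"
    "3\tfinance\tMarkets rally\n"
)
bad_rows = non_integer_ids(sample)
print(bad_rows)
```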

shrgupta1 commented 2 years ago

After removing the 'id' column, when I run train.py with Python 3.9, I get the following error:

RuntimeError: Caught RuntimeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/data/python3.9/site-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
    data = fetcher.fetch(index)
  File "/data/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 52, in fetch
    return self.collate_fn(data)
  File "/data/python3.9/site-packages/torch/utils/data/_utils/collate.py", line 74, in default_collate
    return {key: default_collate([d[key] for d in batch]) for key in elem}
  File "/data/python3.9/site-packages/torch/utils/data/_utils/collate.py", line 74, in <dictcomp>
    return {key: default_collate([d[key] for d in batch]) for key in elem}
  File "/data/python3.9/site-packages/torch/utils/data/_utils/collate.py", line 82, in default_collate
    raise RuntimeError('each element in list of batch should be of equal size')
RuntimeError: each element in list of batch should be of equal size
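This message suggests that some field holds lists of different lengths across the samples in a batch, so the default collate function cannot combine them. A rough way to find the offending field, and one possible fix by padding or truncating to a fixed length, is sketched below. The field names, the target length of 3, and both helper functions are assumptions for illustration, not code from the repository.

```python
def find_ragged_keys(batch):
    """Return {key: sorted lengths} for keys whose values differ in length across the batch."""
    ragged = {}
    for key in batch[0]:
        lengths = {len(sample[key]) for sample in batch}
        if len(lengths) > 1:
            ragged[key] = sorted(lengths)
    return ragged


def pad_or_truncate(tokens, length, pad_value=0):
    """Force a token list to exactly `length` items."""
    return (tokens + [pad_value] * length)[:length]


# Hypothetical batch: 'title' is consistent, 'abstract' is not.
batch = [
    {"title": [4, 8, 15], "abstract": [16, 23]},
    {"title": [42, 7, 9], "abstract": [1, 2, 3, 4]},
]
ragged = find_ragged_keys(batch)
print(ragged)

# Padding or truncating every field to the same length removes the mismatch.
fixed = [{k: pad_or_truncate(v, 3) for k, v in s.items()} for s in batch]
print(find_ragged_keys(fixed))
```

Running the check on a handful of raw samples, before they are wrapped in a DataLoader, usually makes it clear which column was not padded to a fixed length during preprocessing.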

yusanshi commented 2 years ago

Please first try the MIND dataset

yusanshi commented 2 years ago

Closing this inactive issue. Reopen it if needed.