Besides, I also found some bugs in loading saved dataloaders when setting save_dataloaders=True in the config file. In the following code, eval_neg_sample_args does not exist in the config, while valid_neg_sample_args and test_neg_sample_args do:
https://github.com/RUCAIBox/RecBole/blob/edf2e6cb8190241b6bb16d72a2ad8f2ff004845f/recbole/data/dataloader/general_dataloader.py#L140-L144
This results in a KeyError at line 139 in the following code:
https://github.com/RUCAIBox/RecBole/blob/edf2e6cb8190241b6bb16d72a2ad8f2ff004845f/recbole/data/dataloader/abstract_dataloader.py#L132-L142
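For illustration, the fix amounts to looking up the phase-specific key instead of the non-existent one. A minimal sketch, where the function name and phase handling are illustrative rather than the actual RecBole source:

```python
# Illustrative sketch (not the actual RecBole source): select the
# phase-specific key instead of the non-existent "eval_neg_sample_args",
# which is what raised the KeyError.
def get_neg_sample_args(config, phase):
    # phase is "valid" or "test"; only the phase-specific keys exist
    # in the config.
    return config[f"{phase}_neg_sample_args"]
```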
I opened a pull request #1698 to fix the two problems mentioned above.
Hello @ShadowTinker,
Thank you for your attention and contributions to RecBole! You are correct that there are some bugs. To keep the code structure clean and make the changes easy to apply, I have made some minor modifications. Once you have reviewed the modifications in https://github.com/RUCAIBox/RecBole/pull/1698, the updates can be merged into the main code.
Thank you again for your support of our team!
Hello @Paitesanshi,
Thank you for your reply! However, I just found another problem: the saved dataloaders are still loaded after modifying train_batch_size or eval_batch_size, so the batch size is not updated accordingly. I found it is caused by the following lines:
https://github.com/RUCAIBox/RecBole/blob/edf2e6cb8190241b6bb16d72a2ad8f2ff004845f/recbole/data/utils.py#L130-L132
where train_batch_size and eval_batch_size are not checked.
But I'm not sure whether this is by design, since these arguments are usually fixed. So I wonder whether the two args should be added to the lines above.
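For reference, this is a minimal sketch of what such a check could look like, assuming the loader compares a fixed list of config keys before reusing a saved dataloader; the key names below are illustrative, not the actual contents of recbole/data/utils.py:

```python
# Illustrative sketch (key names are hypothetical): extend the list of
# config keys that must match before saved dataloaders are reused.
CHECKED_ARGS = ["seed", "repeatable", "eval_args"]       # existing checks (illustrative)
CHECKED_ARGS += ["train_batch_size", "eval_batch_size"]  # proposed additions

def saved_dataloaders_usable(saved_config, current_config):
    # Reuse the saved dataloaders only if every checked key is unchanged;
    # otherwise rebuild them so the new batch sizes take effect.
    return all(saved_config[k] == current_config[k] for k in CHECKED_ARGS)
```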
@ShadowTinker We think that a user who loads an existing data_loader intends to use a fixed batch_size. We will consider whether to modify this setting after collecting more user feedback. Thanks again for your suggestion!
@Paitesanshi Thank you for the explanation. I've reviewed the modifications in #1698, and it will be my honor to contribute to RecBole. Thank you again for the great repo and your contribution to the community.
This fails again when run with multiprocessing on 4 GPUs on 1 node. One process writes to disk while another process sees that the file exists and tries to read it. Even if that race doesn't happen, since 4 processes write to the same pickle file in "wb" mode, the file will be corrupt when it is loaded later for inference.
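One common way around this is to let a single rank serialize and make the other ranks wait at a barrier before reading. A minimal sketch, assuming torch.distributed is already initialized; the function names and file path are illustrative:

```python
# Minimal sketch: avoid the write/read race by letting only rank 0 save,
# with a barrier so other ranks only read a complete file. Assumes
# torch.distributed is initialized; the path is illustrative.
import os
import pickle

import torch.distributed as dist

def save_dataloaders_rank0(dataloaders, path="saved/dataloaders.pth"):
    if dist.get_rank() == 0:
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "wb") as f:
            pickle.dump(dataloaders, f)
    dist.barrier()  # all other ranks block until rank 0 has finished writing

def load_dataloaders(path="saved/dataloaders.pth"):
    with open(path, "rb") as f:
        return pickle.load(f)
```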
Describe the bug
I set save_dataset=True in recbole/properties/overall.yaml, but the model re-processes the dataset every time.

To Reproduce
Steps to reproduce the behavior:
1. Set save_dataset=True in recbole/properties/overall.yaml.
2. Run python run_recbole.py --model=DIN --dataset=ml-100k.

The same happens with DIN, AFM, BPR, and SASRec.
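For reference, the same run can be reproduced through RecBole's Python API instead of editing overall.yaml; this is a minimal sketch, assuming the installed version exposes run_recbole with a config_dict parameter:

```python
# Minimal sketch: reproduce the run via the Python API with the config
# override passed as a dict instead of editing overall.yaml.
from recbole.quick_start import run_recbole

run_recbole(
    model="DIN",
    dataset="ml-100k",
    config_dict={"save_dataset": True},  # same effect as save_dataset=True in overall.yaml
)
```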