LaVieEnRose365 / ReLLa

Code for the paper "ReLLa: Retrieval-enhanced Large Language Models for Mitigating Long Context Problems in Recommendation".

test data in inference.py and finetune.py #7

Closed: PipiZong closed this issue 5 months ago

PipiZong commented 5 months ago

Hi, thanks for sharing your work. I have two questions about your code:

  1. Is the test data in inference.py the same as in finetune.py? I would expect the test data in finetune.py to act as a validation set for selecting the best model, so the two should not be the same, but in the code both seem to come from the same test file.
  2. Could you confirm whether I am running the code correctly? I first ran finetune.py with the model path set to the original vicuna model to obtain the finetuned model. I then ran inference.py with the model path set to that finetuned model (pytorch_model.bin or adapter_model.bin?) to get the test results.

If you could provide the finetuned model in the drive folder, along with the complete command line using the provided model name, that would make reproduction easier. Thanks!

LaVieEnRose365 commented 5 months ago

Hi there. Thanks for your support of our work. I will try my best to answer your questions:

PipiZong commented 5 months ago

Thanks for your reply. Sorry, I am still confused about the second question: should we provide adapter_model.bin or pytorch_model.bin as the model path in inference.py?

And I have one more question. I downloaded your processed data and found that the training data size is around 70k for ml-1m and 15k for bookcrossing. However, Table 2 of your paper lists training sizes of 256/1024 for bookcrossing and 8192/65536 for ml-1m. Does this mean you sampled the 256/1024 examples from the 15k?

Thanks!

LaVieEnRose365 commented 5 months ago

The `model_path` argument is the path to the LLM weights, while `resume_from_checkpoint` is the path to the LoRA adapter weights. In finetune.py, the saved adapter_model.bin is renamed to pytorch_model.bin for convenience, but they are the same file, so you can use either as long as you are sure it is the LoRA adapter weight. By the way, if you only want to test the zero-shot capability, the LoRA weights are not needed.
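For anyone reading later, here is a minimal sketch of how the two paths fit together at inference time, assuming a PEFT-style LoRA setup; the paths, LoRA hyperparameters, and target modules below are placeholders, not the repo's actual values:

```python
import os
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model, set_peft_model_state_dict

base_path = "/path/to/vicuna-7b"    # model_path: the base LLM weights
lora_path = "/path/to/checkpoint"   # resume_from_checkpoint: the LoRA adapter dir

tokenizer = AutoTokenizer.from_pretrained(base_path)
model = AutoModelForCausalLM.from_pretrained(base_path, torch_dtype=torch.float16)

# Recreate a LoRA wrapper so the adapter weights have somewhere to load into
# (r/alpha/target_modules here are placeholders, not the repo's settings).
lora_config = LoraConfig(r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"],
                         lora_dropout=0.05, bias="none", task_type="CAUSAL_LM")
model = get_peft_model(model, lora_config)

# finetune.py renames adapter_model.bin to pytorch_model.bin, but either file
# holds the same LoRA tensors, so check for both names.
for name in ("pytorch_model.bin", "adapter_model.bin"):
    ckpt = os.path.join(lora_path, name)
    if os.path.exists(ckpt):
        set_peft_model_state_dict(model, torch.load(ckpt, map_location="cpu"))
        break
model.eval()
```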

PipiZong commented 5 months ago

Thank you for your reply. So do we need to set use_lora to 1 at inference time to reproduce your results? In the readme, use_lora seems to be 0 in your command line for inference.
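To make sure I understand the flag, here is how I currently read it (a sketch of my assumption, not the actual inference.py logic):

```python
import argparse

# Assumed reading of the flag: use_lora gates whether the adapter is loaded.
parser = argparse.ArgumentParser()
parser.add_argument("--use_lora", type=int, default=0)
parser.add_argument("--resume_from_checkpoint", type=str, default=None)
args = parser.parse_args()

if args.use_lora:
    # finetuned evaluation: load the LoRA adapter from resume_from_checkpoint
    pass
else:
    # zero-shot evaluation: run the base vicuna weights without any adapter
    pass
```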