LaVieEnRose365 / ReLLa

Code of Paper "ReLLa: Retrieval-enhanced Large Language Models for Mitigating Long Context Problems in Recommendation".

More questions about the implementation #9

Closed. PipiZong closed this issue 5 months ago.

PipiZong commented 5 months ago
Thanks for your reply. Sorry, I am still confused about the second question: should we provide adapter_model.bin or pytorch_model.bin as the model path in inference.py?
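
For reference, this is roughly how I am loading the checkpoint in inference.py right now (the paths and base model name are placeholders, and this may differ from your actual script):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE_MODEL = "path/to/base_model"         # base weights, i.e. pytorch_model.bin
ADAPTER_DIR = "path/to/lora_checkpoint"   # LoRA output, i.e. adapter_model.bin + adapter_config.json

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
base = AutoModelForCausalLM.from_pretrained(BASE_MODEL, torch_dtype=torch.float16)

# Attach the fine-tuned LoRA weights (adapter_model.bin) on top of the base model.
model = PeftModel.from_pretrained(base, ADAPTER_DIR)
model.eval()
```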

And I have one more question. I downloaded your processed data and found that the training sets for ml-1m and BookCrossing contain roughly 70,000 and 15,000 samples, respectively. However, Table 2 in your paper reports training sizes of 256/1024 on BookCrossing and 8192/65536 on ml-1m. Does this mean you sampled the 256/1024 samples from the ~15,000?

Thanks!

Originally posted by @PipiZong in https://github.com/LaVieEnRose365/ReLLa/issues/7#issuecomment-2167179716

LaVieEnRose365 commented 5 months ago

Yes. As mentioned in detail in our paper, the whole training set for ml-1m contains about 900,000 samples, which makes fine-tuning LLMs on a single V100 card very time-consuming, so we randomly downsample the training set for LLMs to about 70,000 samples. The K-shot samples are simply the first K samples of this downsampled training set. In this way, the 256-shot samples are included in the 1024-shot samples, which keeps the evaluation consistent.
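
A minimal sketch of that sampling scheme (the seed and exact sizes here are illustrative, not necessarily what the released preprocessing scripts use):

```python
import random

# Toy stand-in for the full ml-1m training set (~900k samples in reality).
full_train_set = list(range(900_000))

# Randomly downsample to ~70k samples for LLM fine-tuning (seed is a placeholder).
rng = random.Random(42)
downsampled = rng.sample(full_train_set, 70_000)

# The K-shot training data is just the first K samples of the downsampled set,
# so the 256-shot samples are automatically contained in the 1024-shot samples.
train_256 = downsampled[:256]
train_1024 = downsampled[:1024]
assert train_256 == train_1024[:256]
```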