Hello,
This is a promising study, and thanks for being transparent about the code and weights. One thing I find unclear, though, and I would really appreciate an answer:
According to the article, the LLM was trained in two tuning stages: instruct-tuning and rec-tuning. I am wondering about the transition between these two stages, specifically how you continued after instruct-tuning. Did you perform rec-tuning on top of the instruct-tuned model (the LoRA-trained Alpaca model), i.e., fine-tune on the recommendation task starting from an already fine-tuned Alpaca model?
If so, did you use the alpaca-lora framework again (as in the instruct-tuning stage) for this second training?
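To make the question concrete, this is roughly the workflow I am imagining, expressed with the transformers/peft APIs that alpaca-lora builds on. It is only a sketch of my understanding, not your actual setup; the paths and the choice of base checkpoint are placeholders I made up.

```python
# Minimal sketch of "rec-tuning on top of the instruct-tuned LoRA weights".
# Paths and the base checkpoint name are placeholders, not taken from the paper.
import torch
from transformers import LlamaForCausalLM
from peft import PeftModel

base = LlamaForCausalLM.from_pretrained(
    "path/to/llama-7b-hf",  # placeholder: whichever base LLaMA checkpoint you used
    torch_dtype=torch.float16,
)

# Load the LoRA adapter produced by the instruct-tuning stage and keep it trainable,
# so the rec-tuning stage continues from those adapter weights rather than
# initializing a fresh adapter on the base model.
model = PeftModel.from_pretrained(
    base,
    "path/to/instruct-tuned-lora",  # placeholder path to the stage-1 adapter
    is_trainable=True,
)

# ...then run the same alpaca-lora style training loop, but on the
# recommendation-task (rec-tuning) data instead of the Alpaca instruction data.
```

Is this roughly what you did, or did you instead merge the stage-1 LoRA weights into the base model (or start a new adapter) before rec-tuning?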
Thanks in advance for your time and effort