Open shengyangs opened 7 months ago
@gshennvm @odelalleau What do you think?
I agree we should have one.
There is an SFT section in the RLHF tutorial that we can pull out. Will that do?
Also, @shengyangs, do you happen to have a better dataset/script for the SFT tutorial? I wonder if we should update to something other than Dolly.
@gshennvm You are right. The SFT section is already there; I missed it on my first read. I think we probably want to pull it out into a separate section. I was talking with some people, and they are interested in trying out SFT because of its simplicity.
The current Dolly dataset is fine with me as a prompt-response example. Maybe we should add another example with a chat dataset; for that, I have been playing with UltraChat. I am not sure whether there are simpler toy datasets.
Is your feature request related to a problem? Please describe.
We should include a tutorial for SFT. Although we have SteerLM, including an SFT tutorial is important because it is the simplest technique for a user to get started with. It is also a prerequisite for RLHF and DPO.