Open Rhriti opened 3 months ago
Hi, I've asked on the discord too, but where exactly did you saw the second format? It should be more like the first one.
this is after reformatting the jsonl file! can someone explain what is 'prompt' and 'prompt_id' doing in there
@Rhriti do you still have this issue? The script for reformatting should work, is it possible to have a copy of your notebook?
@pandora-s-git notebook is the same as attached in the guide. Just run it yourself
Ive taken a look, and its an artifact from the original dataset, the 'messages' is the important part, so the required is ur first format. We may fix that later to make it clear, thanks for the report !
I've been diving into fine-tuning for this virtual hackathon but I am confused about the format used to fine-tune the model using its apis.Below is the format used in the 'capabilities' section under 'fine-tuning' in the docs. { "messages": [ { "role": "user", "content": "User interaction n°1 contained in document n°2" }, { "role": "assistant", "content": "Bot interaction n°1 contained in document n°2" }, { "role": "user", "content": "User interaction n°2 contained in document n°1" }, { "role": "assistant", "content": "Bot interaction n°2 contained in document n°1" } ] }
But under the 'Guide' section 'fine-tuning' a different format is used and later converted into binary and fed to the api.here is the format: { "prompt":"Are there any specific neighborhoods in the city that are particularly well-served by public transportation?", "prompt_id":"ea9a0c38af7a8f9602c40a2e666283e8bb5101adec34421ec4a7df90bb081666", "messages":[ { "content":"Are there any specific neighborhoods in the city that are particularly well-served by public transportation?", "role":"user" }] , enlighten me!!