I am planning to run SFT on real chatlogs so naturally I don't have the prompt field like in the Ultrachat dataset. AFAICT, this field is not used to perform SFT so I think I can keep it as an empty string. The code that converts the datapoint into a single string only uses the messages field:
I am planning to run SFT on real chatlogs so naturally I don't have the
prompt
field like in the Ultrachat dataset. AFAICT, this field is not used to perform SFT so I think I can keep it as an empty string. The code that converts the datapoint into a single string only uses the messages field:https://github.com/huggingface/alignment-handbook/blob/61a11a5c7d66179ed0a930b0dd12e532fce701dd/src/alignment/data.py#L36C1-L42
Am I missing something here?