Thanks for providing the script for reading the data into a dictionary. Could you please provide an extension showing how this dictionary is actually loaded as a torch dataset that can be used for fine-tuning the base LLM? Have you defined a custom dataset class for this? Do you define a single dataloader iterating over the full concatenation of datasets, or separate dataloaders per dataset? And how is data serialisation implemented?
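For concreteness, here is the kind of setup I have in mind (a minimal sketch; the `DictDataset` name, the field names `input_ids`/`labels`, and the `ConcatDataset` choice are my own assumptions, not something taken from your script):

```python
import torch
from torch.utils.data import Dataset, DataLoader, ConcatDataset


class DictDataset(Dataset):
    """Wraps a dict of parallel lists, e.g. {"input_ids": [...], "labels": [...]}."""

    def __init__(self, data: dict):
        self.data = data
        # All values are assumed to be lists of equal length.
        self.length = len(next(iter(data.values())))

    def __len__(self):
        return self.length

    def __getitem__(self, idx):
        # Return one example as a dict of tensors; the default collate_fn
        # stacks these into batched tensors.
        return {k: torch.tensor(v[idx]) for k, v in self.data.items()}


# Example: two small per-source datasets concatenated into a single loader.
d1 = DictDataset({"input_ids": [[1, 2], [3, 4]], "labels": [[1, 2], [3, 4]]})
d2 = DictDataset({"input_ids": [[5, 6]], "labels": [[5, 6]]})
loader = DataLoader(ConcatDataset([d1, d2]), batch_size=2, shuffle=False)
batch = next(iter(loader))
```

Is this roughly what you do, or do you keep one `DataLoader` per source dataset instead of concatenating?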