Closed luzhaoyan closed 3 months ago
To train the model with the UltraEdit dataset, set dataset_name=BleachNick/UltraEdit
and pass it as an argument to the Python code. There's no need to set train_data_jsonl as an argument.
To train the model with your own dataset, provide the path to the JSONL file in the train_data_jsonl
argument. Each item in the JSONL should have the following keys:
{
"source_image": path to the source image,
"edited_image": path to the edited ground truth image,
"edit_prompt": the edit instruction,
"mask_image": path to the mask image used in stage 2 training. For free-form image editing, set "mask_image" to "NONE" and a blank mask will be generated by default.
}
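As a rough illustration of the format described above, here is a minimal sketch of how such a JSONL file could be written and read back using only the Python standard library. It assumes the usual JSONL convention of one JSON object per line. The helper names (make_item, write_jsonl, read_jsonl) and the image paths are hypothetical and are not taken from the UltraEdit repository.

```python
import json
import os
import tempfile

# Keys listed in the README excerpt above.
REQUIRED_KEYS = ("source_image", "edited_image", "edit_prompt", "mask_image")


def make_item(source_image, edited_image, edit_prompt, mask_image="NONE"):
    """Build one record; mask_image defaults to "NONE" for free-form editing."""
    return {
        "source_image": source_image,
        "edited_image": edited_image,
        "edit_prompt": edit_prompt,
        "mask_image": mask_image,
    }


def write_jsonl(items, path):
    """Write one JSON object per line, rejecting records with missing keys."""
    with open(path, "w", encoding="utf-8") as f:
        for i, item in enumerate(items):
            missing = [k for k in REQUIRED_KEYS if k not in item]
            if missing:
                raise ValueError(f"item {i} is missing keys: {missing}")
            f.write(json.dumps(item, ensure_ascii=False) + "\n")


def read_jsonl(path):
    """Read the file back as a list of dicts, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    items = [
        make_item("images/0001_src.png", "images/0001_edit.png",
                  "make the sky orange"),
        make_item("images/0002_src.png", "images/0002_edit.png",
                  "replace the dog with a cat", "masks/0002_mask.png"),
    ]
    out_path = os.path.join(tempfile.mkdtemp(), "train_data.jsonl")
    write_jsonl(items, out_path)
    print(f"wrote {len(read_jsonl(out_path))} records to {out_path}")
```

The resulting file path is what would be passed as the train_data_jsonl argument. Whether the paths inside the file should be absolute or relative to some root depends on how the training script resolves them, which the excerpt above does not specify.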
Before training with stable-diffusion-xl, should I change "train_data_jsonl" in the file scripts/run_sft_512_sdxl_stage1.sh? When I load the UltraEdit dataset with load_dataset from the datasets library, is a .jsonl file loaded as well? If not, how can I configure a .jsonl file for it?