Open syguan96 opened 3 months ago
Hi
Thank you for your kind words about the project!
Here's a breakdown of the datasets and differences between them:
Complete Dataset: This includes 4 million freeform image editing entries generated by our pipeline. It is part of the broader UltraEdit initiative.
UltraEdit_Region-Based_100k: This subset supports region-based image editing and includes a mask image for each editing pair. It's designed for tasks where specific regions of an image are targeted for editing.
UltraEdit_500k: This is a sampled subset of the complete dataset, containing 500k entries of freeform image editing data. We created this subset to maintain a comparable size with other similar datasets and to facilitate evaluation and ease of use.
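As a rough illustration of what "sampled subset" means here, the sketch below draws a fixed-size uniform sample of entry indices from a larger dataset. It is not the UltraEdit pipeline's actual sampling code: the function name `sample_subset`, the seed, and the uniform-without-replacement strategy are all assumptions made for the example.

```python
import random


def sample_subset(num_entries: int, subset_size: int, seed: int = 0) -> list[int]:
    """Return sorted indices of a uniform random subset, drawn without replacement.

    Hypothetical helper for illustration; not taken from the UltraEdit codebase.
    """
    if subset_size > num_entries:
        raise ValueError("subset_size cannot exceed num_entries")
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    return sorted(rng.sample(range(num_entries), subset_size))


# Small stand-in sizes; the figures discussed in this thread would be
# 4_000_000 entries for the complete dataset and 500_000 for the subset.
indices = sample_subset(num_entries=4_000, subset_size=500, seed=0)
```

The returned indices could then be used to select rows from whichever dataset container you load the complete data into.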
Regarding your questions about the specific models:
SD3-UltraEdit_freeform: This model is trained exclusively on the freeform image editing dataset, which contains 4 million entries.
SD3-UltraEdit w_mask: This model is trained on both the freeform (4M) and region-based (100k) image editing data. It supports both freeform and region-based image editing.
SD3-UltraEdit_mask: This appears to be an accidentally uploaded empty folder. We will remove it shortly.
Please feel free to reach out if you have any further questions!
Hi @HaozheZhao, this is great work. I tried to filter out some categories to train InstructPix2Pix.
I noticed that you have released "UltraEdit_500k", "UltraEdit_Region-Based_100k", and the complete dataset. Could you explain how these subsets were divided? Also, if possible, could you explain the differences between "BleachNick/SD3-UltraEdit_freeform", "BleachNick/SD3-UltraEdit w_mask", and "BleachNick/SD3-UltraEdit_mask"?
Thanks for your help!