About the different versions of the models and datasets.

Thank you for your kind words about the project!

Here's a breakdown of the datasets and differences between them:

Complete Dataset: This includes 4 million freeform image editing entries generated by our pipeline. It is part of the broader UltraEdit initiative.
UltraEdit_Region-Based_100k: This subset supports region-based image editing and includes a mask image for each editing pair. It's designed for tasks where specific regions of an image are targeted for editing.
UltraEdit_500k: This is a sampled subset of the complete dataset, containing 500k entries of freeform image editing data. We created it to keep its size comparable to that of similar datasets, which makes evaluation and everyday use easier.

Regarding your questions about the specific models:

SD3-UltraEdit_freeform: This model is trained exclusively on the freeform image editing dataset, which contains 4 million entries.
SD3-UltraEdit_w_mask: This model is trained on both the freeform (4M) and region-based (100K) image editing data. It supports both freeform and region-based image editing.
SD3-UltraEdit_mask: This appears to be an accidentally uploaded empty folder. We will remove it shortly.
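
For quick reference, the split between the two usable checkpoints can be sketched as a small lookup. This is only an illustrative summary of the descriptions above: the `MODELS` table and the `models_for` helper are hypothetical names, and the keys are the folder names as written in this reply (with an underscore in `w_mask`), not verified hub identifiers.

```python
# Hypothetical lookup table summarizing which editing modes each checkpoint
# was trained for, based only on the descriptions in this reply.
MODELS = {
    "SD3-UltraEdit_freeform": {"freeform"},                # 4M freeform entries
    "SD3-UltraEdit_w_mask": {"freeform", "region-based"},  # 4M freeform + 100K region-based
}


def models_for(mode: str) -> list[str]:
    """Return the checkpoints whose training data covers the given editing mode."""
    return sorted(name for name, modes in MODELS.items() if mode in modes)


if __name__ == "__main__":
    print(models_for("freeform"))
    print(models_for("region-based"))
```

In short, if you plan to pass a mask image, the w_mask checkpoint is the one trained for that; for plain instruction-based edits, either checkpoint applies.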

Please feel free to reach out if you have any further questions!

HaozheZhao / UltraEdit