Dchenlittle opened 7 months ago
Hey! The masks are dynamically generated. The only inputs are the images/captions.
Thank you very much for your answer. Can this training script be applied to other datasets with the masks also generated dynamically? If I apply it to another dataset, do I need to provide labelled masks, or do I need to modify the code so it generates masks dynamically for that dataset?
Hey! The masks are dynamically generated. The only inputs are the images/captions.
You can specify the mask using `--mask_mode` (chosen from the modes defined in `MASK_MODES`), which defaults to the masks I typically use, `512train-large`. If you want to further customize the masks you'll need to modify `generate_mask(x, mask_mode)`.
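For reference, a dynamic mask generator in the spirit of `generate_mask(x, mask_mode)` could look like the sketch below. This is not the repo's implementation: the rectangle logic and the `"random-rect"` mode name are assumptions for illustration; the real function supports the modes listed in `MASK_MODES` (e.g. `512train-large`).

```python
import numpy as np

def generate_mask_sketch(x, mask_mode="random-rect", rng=None):
    """Return a binary (H, W) mask for an image array x of shape (H, W, C).

    Illustrative only -- the real generate_mask(x, mask_mode) in
    terrain-diffusion implements its own mask modes, which this
    sketch does not reproduce.
    """
    rng = rng or np.random.default_rng()
    h, w = x.shape[:2]
    mask = np.zeros((h, w), dtype=np.float32)
    if mask_mode == "random-rect":
        # Mask a random rectangle spanning 25-75% of each dimension.
        rh = rng.integers(h // 4, 3 * h // 4)
        rw = rng.integers(w // 4, 3 * w // 4)
        top = rng.integers(0, h - rh + 1)
        left = rng.integers(0, w - rw + 1)
        mask[top:top + rh, left:left + rw] = 1.0
    else:
        raise ValueError(f"unknown mask_mode: {mask_mode}")
    return mask
```

Because the mask is a pure function of the image shape and a random seed, no labelled masks need to ship with the dataset.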
Now suppose I want to train from scratch using your script and your dataset. I need to first download the dataset you linked, then run build_text2rgb_captions.py to generate the caption data, next run build_text2rgb_dataset.py to produce the metadata.jsonl file, and finally run the train_text..._inpaint.py file. Is this process correct? Is there anything I'm misunderstanding?
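As a sanity check, the order described above could be scripted roughly as follows. The `scripts/` paths and the absence of flags are assumptions; consult each script's "Example Usage" docstring for the real arguments before running.

```shell
#!/bin/sh
set -e  # stop at the first failing step

# 0. Download the sentinel-2-rgb-captioned dataset first (see the repo's links).

# 1. Generate captions for the images.
python scripts/build_text2rgb_captions.py

# 2. Build the dataset index (produces metadata.jsonl).
python scripts/build_text2rgb_dataset.py

# 3. Train LoRA inpainting on top of stable-diffusion-2-inpainting.
python scripts/train_text_to_image_lora_sd2_inpaint.py
```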
So far, I've downloaded the sentinel-2-rgb-captioned dataset, the stable-diffusion-2-inpainting weights, and this training code. Could you tell me more about the input parameters that need to be filled in for the train....inpaint.py file?
In the file https://github.com/sshh12/terrain-diffusion/blob/main/scripts/train_text_to_image_lora_sd2_inpaint.py you can see "Example Usage"
Thank you for your help. I have successfully run the training code. I have a question: does the training process need to use the official v1-inpainting-inference.yaml file?
No YAMLs needed. This all goes through the diffusers library, which does not use those configs.
Can you tell me whether the training input for this script consists of the original image, the mask image, and the masked image? If so, is the mask image provided by dataset labels, or is it generated automatically by the training code? I'd appreciate your help with this question.
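For context, inpainting models like stable-diffusion-2-inpainting are typically conditioned on exactly those three inputs, where the masked image is derived from the other two; and as noted earlier in this thread, the mask here is generated dynamically rather than read from dataset labels. A minimal sketch of how the three tensors relate (names are illustrative, not the repo's):

```python
import numpy as np

def make_inpaint_inputs(image, mask):
    """Derive the masked image from an image (H, W, C) and a mask (H, W).

    mask == 1 marks the pixels to inpaint; they are zeroed out in the
    masked image, following the usual SD-inpainting convention.
    """
    masked_image = image * (1.0 - mask)[..., None]
    return image, mask, masked_image

# Illustrative 4x4 "image" with a 2x2 region to inpaint.
image = np.ones((4, 4, 3), dtype=np.float32)
mask = np.zeros((4, 4), dtype=np.float32)
mask[1:3, 1:3] = 1.0
_, _, masked = make_inpaint_inputs(image, mask)
```

So only the image (and its caption) has to come from the dataset; the mask and the masked image are computed on the fly for each training sample.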