Can it be used to train ControlNet?
(Closed by universewill 11 months ago)
Based on my understanding, the training process for ControlNet involves generating conditioning images from ground-truth images through methods such as Canny edge detection to obtain image outlines, paired with a text prompt. These conditioning images and prompts are then used as inputs to the diffusion model to generate images aligned with the ground-truth images. The entire process likely does not require human feedback or involvement, which might make our approach unsuitable.
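For concreteness, here is a minimal sketch of that preprocessing step; the file paths and Canny thresholds are illustrative, not taken from this repo:

```python
import cv2
import numpy as np

# Load a ground-truth image (path is illustrative).
image = cv2.imread("ground_truth.png")

# Canny edge detection on the grayscale image; (100, 200) are
# commonly used low/high thresholds, not values mandated by ControlNet.
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray, 100, 200)

# ControlNet expects a 3-channel conditioning image, so replicate
# the single-channel edge map across RGB.
conditioning = np.stack([edges] * 3, axis=-1)
cv2.imwrite("conditioning.png", conditioning)
```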
Sorry, I didn't describe it clearly. What I mean is: I want to use DPO to fine-tune my trained ControlNet on human feedback data to get better results. Is that possible with D3PO?
I understand; that's possible. You can load your trained ControlNet model as a pre-trained model, modify the code to take prompts and conditioning images as input, and then run scripts/sample.py to generate the corresponding images. After collecting human feedback and saving it as a JSON file, you can fine-tune your ControlNet model by running scripts/train.py, enabling it to generate images that better align with human preferences. Good luck!
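As a rough illustration of the sampling-side change (not the repo's actual code), here is how a trained ControlNet could be loaded and used to generate candidate pairs for preference labeling with the diffusers API; the checkpoint path, prompt, and two-candidates-per-prompt setup are all assumptions:

```python
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from diffusers.utils import load_image

# Attach your fine-tuned ControlNet (path is illustrative) to a base
# Stable Diffusion pipeline.
controlnet = ControlNetModel.from_pretrained(
    "path/to/your_trained_controlnet", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

prompt = "a photo of a house"                # illustrative prompt
cond_image = load_image("conditioning.png")  # e.g. a Canny edge map

# Generate two candidates per (prompt, conditioning image) pair so a
# human can mark the preferred one; those preference pairs are what a
# DPO-style objective like D3PO trains on.
images = pipe(prompt, image=cond_image, num_images_per_prompt=2).images
images[0].save("candidate_0.png")
images[1].save("candidate_1.png")
```

The recorded preferences would then go into the JSON file that scripts/train.py reads; match whatever schema the repo's feedback format defines.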