I tried training ControlNet myself, but my results were poor on portraits and good on non-human images.
| type | from | to |
| --- | --- | --- |
| non-human | (image) | (image) |
| portraits | (image) | (image) |
The problems that often come up on portraits are:

- The resulting images have strange colors (see the examples above)
- The picture quality is low
I used 400,000 image–text pairs to generate 400,000 depth and depth_leres condition maps, then trained the model for 4 epochs with a batch size of 256 (using gradient accumulation) to obtain the results above.
I would like some empirical guidance on what may have gone wrong with my training. Do I need to add more data to improve the quality of the results? Is there any way to filter out low-quality images (such as images with JPEG artifacts) from the dataset?
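One simple heuristic I have been considering for the filtering question is a variance-of-Laplacian sharpness score: heavily compressed or blurry images tend to have a flat Laplacian response. This is just a sketch with a hand-rolled convolution; the function names and the threshold value are my own placeholders, not from any particular library, and the threshold would need tuning on a labelled sample of the dataset.

```python
import numpy as np

def laplacian_variance(gray: np.ndarray) -> float:
    """Variance of the Laplacian response of a 2-D grayscale array.

    Low values suggest blur or heavy compression smoothing;
    high values suggest sharp detail.
    """
    # Standard 3x3 Laplacian kernel
    k = np.array([[0,  1, 0],
                  [1, -4, 1],
                  [0,  1, 0]], dtype=float)
    h, w = gray.shape
    # "Valid" convolution done by shifted accumulation (no SciPy needed)
    out = np.zeros((h - 2, w - 2))
    for i in range(3):
        for j in range(3):
            out += k[i, j] * gray[i:i + h - 2, j:j + w - 2]
    return float(out.var())

def keep_image(gray: np.ndarray, threshold: float = 100.0) -> bool:
    # Threshold is dataset-dependent; tune it on a labelled sample.
    return laplacian_variance(gray) >= threshold
```

In practice one would run this over the 400k training images, plot the score distribution, and pick a cutoff that drops the obviously degraded tail before retraining.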