jolibrain / joliGEN

Generative AI Image Toolset with GANs and Diffusion for Real-World Applications
https://www.joligen.com

Advice about training results #585

Open YoannRandon opened 10 months ago

YoannRandon commented 10 months ago

Hi, I started a training run recently and got some results which are not that great on the synthetic-to-realistic image style transfer. I use the master branch of the code; here are the loss curves. [image: loss curves]

YoannRandon commented 10 months ago

And here are some examples of what is generated: [image: generated samples]

YoannRandon commented 10 months ago

It's already past 50 epochs, and the model doesn't seem to improve much (the problem surely comes from the lack of diversity of the training data: 3568 synthetic / 6519 realistic images), but is there something I can do to improve the model's performance? I use masks (not bounding boxes) for the training.
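As a quick sanity check, the diversity issue can be quantified by counting how many distinct mask classes appear per image in each domain. This is a minimal sketch, not a joliGEN feature, assuming single-channel PNG masks whose pixel values are class IDs; the paths are hypothetical placeholders:

```python
import numpy as np
from PIL import Image
from pathlib import Path

def mask_diversity(mask_dir):
    """Count distinct class IDs per mask to gauge scene diversity.

    Assumes single-channel masks where each pixel value is a class ID.
    """
    counts = []
    for p in sorted(Path(mask_dir).glob("*.png")):
        ids = np.unique(np.array(Image.open(p)))
        counts.append(len(ids))
    if not counts:
        print(f"{mask_dir}: no masks found")
        return
    counts = np.array(counts)
    print(f"{mask_dir}: {len(counts)} masks, classes/mask "
          f"mean={counts.mean():.1f} min={counts.min()} max={counts.max()}")

# hypothetical paths; adjust to the actual dataset layout
mask_diversity("trainA/mask")  # synthetic domain
mask_diversity("trainB/mask")  # real domain
```

A low or near-constant classes-per-mask count in the synthetic domain would confirm the lack of scene diversity compared to the real one.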

YoannRandon commented 10 months ago

My training command: [image: training command screenshot]

YoannRandon commented 10 months ago

My training config file: seg_sem.json

beniz commented 10 months ago

I believe we had this conversation before, right? This dataset cannot lead to good results IMO; I can find and resend my email about it if needed. Algorithms cannot compensate for poorly constructed data. There may be ways to find hyper-parameters that do better than others, but only at the margin.

beniz commented 10 months ago

If you wish to test on a dataset from the literature, GTA5 to Cityscapes is one, and there are others.

YoannRandon commented 10 months ago

Is there a specific methodology? During our last meeting I think I heard that you "restart the training". Is that done when the model performs poorly at some epoch, detected visually through the loss curves? Or is it a specific way of training, or do you train in one go and change parameters according to the results once a training run is over?
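For what it's worth, one generic way to make "restart when it goes bad" systematic is a plateau/divergence check on the smoothed loss. This is not joliGEN's built-in behavior, just a sketch of the heuristic; window sizes and thresholds below are arbitrary placeholders:

```python
import numpy as np

def should_restart(losses, window=10, min_improve=0.01, spike=2.0):
    """Heuristic restart check on a list of per-epoch losses.

    Flags a restart if the recent smoothed loss stopped improving
    (plateau) or suddenly jumped (divergence). Thresholds are
    arbitrary placeholders, not joliGEN defaults.
    """
    if len(losses) < 2 * window:
        return False  # not enough history yet
    recent = np.mean(losses[-window:])
    previous = np.mean(losses[-2 * window:-window])
    plateaued = previous - recent < min_improve * abs(previous)
    diverged = recent > spike * previous
    return plateaued or diverged

# usage: feed it the generator loss after each epoch
history = [1.0, 0.9, 0.8, 0.7, 0.65] + [0.64] * 20
print(should_restart(history))  # True: the loss has plateaued
```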

YoannRandon commented 10 months ago

I'll try GTA5-to-Cityscapes as an external reference for the feasibility of this specific use case.
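In case it helps with that setup: joliGEN's labeled-mask dataset mode reads a paths.txt file per domain listing "image_path mask_path" pairs, as described in the joliGEN dataset docs (double-check the exact layout there before relying on this). A minimal sketch to generate those files, with hypothetical directory names:

```python
from pathlib import Path

def write_paths_file(domain_dir, img_subdir="imgs", mask_subdir="mask"):
    """Write a joliGEN-style paths.txt listing 'image mask' pairs.

    Assumes the one-pair-per-line 'image_path mask_path' format from
    the joliGEN dataset docs; verify against the current docs.
    """
    domain = Path(domain_dir)
    lines = []
    for img in sorted((domain / img_subdir).glob("*.png")):
        mask = domain / mask_subdir / img.name  # assumes matching filenames
        if mask.exists():
            lines.append(f"{img_subdir}/{img.name} {mask_subdir}/{img.name}")
    (domain / "paths.txt").write_text("\n".join(lines) + "\n")
    print(f"{domain_dir}: wrote {len(lines)} pairs")

# hypothetical layout: trainA = GTA5 (synthetic), trainB = Cityscapes (real)
write_paths_file("gta2cityscapes/trainA")
write_paths_file("gta2cityscapes/trainB")
```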

beniz commented 10 months ago

[image: generated sample]

This is to illustrate my argument: the synthetic data has basically no trees or natural vegetation, so the discriminator pushes for them on the fake images, which is not the goal here. This happens because the synthetic and real domains need to be meaningfully related to each other.

Also, the classes in the synthetic and real domains may not be the same, which is wrong and makes semantic conservation much more difficult.

[image: masks from both domains; different colors mean different classes]
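To catch this class mismatch early, one can compare the sets of class IDs actually present in each domain's masks. A minimal sketch under the same single-channel-mask assumption as above, with hypothetical paths:

```python
import numpy as np
from PIL import Image
from pathlib import Path

def class_ids(mask_dir):
    """Collect the set of class IDs used across all masks in a directory."""
    ids = set()
    for p in sorted(Path(mask_dir).glob("*.png")):
        ids.update(np.unique(np.array(Image.open(p))).tolist())
    return ids

synth = class_ids("trainA/mask")  # synthetic domain (hypothetical path)
real = class_ids("trainB/mask")   # real domain (hypothetical path)
print("only in synth:", sorted(synth - real))
print("only in real: ", sorted(real - synth))
# non-empty differences mean the label spaces disagree, which breaks
# semantic conservation between the domains
```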