AAnoosheh / ToDayGAN

http://arxiv.org/abs/1809.09767
BSD 2-Clause "Simplified" License

Regarding Training dataset folder names #26

Closed shivom9713 closed 3 years ago

shivom9713 commented 3 years ago

I want to further train the pre-trained model (the 150-epoch checkpoint) on my custom dataset. How should I name the training folders? I tried putting my day images in "train0" and my night images in "train1", but during training I see domain A sometimes being night and sometimes being day in the checkpoint folder. This confuses the generator: it trains in one direction at some epochs and in the other direction at the rest. Please suggest the correct setup so that I can train in a consistent manner.

I highly doubt that the algorithm picks from both folders in ascending name order.

AAnoosheh commented 3 years ago

Hi, thanks for the issue.

At least according to Python's sorted function, it should load the folders in order train0, then train1: https://github.com/AAnoosheh/ToDayGAN/blob/12de3af4f209cd227bdc54160fc3164f97f6732d/data/unaligned_dataset.py#L15

If you're sure your folders don't contain mixed-up night/day images, then the only other thing I can think of right now is that you have more than these two folders inside the dataroot. Technically my code comes from the ComboGAN repo, which allows any number of folders to be loaded (and will load them all, but only use the first two).

It does not load the images by filename from inside the folders; it only loads the folders inside your dataroot in alphabetical order.
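In other words, the domain index is fixed by an alphabetical sort of the subfolder names inside the dataroot. A minimal sketch (not the repo's actual loader; the folder names are illustrative) of that behavior:

```python
import os
import tempfile

# Create a fake dataroot with subfolders made in a scrambled order,
# to show that sorted() still yields a deterministic domain order.
with tempfile.TemporaryDirectory() as dataroot:
    for name in ("train1", "train0", "train2_extra"):  # deliberately out of order
        os.mkdir(os.path.join(dataroot, name))

    # Alphabetical sort: 'train0' is always domain 0, 'train1' domain 1.
    # Note that an extra folder like 'train2_extra' is still picked up,
    # which is why stray folders in the dataroot can cause surprises.
    folders = sorted(d for d in os.listdir(dataroot)
                     if os.path.isdir(os.path.join(dataroot, d)))
    print(folders)  # ['train0', 'train1', 'train2_extra']
```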

shivom9713 commented 3 years ago

Thanks for the response. I read the paper and subsequently trained on some 4,500 images from night to day. I am using the model to improve car object detection at night, but the issue is that the converted cars look animated (not real). Can you suggest something to improve the conversion, mostly focused on generating better-quality cars in the image?

You can check the results in this PPT Test results on ToDAY.pptx

I further trained the pre-trained model for around 15 epochs.

AAnoosheh commented 3 years ago

Cool work. Given that mine was trained on a very specific dataset with unique roads/cars/lighting/camera parameters, maybe try training for more than 15 epochs; I would say even around 35-40.

(This model will almost never overfit, so you can actually train for as long as your compute budget allows.)

In general, though, the results will never be "completely" photorealistic, and it's questionable as to whether this can generate a reliable day-night paired synthetic dataset.

Upgrading this framework with modern advancements in GANs plus multi-modal generation might, however, produce higher-fidelity images with different modalities (i.e. multiple versions of day images per night image).

I would also try training from scratch instead of fine-tuning, just to see whether it does better or worse.

shivom9713 commented 3 years ago

Thank you very much :) Sure, I will try training from scratch and see. Can you explain what lambda_latent and lambda_forward do? Can they help improve car-conversion quality?
And what should their values be (just a rough idea)? I tried adding them, but the results were not great, so I just went with --lambda_identity=5 and lambda_cycle=10.0. Thanks a lot!
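(For context, a hypothetical invocation along those lines; the script name, paths, and exact flag syntax below are illustrative assumptions, only the lambda names come from this thread:)

```shell
# Illustrative only -- script name and dataroot path are assumptions.
python train.py --dataroot ./datasets/day2night \
                --lambda_identity 5 \
                --lambda_cycle 10
```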

AAnoosheh commented 3 years ago

They were there to help with things from the ComboGAN project I did before this.
Because that project aims to translate among any number N of domains, it sometimes benefits from extra regularization and losses. But as we'll see, we don't need them here.

lambda_latent puts an identity loss on the halfway-network features of an image, which we call z here:
A -> zA -> B -> zB -> A'
The normal lambda_cycle is between A and A', while lambda_latent would be between zA and zB.
(This is not useful in our case.)

Then lambda_forward is between A and B, which seems odd, but it is useful if you want to preserve some information between both images, like color (so we actually don't want to use it in this project).

Finally, lambda_identity performs a separate operation A -> A'' using A's encoder and decoder directly, skipping B altogether, and then compares A and A''. (It shouldn't really help much here, in theory.)
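The four losses above can be laid out schematically. A minimal sketch with toy scalar stand-ins for the encoder/decoder (pure assumptions, not the repo's actual networks or code), just to show which pair of quantities each lambda compares along the chain A -> zA -> B -> zB -> A':

```python
def l1(x, y):
    """L1 distance; stand-in for the reconstruction losses."""
    return abs(x - y)

enc = lambda x: 0.5 * x   # toy "encoder": image -> halfway features z
dec = lambda z: 2.0 * z   # toy "decoder": z -> image

A = 1.0                   # stand-in for an input image from domain A
zA = enc(A)               # halfway features of A
B = dec(zA)               # translated image in domain B
zB = enc(B)               # halfway features of the translation
A_cyc = dec(zB)           # reconstructed A'

loss_cycle    = l1(A, A_cyc)   # lambda_cycle:    A  vs A'
loss_latent   = l1(zA, zB)     # lambda_latent:   zA vs zB
loss_forward  = l1(A, B)       # lambda_forward:  A  vs B (keeps e.g. color)

A_idt = dec(enc(A))            # A'' via A's own encoder/decoder, skipping B
loss_identity = l1(A, A_idt)   # lambda_identity: A  vs A''

# With these invertible toy maps every loss is exactly 0.0; real
# encoders/decoders are not invertible, so each loss acts as a
# regularizer pulling its pair of quantities together.
print(loss_cycle, loss_latent, loss_forward, loss_identity)
```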

shivom9713 commented 3 years ago

@AAnoosheh Thanks a lot for your effort and time!!