IyatomiLab / LeafGAN

Training tips #6

Open BartvanMarrewijk opened 3 years ago

BartvanMarrewijk commented 3 years ago

Hi @huuquan1994 ,

I have successfully trained CycleGAN, LeafGAN, and a custom LeafGAN that uses a Detectron segmentation model on the 2020 Plant Pathology dataset. I only used two classes of this dataset: healthy and scab. Using this data I trained a ResNet-50, which served as a benchmark. Then I trained this ResNet again, but now with extra data from one of the GANs (50% of the healthy training images were converted to scab images). For validation, the 2021 Plant Pathology dataset was used.
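
For reference, the way I mixed the GAN output into the classifier's training data was roughly the sketch below; the folder layout and file patterns are specific to my setup (i.e. hypothetical here), so treat it as an illustration rather than the exact script:

```python
import random
import shutil
from pathlib import Path

# Hypothetical folder layout: real training images per class, plus the
# fake scab images produced by running the trained generator on healthy images.
healthy_dir = Path("dataset/train/healthy")          # real healthy training images
scab_dir = Path("dataset/train/scab")                # real scab training images
fake_scab_dir = Path("results/healthy2scab/images")  # GAN outputs (assumed location)

# Take GAN-translated versions corresponding to roughly 50% of the healthy
# training images and add them to the scab class as extra (synthetic) data.
fake_images = sorted(fake_scab_dir.glob("*.png"))
n_extra = min(len(fake_images), len(list(healthy_dir.glob("*.jpg"))) // 2)

random.seed(0)
for img in random.sample(fake_images, k=n_extra):
    shutil.copy(img, scab_dir / f"gan_{img.name}")
```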

Unfortunately, the classification accuracy does not improve with any of the GAN datasets. At first I thought that some images were not transferred well, but even after removing the unclear images and retraining the classifier, the result was still not better than the benchmark.

Do you have any suggestions, e.g., for training parameters? I know that in your paper you refer to the CycleGAN paper, but even there the training tips are minimal. Did you change any parameters to get the results described in your paper? Or do you think the added value of a GAN is mainly interesting when datasets are unbalanced?

huuquan1994 commented 3 years ago

Hi @studentWUR,

Thanks for your question! The key idea of LeafGAN is to increase the variety of backgrounds in the training data. Its effectiveness therefore depends on the diversity of the healthy input images (since we use them to generate the fake diseased images). I've taken a look at the 2020 Plant Pathology dataset. From my point of view, the healthy apple images do not seem to have a wide variety of backgrounds, and the backgrounds are quite similar to those of the diseased images. (Again, this is just my first impression since I didn't check the details of this dataset.)

[image: healthy apple images taken from this post]

Therefore, I'm not sure whether LeafGAN works well on this dataset. In this case, I think that if we can somehow segment the leaf area and "paste" it onto different backgrounds (not only apple backgrounds but also, for example, wild backgrounds), then the performance should improve. Or maybe heavier augmentations with Albumentations could help.
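
For example, a heavier Albumentations pipeline could look something like this sketch (the particular transforms and parameters here are just an illustration, not something we used in the paper):

```python
import albumentations as A
from albumentations.pytorch import ToTensorV2

# Example of a heavier augmentation pipeline for leaf images.
train_transform = A.Compose([
    A.Resize(256, 256),
    A.RandomCrop(224, 224),
    A.HorizontalFlip(p=0.5),
    A.ShiftScaleRotate(shift_limit=0.1, scale_limit=0.2, rotate_limit=30, p=0.7),
    A.RandomBrightnessContrast(brightness_limit=0.3, contrast_limit=0.3, p=0.7),
    A.HueSaturationValue(p=0.5),
    A.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
    ToTensorV2(),
])

# Albumentations works on NumPy arrays (H, W, C):
# augmented = train_transform(image=np_image)["image"]
```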

> Do you have any suggestions, e.g., for training parameters? I know that in your paper you refer to the CycleGAN paper, but even there the training tips are minimal. Did you change any parameters to get the results described in your paper? Or do you think the added value of a GAN is mainly interesting when datasets are unbalanced?

In our experiments, we fine-tuned a pretrained ResNet-101 using the SGD optimizer with lr=0.001 and momentum=0.9. Here are the training hyperparameters:

```python
# model_ft is the ImageNet-pretrained ResNet-101 being fine-tuned
optimizer_ft = torch.optim.SGD(model_ft.parameters(), lr=0.001, momentum=0.9)
# decay the learning rate by a factor of 10 every 7 epochs
exp_lr_scheduler = torch.optim.lr_scheduler.StepLR(optimizer_ft, step_size=7, gamma=0.1)
```

The input images are normalized with the ImageNet mean and standard deviation:

```python
torchvision.transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
```
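
For completeness, a minimal fine-tuning sketch built around those settings might look like the following; the dataset path, batch size, and number of epochs are placeholders, and it follows the standard PyTorch transfer-learning recipe rather than being our exact training script:

```python
import torch
import torch.nn as nn
import torchvision
from torchvision import datasets, transforms

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# ImageNet-style preprocessing with the normalization above
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

# Placeholder dataset path; classes are inferred from the folder names
train_set = datasets.ImageFolder("dataset/train", transform=preprocess)
train_loader = torch.utils.data.DataLoader(train_set, batch_size=32, shuffle=True, num_workers=4)

# Pretrained ResNet-101 with a new classification head
model_ft = torchvision.models.resnet101(pretrained=True)
model_ft.fc = nn.Linear(model_ft.fc.in_features, len(train_set.classes))
model_ft = model_ft.to(device)

criterion = nn.CrossEntropyLoss()
optimizer_ft = torch.optim.SGD(model_ft.parameters(), lr=0.001, momentum=0.9)
exp_lr_scheduler = torch.optim.lr_scheduler.StepLR(optimizer_ft, step_size=7, gamma=0.1)

for epoch in range(25):  # number of epochs is a placeholder
    model_ft.train()
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        optimizer_ft.zero_grad()
        loss = criterion(model_ft(images), labels)
        loss.backward()
        optimizer_ft.step()
    exp_lr_scheduler.step()
```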

Also, the training datasets of our classifiers were balanced. Specifically:

- Baseline: train on 4 diseases, 2000 images each.
- Baseline + CycleGAN/LeafGAN: train on 4 diseases, 2000 + 717 = 2717 images each.

I hope this helps. Good luck with your model!

BartvanMarrewijk commented 3 years ago

Hi @huuquan1994 ,

Thank you for the elaborate answer.

> Therefore, I'm not sure whether LeafGAN works well on this dataset. In this case, I think that if we can somehow segment the leaf area and "paste" it onto different backgrounds (not only apple backgrounds but also, for example, wild backgrounds), then the performance should improve. Or maybe heavier augmentations with Albumentations could help.

I fully agree that this could work, but then the higher performance would probably be caused mainly by this augmentation rather than by the GAN. My hypothesis was actually that with LeafGAN we can generate examples that have the same background but a different leaf (healthy or scab). Consequently, I expected the classifier to learn better, because the only differences between these images are the symptoms on the leaf.

> Also, the training datasets of our classifiers were balanced. Specifically: Baseline: train on 4 diseases, 2000 images each. Baseline + CycleGAN/LeafGAN: train on 4 diseases, 2000 + 717 = 2717 images each.

Thank you for the clarification and the training suggestions. Actually, a nice test for the LeafGAN dataset is to determine the increase in performance when only the 717 healthy images are added to the baseline. In the table below, my results improve when compared with the original dataset (Original (only fgvc7) vs. the GANs). However, when I add the 64 healthy images to the original dataset (Original (only fgvc7) + 64 healthy), the results of the GANs are actually disappointing. Maybe I should have trained the Detectron GAN with more classes. And I agree with you: if the backgrounds of the scab images were more varied, the added value of LeafGAN would increase.


| Validation on fgvc8 (713, 670) | Train size (healthy, scab) | Best.pt | Last.pt |
| -- | -- | -- | -- |
| Original (only fgvc7) | 364 / 412 | 72.0 | 71.4 |
| Original (only fgvc7) + 64 healthy | 418 / 412 | 77.3 | 71.8 |
| CycleGAN + 100% validA->B | 364 / 466 | 73.5 | 67.2 |
| LeafGAN + 100% validA->B | 364 / 466 | 73.02 | 68.18 |
| Detectron GAN + 100% validA->B | 364 / 466 | 73.5 | 69.0 |