chansey0529 / LSO

The official PyTorch implementation of our paper Where is My Spot? Few-shot Image Generation via Latent Subspace Optimization, CVPR 2023.

Custom Dataset Training #5

Open Jmipar-k opened 5 months ago

Jmipar-k commented 5 months ago

Hello!

I have seen your great work!

I am trying to apply this to a medical application.

I would like to know

  1. If training with my own medical custom dataset is possible.
  2. If I would need to train new stylegan generators and discriminators.
  3. If so, how I could get inverted latent codes about my dataset.

Thank You!

chansey0529 commented 5 months ago

Hello! Thanks for your interest. Q1 & Q2: Yes, it is possible to train on a custom dataset, and yes, you would need to train a new StyleGAN generator and discriminator. The process involves several steps:

  1. Separate the dataset into a seen subset and an unseen subset according to a split of seen/unseen categories.
  2. Train the StyleGAN generator and discriminator on the seen samples.
  3. Invert the unseen samples into the latent space (see the answer to Q3).
  4. Apply our code to synthesize images for the unseen category.
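The first step, splitting by category rather than by individual image, could look roughly like the sketch below. This is only an illustration and not code from this repository: the function name `split_by_category`, the `(path, category)` input format, and the example file names are all assumptions.

```python
import random
from collections import defaultdict


def split_by_category(samples, num_unseen, seed=0):
    """Split (path, category) pairs into seen/unseen groups with disjoint categories.

    Hypothetical helper for illustration; the repository may organize data differently.
    """
    by_category = defaultdict(list)
    for path, category in samples:
        by_category[category].append(path)

    categories = sorted(by_category)
    if not 0 < num_unseen < len(categories):
        raise ValueError("num_unseen must leave at least one seen category")

    # Shuffle with a fixed seed so the split is reproducible.
    rng = random.Random(seed)
    rng.shuffle(categories)
    unseen_categories = set(categories[:num_unseen])

    seen = {c: p for c, p in by_category.items() if c not in unseen_categories}
    unseen = {c: p for c, p in by_category.items() if c in unseen_categories}
    return seen, unseen


# Made-up example: 100 images spread evenly over 5 categories.
samples = [(f"img_{i:03d}.png", f"class_{i % 5}") for i in range(100)]
seen, unseen = split_by_category(samples, num_unseen=1)
```

The seen images would then go to StyleGAN training, while the unseen ones are held out for inversion and few-shot generation.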

Q3: This is related to issue #4. In short, you can obtain the latent codes by replacing the pretrained StyleGAN in II2S with your own and running the optimization for each unseen sample.
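To make the idea of per-sample optimization concrete, here is a deliberately tiny sketch: the generator is frozen and only the latent code is updated by gradient descent until the generated output matches the target. It is not the II2S implementation. The linear "generator", the plain squared-error loss, and all names (`generate`, `invert`, `reconstruction_error`) are stand-ins chosen for illustration; real inversion would use the trained StyleGAN and the losses defined by II2S.

```python
def generate(weights, latent):
    """Toy frozen 'generator': a fixed linear map from latent code to pixels."""
    return [sum(w * z for w, z in zip(row, latent)) for row in weights]


def reconstruction_error(weights, latent, target):
    """Squared error between the generated output and the target sample."""
    return sum((g - t) ** 2 for g, t in zip(generate(weights, latent), target))


def invert(weights, target, steps=500, lr=0.05):
    """Optimize a latent code for one target while the generator stays fixed."""
    latent = [0.0] * len(weights[0])
    for _ in range(steps):
        residual = [g - t for g, t in zip(generate(weights, latent), target)]
        # Gradient of the squared error with respect to each latent dimension.
        grad = [
            2.0 * sum(r * row[j] for r, row in zip(residual, weights))
            for j in range(len(latent))
        ]
        latent = [z - lr * g for z, g in zip(latent, grad)]
    return latent


# Made-up 2-D latent space and 3-"pixel" output, purely for illustration.
weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
target = generate(weights, [0.5, -1.0])  # a sample the generator can reproduce
latent = invert(weights, target)
error = reconstruction_error(weights, latent, target)
```

In this toy setting the target lies exactly in the generator's range, so the error goes to zero. With a real GAN trained on few images, an unseen sample may not be well covered by the latent space, which is why inversion quality is a concern.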

If you are implementing the method for a quantitative evaluation, we recommend organizing your dataset, pretrained StyleGAN2, and inverted latent codes in the same way as the provided ones, so that the few-shot generation procedure can be reproduced. If you have further questions, feel free to contact us. We wish you success in reproducing the results.

Jmipar-k commented 5 months ago

I am thankful for your quick reply!

I have an additional question to ask.

I have only 100 images in total. If I split them 99/1 or 97/3, do you think 97-99 images (the seen subset) would be enough to train the StyleGAN generator and discriminator?

chansey0529 commented 5 months ago

This is an extreme case. I believe StyleGAN can be trained on approximately 100 images with the aid of the ADA technique. However, I cannot guarantee that the latent space of a GAN trained on only around 100 samples is robust enough for GAN inversion.