NVlabs / stylegan3

Official PyTorch implementation of StyleGAN3

Immediate model collapse when training a conditional model. Does not happen when training unconditionally on the same source data. #627

Open idzard-intentdev opened 7 months ago

idzard-intentdev commented 7 months ago

**Describe the bug**
After training a conditional model (--cond True --cfg=stylegan2) with only two class labels, the model generates the same image for every seed. (Class selection works fine.) This does not happen when training an unconditional model on the same source data. I have a feeling I'm doing something wrong, but I can't find the solution, since as far as I can tell I'm doing everything according to the documentation. Perhaps somebody can shed some light...

**To Reproduce**
Steps to reproduce the behavior:

  1. For testing, I created a very small dataset.zip containing 10 PNG images in the folder "00000". (The small image set is only for demonstrating this bug. I first ran a conditional training with 10k images, which had the same issue.) The dataset.json in the root of the zip file contains this for the 10 images:
    
    {
    "labels": [
        [
            "00000/img00000000.png",
            0
        ],
        [
            "00000/img00000001.png",
            0
        ],
        [
            "00000/img00000002.png",
            0
        ],
        [
            "00000/img00000003.png",
            0
        ],
        [
            "00000/img00000004.png",
            0
        ],
        [
            "00000/img00000005.png",
            1
        ],
        [
            "00000/img00000006.png",
            1
        ],
        [
            "00000/img00000007.png",
            1
        ],
        [
            "00000/img00000008.png",
            1
        ],
        [
            "00000/img00000009.png",
            1
        ]
    ]
    } 

  2. I train the conditional model like this:
  In the 'stylegan3' directory, run the command 'python train.py --cond True --kimg=15000 --outdir xxxx --data xxxx --cfg stylegan2 --gpus=8 --batch=32 --gamma=10 --snap=20 --mirror=1 --metrics fid50k_full'

  3. Error: after training, the class label correctly selects which class of image to output, but the image stays (almost) the same when the seed changes.
(This does not happen when training an unconditional model on the same source data.)
When I inspect the fakes***.png files, I see that very early in training all images of the same class start to become very similar, until they are almost identical after a few kimg (see screenshots below).
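
As a sanity check on the dataset from step 1, the labels in dataset.json can be summarized before training. The following is only a minimal sketch, not part of the StyleGAN3 code base: the function name `summarize_labels` is hypothetical, and it assumes the layout shown above (a dataset.json in the root of the zip whose "labels" entry is a list of [filename, integer class] pairs).

```python
import json
import zipfile
from collections import Counter


def summarize_labels(zip_path):
    """Return (images per class, PNG files that have no label entry).

    Assumes dataset.json sits in the root of the zip and that each label
    is a single integer class index, as in the example above.
    """
    with zipfile.ZipFile(zip_path) as zf:
        meta = json.loads(zf.read("dataset.json"))
        # "labels" is a list of [filename, class] pairs; treat null as empty.
        labels = dict(meta.get("labels") or [])
        images = [n for n in zf.namelist() if n.lower().endswith(".png")]
    class_counts = Counter(labels.values())
    unlabeled = sorted(n for n in images if n not in labels)
    return class_counts, unlabeled
```

Calling `summarize_labels("dataset.zip")` (the path is a placeholder) on the 10-image example should report five images for class 0, five for class 1, and no unlabeled files. If that holds, the label file itself is probably not the cause of the collapse.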

**Expected behavior**
I would expect to get a different output image for each seed.

**Screenshots**
At fakes0000.png each seed has a different image:
![image](https://github.com/NVlabs/stylegan3/assets/63593546/fe717797-c7b2-4eb2-b03d-95c3cd71cbb4)

At fakes00400.png, all images from the same class are already almost the same:
![image](https://github.com/NVlabs/stylegan3/assets/63593546/15f520a5-5c6d-4c93-a7fb-9f1346dab99a)

At fakes03120.png, all images from the same class are the same:
 ![image](https://github.com/NVlabs/stylegan3/assets/63593546/07744b88-48eb-4f58-bc39-f7ca6bd1843c)
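
The loss of variation visible in the screenshots could also be expressed as a number, which makes it easier to compare the conditional and unconditional runs. The sketch below is an assumption on my part, not something StyleGAN3 provides: `mean_pairwise_distance` is a hypothetical helper that takes images of one class, generated from different seeds and flattened to equal-length lists of pixel values, and returns their mean pairwise Euclidean distance.

```python
import itertools
import math


def mean_pairwise_distance(images):
    """Mean Euclidean distance over all pairs of flattened images.

    `images` is a list of equal-length sequences of pixel values, one per
    seed. A value near zero means the seeds produce nearly identical output.
    """
    pairs = list(itertools.combinations(images, 2))
    if not pairs:
        raise ValueError("need at least two images")
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)
```

For example, images loaded as NumPy arrays could be passed in as `img.ravel().tolist()`. A value that drops toward zero between early and late snapshots for a fixed class, while staying clearly above zero for the unconditional model, would match the behavior described here.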

**Desktop:**
 - OS: Linux Ubuntu 23.04
 - PyTorch version: 2.1.0
 - CUDA toolkit version: 11.8
 - NVIDIA driver version: 535.129.03
 - GPU: 8x A6000
 - Docker: No

Thanks for any help!