VSehwag / minimal-diffusion

A minimal yet resourceful implementation of diffusion models (along with pretrained models + synthetic images for nine datasets)
MIT License
244 stars, 38 forks

About training on CelebA #7

Open Thekey756 opened 4 months ago

Thekey756 commented 4 months ago

Thank you for providing the code. I'm trying to train the model on CelebA. I downloaded the dataset and found that it has no labels; however, the function get_metadata() in data.py contains the following:

    elif name == "celeba":
        metadata = EasyDict(
            {
                "image_size": 64,
                "num_classes": 4,
                "train_images": 109036,
                "val_images": 12376,
                "num_channels": 3,
            }
        )

I don't know whether you modified CelebA: the full dataset contains 200K+ images, but train_images is only 109036. I also don't understand why num_classes is 4.

In addition, I found the script used for training on Celeba:

CUDA_VISIBLE_DEVICES=4,5,6,7 python -m torch.distributed.launch --nproc_per_node=4 --master_port 8107 main.py \
    --arch UNet --dataset celeba --class-cond --epochs 100 --batch-size 128 --sampling-steps 50 \
    --data-dir ~/datasets/misc/celebA_male_smile_64_balanced/train/ 

In the CelebA copy I downloaded, there is no folder named celebA_male_smile_64_balanced/train.

I look forward to your reply. Thanks in advance.

VSehwag commented 1 week ago

Hi, I used a custom class-conditional CelebA dataset to train this model. It was built from two binary attributes: 1) gender (male/female) and 2) smiling/not smiling. Their four combinations are why the CelebA metadata lists four classes. Feel free to construct any other variation of the CelebA dataset for conditional diffusion training.
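For anyone who wants to rebuild a similar split, here is a minimal sketch, not the script used for the released models. It assumes the standard list_attr_celeba.txt layout (an image count, a row of attribute names, then one row per image with values 1 or -1) and uses its "Male" and "Smiling" columns. The class encoding (2 * male + smiling) and the function names parse_celeba_attrs and balance_classes are hypothetical choices made for illustration. The downsampling step is an inference: the folder name ends in "balanced", and 109036 = 4 × 27259 and 12376 = 4 × 3094 are consistent with equal-size classes, but the thread does not say how the balancing was done.

```python
import random
from collections import defaultdict


def parse_celeba_attrs(lines):
    """Map each image filename to a class index in {0, 1, 2, 3}.

    Expects the layout of CelebA's list_attr_celeba.txt: an image count,
    a row of attribute names, then one row per image with values 1 or -1.
    The encoding 2 * male + smiling is an arbitrary choice for this sketch.
    """
    names = lines[1].split()
    male_col = names.index("Male")
    smile_col = names.index("Smiling")
    labels = {}
    for row in lines[2:]:
        parts = row.split()
        if not parts:
            continue  # skip blank trailing lines
        filename, values = parts[0], parts[1:]
        male = int(values[male_col] == "1")
        smiling = int(values[smile_col] == "1")
        labels[filename] = 2 * male + smiling
    return labels


def balance_classes(labels, seed=0):
    """Downsample every class to the size of the smallest one (assumed strategy)."""
    by_class = defaultdict(list)
    for filename, label in sorted(labels.items()):
        by_class[label].append(filename)
    n = min(len(files) for files in by_class.values())
    rng = random.Random(seed)
    return {label: sorted(rng.sample(files, n)) for label, files in by_class.items()}
```

To use it, read the attribute file with open(...).read().splitlines(), pass the lines to parse_celeba_attrs, then resize the selected images to 64x64 and copy them into whatever directory layout the loader in data.py expects (one subfolder per class is a common convention, but check the loader first). If you build a different variation, remember to update num_classes, train_images, and val_images in get_metadata() to match.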