explainingai-code / DDPM-Pytorch

This repo implements Denoising Diffusion Probabilistic Models (DDPM) in Pytorch

What changes would we need to make if we used our own dataset? #1

Open awais00012 opened 7 months ago

awais00012 commented 7 months ago

Thanks for the awesome explanation. Could you tell me which changes we need before training the model on our data?

explainingai-code commented 7 months ago

Hello,

Thanks for the appreciation. I apologize, that should have been part of the README; I have updated it now. Can you take a look at https://github.com/explainingai-code/DDPM-Pytorch/blob/main/README.md#training-on-your-own-images and let me know in case you face any issues?

awais00012 commented 7 months ago

Thanks for updating the repo to support training on a custom dataset. However, I am facing this issue when I try the model on my own data. My dataset, named ultrasound256CH1, contains train and test images. All the images are 256x256 with 1 channel.

image

explainingai-code commented 7 months ago

Can you tell me the im_path value you used in the config? And also the directory structure of your dataset: is it $REPO_ROOT/ultrasound256CH1/train/*.png?

The error basically means that the code wasn't able to find any png files in the location it was searching.

awais00012 commented 7 months ago

Yes, the dataset is in the repo root, but I am still getting this error. How can I solve it?

image

image

explainingai-code commented 7 months ago

Got it. Create a subfolder 'images' inside the train directory and put all training png files in there, so that the layout is $REPO_ROOT/ultrasound256CH1/train/images/*.png

Leave the config as it is, pointing to "ultrasound256CH1/train". Can you try that and let me know if it works?

awais00012 commented 7 months ago

yes i tried, unfortunately it does not work.

explainingai-code commented 7 months ago

Can you print the directory and path the code is searching at https://github.com/explainingai-code/DDPM-Pytorch/blob/main/dataset/mnist_dataset.py#L40 and share that.

print(d_name, os.path.join(im_path, d_name, '*.{}'.format(self.im_ext)))

Also comment out line https://github.com/explainingai-code/DDPM-Pytorch/blob/main/dataset/mnist_dataset.py#L42
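For anyone hitting the same "no png files found" error, the lookup being debugged here can be reproduced in isolation. This is a minimal sketch based only on the print statement above: it assumes the dataset class lists each subfolder of im_path and globs for *.<im_ext> inside it. The helper name find_images is hypothetical and not part of the repo.

```python
import glob
import os


def find_images(im_path, im_ext="png"):
    """Return {subfolder: [files]} using the same pattern as the print above.

    Hypothetical helper (not from the repo). Images placed directly in
    im_path are not matched; they must sit one level down.
    """
    found = {}
    for d_name in sorted(os.listdir(im_path)):
        sub_dir = os.path.join(im_path, d_name)
        if not os.path.isdir(sub_dir):
            continue  # a loose file in im_path is skipped by this sketch
        pattern = os.path.join(im_path, d_name, "*.{}".format(im_ext))
        found[d_name] = sorted(glob.glob(pattern))
    return found
```

Calling find_images("ultrasound256CH1/train") and seeing an empty dict, or only empty lists, would suggest that the images are not inside a subfolder or that the extension does not match.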

awais00012 commented 7 months ago

That error has been resolved. It was caused by the arrangement of the dataset: I created 5 class folders of different images for the train data and used "data/train/" as the path, and it worked. Now I am encountering this error:

image

explainingai-code commented 7 months ago

You are training on CPU as of now, right? Also, can you confirm that your conda environment has python3.8 and the requirements installed as mentioned in https://github.com/explainingai-code/DDPM-Pytorch/tree/main?tab=readme-ov-file#quickstart

awais00012 commented 7 months ago

Hi sir, I kept the batch size at 10 and just want to run for 40 epochs; there are only 828 images in total. Could you please tell me why the model requires so much memory, and how I can handle this issue?

RuntimeError: CUDA out of memory. Tried to allocate 640.00 GiB (GPU 0; 14.75 GiB total capacity; 2.16 GiB already allocated; 11.63 GiB free; 2.17 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF.

explainingai-code commented 7 months ago

It's because the images are 256x256 and by default the model config downsamples only twice. A few things you can try so that you are able to train:

  1. Resize the images to 64x64 in the loader and train diffusion model on these 64x64 images
  2. Have all three down blocks to downsample by placing down_sample : [True, True, True] in the config
  3. Try with num_mid_layers : 1 in the config
  4. Reduce number of Midblocks by changing mid_channels : [256, 128] in config

I think that should reduce the model size considerably and should allow you to train.
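A rough back-of-the-envelope check shows where a number like 640 GiB can come from. The sketch below assumes (this is not confirmed in the thread) that a self-attention layer runs over every pixel at the full 256x256 resolution, with 4 attention heads and float32 values. Under those assumptions the attention score tensor alone, for a batch of 10, is the size of the failed allocation, while the same layer on 64x64 inputs needs only a few GiB.

```python
def attention_scores_gib(batch_size, num_heads, height, width, bytes_per_value=4):
    """Size in GiB of one (tokens x tokens) attention score tensor.

    Every pixel is treated as one token, so tokens = height * width.
    The head count and float32 width are assumptions for illustration.
    """
    tokens = height * width
    total_bytes = batch_size * num_heads * tokens * tokens * bytes_per_value
    return total_bytes / 2**30


print(attention_scores_gib(10, 4, 256, 256))  # 640.0
print(attention_scores_gib(10, 4, 64, 64))    # 2.5
```

The cost grows with the square of the pixel count, which is why resizing to 64x64 or adding another downsampling step before attention has such a large effect.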

awais00012 commented 6 months ago

Thanks, the model works very well; I trained it on my datasets. I would request that you add a few more things to the repo for better results and for a comparative analysis between the original and generated images, such as classifier-free guidance, exponential moving average, and IS and FID scores. Looking forward to your outstanding and easy-to-follow implementation!

explainingai-code commented 6 months ago

Yes, I wanted this repo to be an intro to diffusion, which is why I didn't want to add those and instead left it as a bare-minimum diffusion repo. I do plan to create a stable diffusion repo which should have some of these incorporated. Once that is done, I will try to add the parts you mentioned here as well (if I am able to do that without adding too much complexity to the current implementation).

thatdev6 commented 4 months ago

Hello,

I made all the relevant changes mentioned in the readme and in this thread but after my images are loaded I get an AttributeError.

image

explainingai-code commented 4 months ago

Hello @thatdev6, this code expects the path to contain png files, but that does not seem to be the case for the path you have provided. Are they npy files? In that case you would have to change this line

thatdev6 commented 4 months ago

No my path has png files

image

explainingai-code commented 4 months ago

Are you using the same code or have you made some modifications? According to the error, the list at the end of dataset initialization is a list of numpy.ndarray objects, which cannot happen with the repo code because the dataset class just fetches the filenames during initialization. Also, only 19 training images?

thatdev6 commented 4 months ago

Yes, I modified the loader function to load and downsample my images. They are rectangular and in jpg format. I figured out my mistake and it has been corrected.

This is how I modified the loader function:

image

I also changed the im channels to 3. Now I get a runtime error while training:

image image

explainingai-code commented 4 months ago

The shapes of two images that your dataset returns are different (3x3264x2448 and 3x2448x3264).

thatdev6 commented 4 months ago

Before converting to tensor by any chance did you forget to convert the numpy arrays from HxWx3 to 3xWxH ?

How would i fix that?

thatdev6 commented 4 months ago

I also modified the sample function for rectangular images:

image

explainingai-code commented 4 months ago

I don't think the 3xWxH ordering is an issue, because the error says that your image shapes are 3xWxH, so that's fine. But I think the images in your path are not all the same size: some are 3264x2448 and some are 2448x3264. Can you check this?

thatdev6 commented 4 months ago

Yes, I think you're right. So the solution would be to downsample all of them to 64x64?

explainingai-code commented 4 months ago

Yes, take a center square crop (2448x2448) and then resize to 64x64. How many images are there in your dataset?
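The crop-then-resize step can be sketched as follows. This is a minimal illustration, not code from the repo: the function names are hypothetical, and load_square_image assumes Pillow is installed (it is imported lazily so the box arithmetic can be checked on its own).

```python
def center_crop_box(width, height):
    """Return (left, top, right, bottom) of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)


def load_square_image(path, size=64):
    """Open an image, center-crop it to a square, and resize to size x size.

    Hypothetical helper; assumes Pillow is available.
    """
    from PIL import Image

    im = Image.open(path).convert("RGB")
    # PIL reports im.size as (width, height), matching center_crop_box.
    im = im.crop(center_crop_box(*im.size))
    return im.resize((size, size))


print(center_crop_box(3264, 2448))  # (408, 0, 2856, 2448)
print(center_crop_box(2448, 3264))  # (0, 408, 2448, 2856)
```

Both orientations end up as a 2448x2448 square before resizing, so every item the dataset returns has the same shape.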

thatdev6 commented 4 months ago

Around 600 images

thatdev6 commented 4 months ago

These are the changes I made to the loader and getitem functions. I assume there is no problem here, but for some reason the training gets interrupted (^C).

image image image

explainingai-code commented 4 months ago

A couple of things. First, move the image reading to the get_item method, just like the code in the repo: the load_images method should simply collect the filenames and nothing else. You can do the cropping and resizing in get_item as well. Second, can you check why it prints "Found 19 images" when it should be 600?
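The pattern described here (collect filenames up front, decode images only when indexed) might look like the sketch below. It is dependency-free for illustration; in a PyTorch project the class would subclass torch.utils.data.Dataset. The class name, the load_fn hook, and the one-subfolder-per-class layout are assumptions, not the repo's actual code.

```python
import glob
import os


class LazyImageDataset:
    """Keeps only file paths in memory and decodes an image when indexed.

    `load_fn` is a hypothetical hook where reading, center cropping and
    resizing would happen for a single file path.
    """

    def __init__(self, im_path, load_fn, im_ext="jpg"):
        self.load_fn = load_fn
        # Assumed layout: im_path/<subfolder>/*.<im_ext>
        pattern = os.path.join(im_path, "*", "*.{}".format(im_ext))
        self.images = sorted(glob.glob(pattern))
        print("Found {} images".format(len(self.images)))

    def __len__(self):
        return len(self.images)

    def __getitem__(self, index):
        # All per-image work is deferred to this point.
        return self.load_fn(self.images[index])
```

With this structure, start-up only lists files, so a problem in the training loop shows up without first waiting for every image to be read.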

thatdev6 commented 4 months ago

Okay, so first I should leave the loader function as it is and only modify it for jpg images; second, I should do the image formatting in the get item function. It says "Found 19 images" because at the moment I have only uploaded a subset of the dataset. It was quite annoying to wait for the images to load only to encounter an error in training.

thatdev6 commented 4 months ago

How do you suggest I fix this?

image

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 4.75 GiB. GPU 0 has a total capacity of 14.75 GiB of which 57.06 MiB is free. Process 15150 has 14.69 GiB memory in use. Of the allocated memory 11.21 GiB is allocated by PyTorch, and 3.35 GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)

These are the modifications I made:

image image image

explainingai-code commented 4 months ago

Reduce the batch size to 16. That should work I think.

thatdev6 commented 4 months ago

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 4.00 GiB. GPU 0 has a total capacity of 14.75 GiB of which 2.31 GiB is free. Process 85462 has 12.44 GiB memory in use. Of the allocated memory 9.43 GiB is allocated by PyTorch, and 2.87 GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)

thatdev6 commented 4 months ago

It started to train on batch size 4

thatdev6 commented 4 months ago

I cannot generate the results:

image

thatdev6 commented 4 months ago

I have also tried changing my dataset; the new dataset has square images of size 128x128. Yet I encountered the same runtime error while generating the pictures.

explainingai-code commented 4 months ago

What's the shape of xt that you get here and noisy_im here?

thatdev6 commented 4 months ago

I was able to sort this one out: I had to reshape the tensors. But I do have a question: will the reshaping affect my results? I made the changes in the unet file and the scheduler file.

thatdev6 commented 4 months ago

After training on the complete dataset, this is my output (x0):

image image

explainingai-code commented 4 months ago

I don't think you would need to do any reshaping. Can you just let me know the shapes of those two tensors? They should be of the same shape. After how many epochs of training are these results?

thatdev6 commented 4 months ago

After 40 epochs of training. Okay, I am posting the shapes.

thatdev6 commented 4 months ago

Shape of noisy_im: torch.Size([4, 3, 64, 64])
Shape of xt: torch.Size([100, 3, 28, 28])

explainingai-code commented 4 months ago

So if you set xt as 100x3x64x64, you are saying you get an error?

thatdev6 commented 4 months ago

I didn't explicitly set it. This is from the unmodified code, where I get the error when I start generating.

explainingai-code commented 4 months ago

xt was being created as 100x3x28x28 because the config has im_size=28 and the code was written for training on MNIST (28x28). It should have the same shape as your noisy images during training, so in your case xt should be (number of samples)x3x64x64.

thatdev6 commented 4 months ago

Okay, that makes sense, but there is one thing I don't understand: my sample size is different from the configured sample size, so do I have to change the configuration? At the moment I am generating 100 samples after my reshaping, but I should be generating 4. Did I get that right?

also does the reshaping affect my results?

explainingai-code commented 4 months ago

You can generate as many samples as you wish; you would not need any reshaping anywhere. Assuming you have trained your diffusion model on images of size 3x64x64, xt during sampling should be Nx3x64x64, where N is the number of samples you want.

Can you clarify where/why you require reshaping? Maybe I am misunderstanding something.
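The rule stated above (only the batch dimension may differ between training and sampling) can be written down as a small check. This is a sketch with plain tuples rather than tensors; im_size is the config key mentioned in this thread, while num_samples, im_channels and both function names are assumed for illustration.

```python
def sampling_shape(num_samples, im_channels, im_size):
    """Shape of the initial noise xt; it would be passed to e.g. torch.randn."""
    return (num_samples, im_channels, im_size, im_size)


def same_image_shape(train_batch_shape, xt_shape):
    """True when everything except the batch dimension matches."""
    return tuple(train_batch_shape[1:]) == tuple(xt_shape[1:])


noisy_im_shape = (4, 3, 64, 64)  # training batch shape reported in this thread

# im_size left at the MNIST default of 28: channels match, spatial size does not
print(same_image_shape(noisy_im_shape, sampling_shape(100, 3, 28)))  # False

# im_size set to 64: 100 samples is fine even though the batch size was 4
print(same_image_shape(noisy_im_shape, sampling_shape(100, 3, 64)))  # True
```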

thatdev6 commented 4 months ago

I cannot generate the results image

After I had trained my model and got to generation, the code showed me this output. After debugging, I saw that the sizes in the unet file were indeed different, so I made the following changes:

image

After this I got another error leading me to the linear scheduler file, where there were also some size mismatches, so I made the following changes:

image

After this I could generate images, but they were full of noise, so at the moment I am training on BW images to see if I get better results.

(P.S. I trained my model with im size 28, the default value that was set.)

explainingai-code commented 4 months ago

Okay, I would suggest removing all the resize portions that you added in both these places; the code should work without any of them. If it does not, then there might be some other issue, but no resize is required anywhere other than the dataset class. After you have changed the data loader to resize the images (say to WxH), you should train the model on 3xWxH and sample with xt being Nx3xWxH.

thatdev6 commented 4 months ago

The BW images gave the same result. This is at x0:

image

thatdev6 commented 4 months ago

Okay, so first I remove all the modifications in the unet and linear scheduler, but then how would I resolve the errors while generating? Also, my dataset has images of varying sizes: some are 3264 x 2448 and some are 2448 x 3264.

explainingai-code commented 4 months ago

For images of different sizes, continue doing what you were already doing in your dataset class, that is, centre crop and resize to 64x64. Then train your model on these 64x64 images and generate images with xt as 64x64. To avoid any confusion, you can also change the im_size here to 64 (instead of 28).

So basically the only change from the repo code that you should need is the centre cropping. You don't need to actually train a model to test this: just run the sampling script on a randomly initialized model and see if it still throws an error. If it does, share it here and I will take a look.