@pokameng Hi, yes, it is slow. In our original implementation, sampling each story takes around 30-40 seconds. I am wondering whether the time cost for the first batch includes the prefetch and HDF5 loading time; what is the time cost for the other batches? Also, in sample mode you can use DDP and increase the batch size. For acceptable sample quality, you can try setting the guidance scale to 7.5 and the number of steps to 50 with a PNDM scheduler; this greatly reduces the sampling time.
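If the pipeline is built on Hugging Face diffusers, the scheduler swap might look roughly like this (a hedged sketch; the `beta_*` values shown are the common Stable Diffusion defaults, not necessarily ARLDM's exact settings):

```python
# Hedged sketch: a PNDM scheduler with 50 inference steps via diffusers.
# The beta_* arguments are the usual Stable Diffusion defaults (assumption).
from diffusers import PNDMScheduler

scheduler = PNDMScheduler(
    beta_start=0.00085,
    beta_end=0.012,
    beta_schedule="scaled_linear",
    skip_prk_steps=True,
)
scheduler.set_timesteps(50)  # 50 denoising steps instead of the full schedule

guidance_scale = 7.5
# Inside the denoising loop, classifier-free guidance combines two predictions:
# noise_pred = noise_uncond + guidance_scale * (noise_text - noise_uncond)
```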
@Flash-321 Can I train and sample at the same time?
@pokameng Sure, you can simply modify the code in https://github.com/Flash-321/ARLDM/blob/eb907e3717ac20f82dfba8e67fd55d95127de098/main.py#L309-L311
```python
def validation_step(self, batch, batch_idx):
    # self.sample returns the ground-truth and generated images for the batch
    original_images, images = self.sample(batch)
    # requires `import torchvision` at the top of main.py
    grid = torchvision.utils.make_grid(images)
    self.logger.experiment.add_image('generated_images', grid, 0)
```
Also make sure to enable this module during training: https://github.com/Flash-321/ARLDM/blob/eb907e3717ac20f82dfba8e67fd55d95127de098/main.py#L82-L97. However, it will slow down the training process, so we recommend manually running a separate job to sample images instead.
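If you do enable it, you can keep the overhead down with standard PyTorch Lightning Trainer flags; a minimal sketch (the argument values are illustrative, not what ARLDM ships with):

```python
# Hedged sketch: run the sampling/validation step only occasionally so
# training stays fast; values below are illustrative examples.
import pytorch_lightning as pl

trainer = pl.Trainer(
    accelerator="gpu",
    devices=4,
    check_val_every_n_epoch=5,  # sample every 5 epochs instead of every epoch
    limit_val_batches=2,        # sample only a couple of batches each time
)
```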
What I mean is that I have now saved a ckpt while the training process is still running. I have opened another job to sample from the saved ckpt, but I am worried that it will be overwritten by a newer ckpt saved by the ongoing training process. @Flash-321
@pokameng It doesn't matter; once your ckpt is loaded, the sample job no longer relies on the file on disk. You can also copy it to another folder to avoid this situation.
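For example, a tiny Python sketch (the paths are hypothetical) that snapshots the checkpoint before sampling:

```python
# Copy the checkpoint out of the training run's save directory so a newer
# checkpoint cannot overwrite it mid-sampling. Paths are hypothetical examples.
import os
import shutil

os.makedirs("sample_ckpt", exist_ok=True)
shutil.copy("save_ckpt/last.ckpt", "sample_ckpt/last.ckpt")
```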
ok thanks!!!
What is the `test_model_file`?
@rehammsalah here, https://github.com/xichenpan/ARLDM/blob/main/config.yaml#L25
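It is the checkpoint path that sample mode loads. A hedged sketch of the relevant config.yaml entries (the field names come from the repo; the path value is a hypothetical example):

```yaml
# hedged sketch of config.yaml; the path below is a placeholder example
mode: sample                                    # switch from train to sample
test_model_file: /path/to/save_ckpt/last.ckpt   # checkpoint loaded for sampling
```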
How long will the sampling process take to finish? It seems very slow. Why? @Flash-321