CompVis / latent-diffusion

High-Resolution Image Synthesis with Latent Diffusion Models
MIT License

Training speed. #373

Open andreYoo opened 3 months ago

andreYoo commented 3 months ago

I am using two RTX A6000 GPUs.

When I train the autoencoder models on the ImageNet dataset, it shows 2.14s/it.

Is this normal?

hahamidi commented 2 months ago

As I remember, "s/it" stands for seconds per iteration, and one iteration is one batch, so the number depends heavily on your batch size. With a standard batch size, yes, this duration is typical. It also depends a lot on the number of workers in your dataloader: with too few workers, the GPUs sit idle waiting for data.
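
Since a raw s/it figure only means something once the batch size is known, it can help to convert it into images per second and an estimated time per epoch. The snippet below is a rough sketch, not anything taken from the repository: the function names are made up for illustration, the per-GPU batch size of 12 is a placeholder (the thread does not state the actual value), and it assumes plain data-parallel training where each iteration processes one batch on every GPU. The ImageNet-1k training set size (1,281,167 images) is the standard figure.

```python
import math

# ImageNet-1k training split size.
IMAGENET_TRAIN_SIZE = 1_281_167


def images_per_second(sec_per_it, batch_size_per_gpu, num_gpus=1):
    """Convert a seconds-per-iteration reading into overall throughput.

    Assumes every iteration processes one batch on each GPU.
    """
    return batch_size_per_gpu * num_gpus / sec_per_it


def hours_per_epoch(sec_per_it, batch_size_per_gpu, dataset_size, num_gpus=1):
    """Estimate wall-clock hours for one pass over the dataset."""
    iterations = math.ceil(dataset_size / (batch_size_per_gpu * num_gpus))
    return iterations * sec_per_it / 3600


if __name__ == "__main__":
    # 2.14 s/it and two GPUs come from the question above;
    # batch_size_per_gpu=12 is a hypothetical value, replace it with yours.
    rate = images_per_second(2.14, batch_size_per_gpu=12, num_gpus=2)
    hours = hours_per_epoch(2.14, 12, IMAGENET_TRAIN_SIZE, num_gpus=2)
    print(f"{rate:.1f} images/s, about {hours:.1f} h per epoch")
```

Comparing images per second rather than s/it makes runs with different batch sizes or GPU counts directly comparable. If that number barely changes when you raise the dataloader worker count, data loading is probably not the bottleneck.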