THUDM / Inf-DiT

Official implementation of Inf-DiT: Upsampling Any-Resolution Image with Memory-Efficient Diffusion Transformer
Apache License 2.0

About speed and GPU memory? #4

Open peki12345 opened 6 months ago

peki12345 commented 6 months ago

When I super-resolve a 1024×1024 image to 4096×4096, it uses 70 GB of GPU memory and takes 18 minutes, which seems to contradict the advantages stated in the paper. Is this normal?

zdyshine commented 6 months ago

Same question: on a 3090 graphics card with a 512x512 input image, it uses approximately 22 GB of memory and inference takes 20 minutes.

yzy-thu commented 6 months ago

@peki12345 Fig. 2 shows the memory usage when the parallel size is 1. You can change the default inference_type in the script to reduce memory usage. We tested the speed and GPU usage of various models on the H800; for Inf-DiT the settings were inference_type="ar2", block_batch (parallel size) = 16, and step = 15. We will update this table in the paper later. [benchmark table attached as an image]
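To make the trade-off concrete, here is a minimal sketch (not the repository's actual code, and ignoring the autoregressive conditioning between neighbouring blocks) of what block_batch / "parallel size" controls: blocks denoised together must all sit on the GPU at once, so a larger batch raises peak memory but reduces the number of sequential diffusion passes. The helper names (upsample_blocks, denoise_step) are hypothetical.

```python
# Minimal sketch, NOT the repo's actual API: how block_batch ("parallel size")
# trades GPU memory for wall-clock time when upsampling block by block.
import torch

def upsample_blocks(blocks, block_batch=16, steps=15):
    """Denoise `blocks` (a list of image tiles) in groups of `block_batch`."""
    def denoise_step(x):
        return x  # placeholder for one forward pass of the real diffusion model

    outputs = []
    for i in range(0, len(blocks), block_batch):
        # Peak memory scales with block_batch: all tiles in the group are resident.
        group = torch.stack(blocks[i:i + block_batch])
        for _ in range(steps):            # sequential diffusion steps per group
            group = denoise_step(group)
        outputs.extend(group.unbind(0))
    return outputs

# e.g. 64 tiles with block_batch=16 -> 4 sequential groups instead of 64 single-tile passes
tiles = [torch.zeros(3, 128, 128) for _ in range(64)]
_ = upsample_blocks(tiles, block_batch=16, steps=15)
```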

yzy-thu commented 6 months ago

Inf-DiT is indeed relatively slow because it is a pixel-space diffusion model. We look forward to future work that moves it to a latent space.
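A rough back-of-the-envelope comparison of why pixel space is so much costlier than a typical latent space (assuming an 8x VAE downsampling factor as in common latent diffusion models, and an illustrative patch size of 4; both numbers are assumptions, not Inf-DiT's actual configuration):

```python
# Spatial positions the transformer must process at 4096x4096,
# pixel space vs. a hypothetical 8x-downsampled latent space.
patch = 4            # illustrative DiT patch size (assumption)
side = 4096          # target resolution

pixel_tokens = (side // patch) ** 2          # 1,048,576 tokens
latent_tokens = (side // 8 // patch) ** 2    #    16,384 tokens

print(f"pixel-space tokens : {pixel_tokens:,}")
print(f"latent-space tokens: {latent_tokens:,} ({pixel_tokens // latent_tokens}x fewer)")
```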

bruinxiong commented 5 months ago

@zdyshine Hi, did you optimize the inference code? In my case, I hit an out-of-memory error on my 4090 graphics card with a 512x512 input image.
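For anyone comparing settings (block_batch, inference_type) or debugging out-of-memory errors, a generic PyTorch way to measure peak GPU memory around an inference call is sketched below; it is not specific to Inf-DiT, and the model.sample call in the usage comment is hypothetical.

```python
# Generic peak-memory probe using standard torch.cuda APIs.
import torch

def report_peak_memory(fn, *args, **kwargs):
    """Run `fn` and print the peak CUDA memory allocated during the call."""
    torch.cuda.empty_cache()
    torch.cuda.reset_peak_memory_stats()
    result = fn(*args, **kwargs)
    peak_gib = torch.cuda.max_memory_allocated() / 1024 ** 3
    print(f"peak GPU memory: {peak_gib:.2f} GiB")
    return result

# usage (hypothetical): report_peak_memory(model.sample, lr_image, steps=15)
```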