louaaron / Score-Entropy-Discrete-Diffusion

[ICML 2024 Best Paper] Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution (https://arxiv.org/abs/2310.16834)
https://aaronlou.com/blog/2024/discrete-diffusion/
MIT License

Training Time #7

Open Henry839 opened 4 months ago

Henry839 commented 4 months ago

Dear authors, could you please share the training time of SEDD on 8 A100 GPUs? Thanks so much.

Vishal-S-P commented 3 months ago

Hi @Henry839, did you figure out the training time, or the time per 50 steps?

huanranchen commented 1 month ago

Hi @Henry839, did you figure out the training time, or the time per 50 steps?

xinlong-yang commented 6 days ago

Hi @Henry839, did you figure out the training time, or the time per 50 steps?

On a 40 GB A100, it takes about 10 s per 50 steps with batch_size = 16.
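
For anyone who wants to turn that figure into a rough wall-clock estimate, here is a minimal sketch of the arithmetic. It assumes the rate stays constant at 10 s per 50 steps for the whole run. The total step count (`hypothetical_total_steps`) is a placeholder I made up for illustration, not a number from the paper or the repo configs, and the thread does not say whether batch_size = 16 is per GPU or global, so the samples-per-second value only describes whatever that batch of 16 refers to.

```python
# Rough training-time arithmetic from the throughput reported above:
# about 10 s per 50 steps with batch_size = 16 on a 40 GB A100.
# Assumption: the step rate stays constant for the whole run.


def seconds_per_step(seconds_per_window: float = 10.0,
                     steps_per_window: int = 50) -> float:
    """Average wall-clock seconds for one optimizer step."""
    return seconds_per_window / steps_per_window


def samples_per_second(batch_size: int = 16,
                       seconds_per_window: float = 10.0,
                       steps_per_window: int = 50) -> float:
    """Training samples processed per second at the reported rate."""
    return batch_size / seconds_per_step(seconds_per_window, steps_per_window)


def estimate_training_hours(total_steps: int,
                            seconds_per_window: float = 10.0,
                            steps_per_window: int = 50) -> float:
    """Estimated wall-clock hours for `total_steps` steps."""
    total_seconds = total_steps * seconds_per_step(seconds_per_window,
                                                   steps_per_window)
    return total_seconds / 3600.0


if __name__ == "__main__":
    # Hypothetical step count, chosen only to illustrate the calculation.
    hypothetical_total_steps = 100_000
    print(f"{seconds_per_step():.2f} s/step")
    print(f"{samples_per_second():.1f} samples/s")
    print(f"{estimate_training_hours(hypothetical_total_steps):.2f} hours "
          f"for {hypothetical_total_steps} steps")
```

Substitute the step count from your own training config, and re-measure the time per 50 steps on your hardware, since a different GPU count or batch size will change the rate.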