Hi, thanks for your great work.
I am running the dialogue-related experiments. I couldn't find a time estimate for the dialogue tasks (there are descriptions and issues about QG and QQP, but not dialogue). Additionally, it seems the model needs to be trained for 140k steps, which could take 5+ days even on around 4 A100 GPUs.
Could you share more details about your GPU settings and running times on the different tasks? I suspect this is something we should optimize.
Thanks
Hi,
The time you estimate is close to ours, with 4 80GB A100 GPUs. Using FP16 could reduce training time (we didn't implement this in the current version of the code).
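Since FP16 isn't implemented in the released code, one option is to wrap the existing training step with PyTorch's automatic mixed precision. The sketch below is a minimal, hypothetical example and not the repo's actual training loop: the model, optimizer, and loss here are placeholders, and you'd apply `autocast`/`GradScaler` around your own forward/backward pass.

```python
# Minimal sketch of FP16 mixed-precision training with PyTorch AMP.
# All names (model, optimizer, loss_fn) are placeholders, not the repo's code.
import torch

def train_step(model, optimizer, scaler, batch, targets, loss_fn, use_amp):
    optimizer.zero_grad()
    # Forward pass runs selected ops in FP16 when AMP is enabled.
    with torch.cuda.amp.autocast(enabled=use_amp):
        loss = loss_fn(model(batch), targets)
    # Scale the loss to avoid FP16 gradient underflow; scaler unscales on step.
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
    return loss.item()

# Toy usage (runs on CPU with AMP disabled; set use_amp=True on CUDA):
use_amp = torch.cuda.is_available()
model = torch.nn.Linear(16, 4)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scaler = torch.cuda.amp.GradScaler(enabled=use_amp)
loss = train_step(model, optimizer, scaler,
                  torch.randn(8, 16), torch.randn(8, 4),
                  torch.nn.functional.mse_loss, use_amp)
```

With `enabled=False`, `autocast` and `GradScaler` are no-ops, so the same step function works unchanged on hardware without AMP support.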