neulab / guided_summarization

GSum: A General Framework for Guided Neural Abstractive Summarization

How large GPU is expected to train GSum model? #27

Closed yfqiu98 closed 3 years ago

yfqiu98 commented 3 years ago

Hi, I noticed in your script that you use 2 GPUs with 1024 tokens each, keeping the batch size the same as for BART. Can I ask how much memory your GPUs have? I use several 16GB GPUs, which can train BART but do not work for GSum.

Thanks

zdou0830 commented 3 years ago

Hi, we found that for the BART-based models we need GPUs with more than 32GB memory.

nargesdel commented 1 year ago

> Hi, we found that for the BART-based models we need GPUs with more than 32GB memory.

@zdou0830 Hello, thank you very much for your reply. If only GPUs with 24GB memory are available, how should we specify the number of GPUs, and how many do we need? Do we need to change MAX_TOKENS in z_train.sh? If so, what value would work?

I would greatly appreciate your response. Thanks
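A common workaround when per-GPU memory is the bottleneck (not confirmed by the authors in this thread) is to lower the per-GPU token budget and compensate with gradient accumulation, since for fairseq-style scripts the effective batch size is roughly MAX_TOKENS × num_GPUs × UPDATE_FREQ. The variable names below are assumptions based on typical BART fine-tuning scripts; check z_train.sh for the actual names and baseline values.

```shell
#!/bin/sh
# Hypothetical sketch for 24GB GPUs, assuming z_train.sh exposes
# fairseq-style MAX_TOKENS and UPDATE_FREQ settings.
MAX_TOKENS=512   # halved from the 1024 used on larger GPUs (assumed baseline)
NUM_GPUS=2
UPDATE_FREQ=8    # doubled to keep the effective batch size unchanged (assumed baseline of 4)

# Effective batch size in tokens stays constant when MAX_TOKENS is halved
# and UPDATE_FREQ is doubled: 512 * 2 * 8 = 8192 tokens per update.
echo $(( MAX_TOKENS * NUM_GPUS * UPDATE_FREQ ))
```

Note that reducing MAX_TOKENS below the maximum document length may also truncate long inputs, so memory savings here are not free; whether 512 tokens fits in 24GB for GSum's two-encoder architecture would need to be verified empirically.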