lyutyuh / ASP

PyTorch implementation and pre-trained models for ASP - Autoregressive Structured Prediction with Language Models, EMNLP 22. https://arxiv.org/pdf/2210.14698.pdf
MIT License
GPU Memory #7

Open sherylxun opened 1 year ago

sherylxun commented 1 year ago

How much GPU memory is required to train the model? If there is not enough GPU memory, which settings can be adjusted to reduce usage? (batch_size is already set to 1.)

Niklss commented 1 year ago

I've been able to start training the t5_large-based ERE model with the flant5_large_conll04 config. The only change I made was setting use_amp to false, because bf16 does not work on a V100. Training requires 25870 MiB of GPU memory.
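To put the 25870 MiB figure in context, here is a rough back-of-the-envelope sketch in Python. It is not taken from the ASP code. It assumes full fp32 training (consistent with use_amp set to false), an Adam-style optimizer that keeps two extra states per parameter, and roughly 770M parameters for a T5-large model; all three are assumptions, and the helper names `training_state_mib` and `fits` are hypothetical. Activations and CUDA overhead are not modelled, so the estimate is only a lower bound.

```python
# Rough lower bound on GPU memory for fp32 training with an Adam-style
# optimizer. Activations, the CUDA context and allocator overhead are NOT
# included, so real usage will be higher than this estimate.

MIB = 1024 ** 2


def training_state_mib(num_params: int,
                       bytes_per_value: int = 4,
                       optimizer_states_per_param: int = 2) -> float:
    """Memory for weights + gradients + optimizer states, in MiB."""
    # One copy for weights, one for gradients, plus the optimizer states
    # (two for Adam-style optimizers: first and second moment).
    copies = 1 + 1 + optimizer_states_per_param
    return num_params * bytes_per_value * copies / MIB


def fits(required_mib: float, gpu_mib: float) -> bool:
    """True if the required memory does not exceed the GPU capacity."""
    return required_mib <= gpu_mib


if __name__ == "__main__":
    ASSUMED_PARAMS = 770_000_000  # assumption: approximate T5-large size
    REPORTED_MIB = 25870          # figure reported in this thread

    state = training_state_mib(ASSUMED_PARAMS)
    print(f"estimated weights + grads + optimizer: {state:.0f} MiB")
    print(f"reported total: {REPORTED_MIB} MiB ({REPORTED_MIB / 1024:.1f} GiB)")

    # Compare the reported figure against two common card sizes.
    for name, gib in [("16 GiB card", 16), ("32 GiB card", 32)]:
        verdict = "fits" if fits(REPORTED_MIB, gib * 1024) else "does not fit"
        print(f"{name}: {verdict}")
```

Under these assumptions, the reported usage is more than a 16 GiB card can hold, so this configuration likely needs a card with more memory than that.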