Closed: creisle closed this issue 3 years ago
Training the verdict prediction model with the default parameters requires substantial GPU memory. The baseline was trained on a Quadro RTX 8000 with 48 GB of memory, of which about 38-40 GB were actually used. However, you could try reducing the batch size and increasing `gradient_accumulation_steps` accordingly, so the effective batch size stays the same.
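As a rough illustration (not code from this repo), the trade-off behind that suggestion can be sketched in plain Python: averaging the mean gradients of several small micro-batches gives the same result as one large batch, for losses that average over examples, so memory drops while the update is unchanged. All names and numbers below are made up for the sketch.

```python
def grad(w, x, y):
    # Gradient of the squared error 0.5 * (w*x - y)**2 w.r.t. w.
    return (w * x - y) * x

def full_batch_grad(w, xs, ys):
    # Mean gradient over one large batch.
    return sum(grad(w, x, y) for x, y in zip(xs, ys)) / len(xs)

def accumulated_grad(w, xs, ys, micro_batch_size):
    # Gradient accumulation: average the mean gradients of equally
    # sized micro-batches before taking one optimizer step.
    steps = len(xs) // micro_batch_size
    total = 0.0
    for i in range(steps):
        lo, hi = i * micro_batch_size, (i + 1) * micro_batch_size
        total += full_batch_grad(w, xs[lo:hi], ys[lo:hi])
    return total / steps

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
w = 0.5

# Batch size 4 vs. batch size 1 with 4 accumulation steps: same update.
assert abs(full_batch_grad(w, xs, ys) - accumulated_grad(w, xs, ys, 1)) < 1e-12
```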
Thanks for getting back to me on this! I actually got around it eventually, but I had to update transformers to v4.11 so I could use the gradient checkpointing option. After that I was able to train with under 24 GB of GPU memory.
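For anyone else hitting this: a minimal sketch of the workaround, assuming transformers >= 4.11 (where `gradient_checkpointing_enable()` and the `gradient_checkpointing` training argument were introduced). The model name and batch sizes here are placeholders, not the repo's actual settings.

```python
# Config fragment, not a full training script. Gradient checkpointing
# recomputes activations during the backward pass, trading extra
# compute for a large reduction in activation memory.
from transformers import AutoModelForSequenceClassification, TrainingArguments

# Placeholder checkpoint; substitute the model used by the repo.
model = AutoModelForSequenceClassification.from_pretrained("roberta-large")
model.gradient_checkpointing_enable()  # requires transformers >= 4.11

args = TrainingArguments(
    output_dir="out",
    per_device_train_batch_size=2,  # smaller micro-batch to fit in memory
    gradient_accumulation_steps=8,  # keeps the effective batch size at 16
)
```

Combining checkpointing with a smaller per-device batch and more accumulation steps is what brought memory below 24 GB here.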
@Raldir I am trying to run fine-tuning for the verdict prediction model, but I keep running into CUDA out-of-memory errors. Do you remember what hardware specifications were required when you ran this?