CUDA out of memory

Hello,

I was trying to rerun the pretrained models and I got this traceback:

Running pipeline on dev set.
Data directory already exists. Skip download.
Model rationale_roberta_large_scifact already exists. Skip download.
Retrieving oracle abstracts.
Selecting rationales.
Using device "cuda"
Traceback (most recent call last):
  File "verisci/inference/rationale_selection/transformer.py", line 30, in <module>
    model = AutoModelForSequenceClassification.from_pretrained(args.model).to(device).eval()
  ...........
    return t.to(device, dtype if t.is_floating_point() else None, non_blocking)
RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 3.95 GiB total capacity; 786.99 MiB already allocated; 14.50 MiB free; 792.00 MiB reserved in total by PyTorch)

It ran perfectly a week ago, and I have not changed anything since. Could running the models' inference many times cause a problem?

I tried various solutions I found online, such as torch.cuda.empty_cache() and torch.utils.checkpoint, but I get the same error. I am using torch 1.5.0, and the T4s on the cluster I am using have 16 GB of memory.
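The traceback reports 3.95 GiB total capacity on GPU 0, which does not match a 16 GB T4, so it may be worth confirming which device PyTorch actually sees. A minimal sketch, assuming a hypothetical helper name `describe_cuda_devices` and using only standard `torch.cuda` calls:

```python
try:
    import torch
except ImportError:  # lets the sketch run on a machine without PyTorch
    torch = None


def describe_cuda_devices():
    """Return (index, name, total memory in GiB) for each visible CUDA device."""
    if torch is None or not torch.cuda.is_available():
        return []
    devices = []
    for index in range(torch.cuda.device_count()):
        props = torch.cuda.get_device_properties(index)
        devices.append((index, props.name, props.total_memory / 1024 ** 3))
    return devices


for index, name, total_gib in describe_cuda_devices():
    print(f"GPU {index}: {name}, {total_gib:.2f} GiB total")
```

If the name printed is not a T4, the job is probably running on a different card than expected, for example a login node's GPU rather than a cluster node's.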

Any help would be more than appreciated. Thank you very much for your time.

allenai / scifact

cuda out of memory #15