Sachin19 / mucoco

Official Code for the papers: "Controlled Text Generation as Continuous Optimization with Multiple Constraints" and "Gradient-based Constrained Sampling from LMs"
MIT License

CUDA out of memory when using a 3090 gpu #4

Closed liushz closed 1 year ago

liushz commented 1 year ago

My primary model is the formality model from Google Drive provided by the author of the paraphrase model. The issue occurred when I was running the example .sh file in the Style Transfer folder. Thanks a lot if anyone can help me fix it.

(base) root@84d353835da2:/workspace/mucoco# bash decode_example.sh data output plain debug plain
Some weights of the model checkpoint at /workspace/mucoco/primary_model were not used when initializing GPT2LMHeadModel: ['transformer.extra_embedding_project.bias', 'transformer.extra_embedding_project.weight']
- This IS expected if you are initializing GPT2LMHeadModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing GPT2LMHeadModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
batch_size is 1
skip this example? Fears for T N pension after talks . Unions representing workers at Turner Newall say they are ' disappointed ' after talks with stricken parent firm Federal Mogul . [yes(y)/maybe(m)/no(n)]n
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Fears for T N pension after talks . Unions representing workers at Turner Newall say they are ' disappointed ' after talks with stricken parent firm Federal Mogul . unions representing workers at Turner Newall have expressed disappointment after talks with the firm's parent company, Federal Mogul. tensor([[ 3260,  6130,   351, 15406,   968,   439,    11,   791,   507, 10200,
          3259,   379, 15406,   968,   439,  6241, 18641,   287,   262,  4081,
          1222,   499,   418,    26,    82,  2560,  1664,    11,  5618, 30926,
           377,    13]])
predicting a sentence length:  32
Traceback (most recent call last):
  File "/workspace/mucoco/decode.py", line 4, in <module>
    cli_main()
  File "/workspace/mucoco/mucoco/decode.py", line 821, in cli_main
    main(args)
  File "/workspace/mucoco/mucoco/decode.py", line 565, in main
    optimizer.backward(total_batchloss, retain_graph=True, scaler=scaler) 
  File "/workspace/mucoco/mucoco/utils/optim.py", line 360, in backward
    loss.backward(retain_graph=retain_graph)
  File "/opt/conda/lib/python3.9/site-packages/torch/_tensor.py", line 396, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
  File "/opt/conda/lib/python3.9/site-packages/torch/autograd/__init__.py", line 173, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
RuntimeError: CUDA out of memory. Tried to allocate 7.67 GiB (GPU 0; 23.70 GiB total capacity; 15.35 GiB already allocated; 3.61 GiB free; 19.13 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
liushz commented 1 year ago

I want to know whether a single 3090 GPU is enough for this model or not; in the paper I see that a 2080 Ti is enough.

Sachin19 commented 1 year ago

Hi,

For the experiments in the paper, I generated sequences of max length 20, and that fits fine on a 2080 Ti. If you are generating longer sequences, it might run out of memory. Also, what model size are you using?
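As a side note, the `RuntimeError` above suggests one standard PyTorch-level mitigation: setting `max_split_size_mb` in `PYTORCH_CUDA_ALLOC_CONF` to reduce allocator fragmentation. A minimal sketch, assuming the same `decode_example.sh` invocation as in the report; the value 128 is an illustrative guess, not a tuned setting for this model:

```shell
# PYTORCH_CUDA_ALLOC_CONF is a standard PyTorch option (not mucoco-specific);
# max_split_size_mb caps the size of allocator block splits to reduce
# fragmentation when reserved memory is much larger than allocated memory.
export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128
echo "$PYTORCH_CUDA_ALLOC_CONF"
# Then rerun the example, e.g.:
#   bash decode_example.sh data output plain debug plain
```

This helps only with fragmentation; if the model plus activations genuinely exceed 24 GiB (e.g. because of a long generated sequence), the fix is to shorten the sequence or use a smaller model.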