stanfordnlp / stanza

Stanford NLP Python library for tokenization, sentence segmentation, NER, and parsing of many human languages
https://stanfordnlp.github.io/stanza/

Italian model runs out of memory on COLAB A100 gpu #1370

Closed lucaducceschi closed 2 months ago

lucaducceschi commented 3 months ago

Trying to process small texts (300-500 KB) on a 40 GB GPU on Colab raises an OutOfMemoryError; here is the log. The English model does not fail on the same text. It happens even with processors='tokenize,lemma,pos'.


OutOfMemoryError                          Traceback (most recent call last)
in ()
      1 s = open('KINGSLEY_TIASPETTOACENTRALPARL.txt').read()
----> 2 doc = nlp(s)

11 frames
/usr/local/lib/python3.10/dist-packages/torch/nn/modules/linear.py in forward(self, input)
    114
    115     def forward(self, input: Tensor) -> Tensor:
--> 116         return F.linear(input, self.weight, self.bias)
    117
    118     def extra_repr(self) -> str:

OutOfMemoryError: CUDA out of memory. Tried to allocate 9.50 GiB. GPU 0 has a total capacity of 15.77 GiB of which 5.42 GiB is free. Process 3295 has 10.35 GiB memory in use. Of the allocated memory 9.93 GiB is allocated by PyTorch, and 37.90 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
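For context, a minimal sketch of the kind of setup that hits this error; the exact pipeline construction isn't shown in the report, so the download step, language code, and use_gpu flag are assumptions:

    import stanza

    # Assumed setup mirroring the report: Italian models, GPU enabled,
    # restricted to the processors mentioned above.
    stanza.download('it')
    nlp = stanza.Pipeline('it', processors='tokenize,lemma,pos', use_gpu=True)

    s = open('KINGSLEY_TIASPETTOACENTRALPARL.txt').read()
    doc = nlp(s)  # this call raises the CUDA OutOfMemoryError shown above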
AngledLuffa commented 3 months ago

Hmm, the elided frames would tell me which particular module you were in at the time of the OOM. There's also the issue that it seems to be 16 GB instead of 40 GB:

Tried to allocate 9.50 GiB. GPU 0 has a total capacity of 15.77 GiB of which 5.42 GiB is free

but Stanza should definitely fit on that anyway. Is there a particularly long sentence or token?

lucaducceschi commented 3 months ago

Hi, thanks for the reply.

My bad: I tried with a 40 GB instance, but Colab switched me to a 16 GB one due to availability. In any case, the model should run with 16 GB as well. As for length, the longest sentence is 631 characters and the longest token is 23 characters, so I don't believe that is the issue. Also, I get the same result with other similar texts.

The tokenizer and lemmatizer work, but if I try to run the pos-tagger I get an error. Is there any other info that I can provide? The notebook? The data?

Thanks again

AngledLuffa commented 3 months ago

Sure, the data would be quite helpful. I'll take a look and see if there's something weird going on. Do you have multiple pipelines loaded at once or some other possible unexpected use case? I would think that a sentence of only 631 characters would work just fine on a 16GB GPU.

lucaducceschi commented 3 months ago

Different text, same result (but this book is not copyrighted, so I can attach it). There is only one active pipeline. Colab has been acting up lately, so it could also be a problem on their end (though I don't understand how that would affect one specific model).

pirandello_il_fu_mattia_pascal.txt

[Screenshot attached: 2024-03-22 at 17:44:28]

Here is the error message:


OutOfMemoryError                          Traceback (most recent call last)
in <cell line: 4>()
      2 s = open('archive/pirandello_il_fu_mattia_pascal.txt').read()
      3
----> 4 doc = nlp(s)

9 frames
/usr/local/lib/python3.10/dist-packages/torch/nn/utils/rnn.py in pad_packed_sequence(sequence, batch_first, padding_value, total_length)
    331         )
    332     max_seq_length = total_length
--> 333     padded_output, lengths = _VF._pad_packed_sequence(
    334         sequence.data, sequence.batch_sizes, batch_first, padding_value, max_seq_length)
    335     unsorted_indices = sequence.unsorted_indices

OutOfMemoryError: CUDA out of memory. Tried to allocate 18.10 GiB. GPU 0 has a total capacity of 15.77 GiB of which 10.72 GiB is free. Process 15062 has 5.05 GiB memory in use. Of the allocated memory 4.52 GiB is allocated by PyTorch, and 154.43 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)

AngledLuffa commented 3 months ago

It would seem the default batch size for some of the POS models was not set to something reasonable when we changed the batching scheme. I'll get to work updating the models so the defaults actually fit in a GPU; in the meantime, you can pass pos_batch_size=100 when creating the Pipeline to avoid this.
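For clarity, a minimal sketch of that workaround; the language code and processor list just mirror the report above:

    import stanza

    # Workaround: cap the POS tagger's batch size so its batches fit in GPU memory.
    nlp = stanza.Pipeline('it', processors='tokenize,lemma,pos', pos_batch_size=100)

    doc = nlp(open('archive/pirandello_il_fu_mattia_pascal.txt').read())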

lucaducceschi commented 3 months ago

Yes, that solved the issue. I feel stupid for not thinking about it, but that's a lesson learned. Thanks a lot.

AngledLuffa commented 2 months ago

This is now part of the 1.8.2 release
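With 1.8.2 or later installed, the pipeline from the earlier comments should work without the explicit pos_batch_size override; a minimal sketch, with the language code and processors mirroring the thread:

    import stanza

    print(stanza.__version__)  # expect 1.8.2 or later; upgrade with: pip install -U stanza

    # The updated default POS batch size should now fit on a 16 GB GPU,
    # so no pos_batch_size override is needed here.
    nlp = stanza.Pipeline('it', processors='tokenize,lemma,pos')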