hipster-philology / nlp-pie-taggers

Extension for pie to include taggers with their models and pre/postprocessors
Mozilla Public License 2.0

CUDA device, out of memory, how to avoid ? #39

Closed: glorieux-f closed this issue 2 years ago

glorieux-f commented 2 years ago

For a quite large Greek corpus (Galen), I encounter an out-of-memory error from PyTorch on some chapters (just a few, not all). There is an easy workaround: do as much as possible with the cuda option, and finish the failing chapters with cpu. But what could be done to handle this more directly?
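For reference, a minimal sketch of that cuda-then-cpu workaround, assuming the grc model and pie_extended's get_tagger / tag_str API (the get_iterator_and_processor import path and the tag_chapter helper are illustrative, not the library's official recipe):

```python
# Sketch of the workaround described above: tag on the GPU, and fall
# back to the CPU for the few chapters that exhaust CUDA memory.
from pie_extended.cli.utils import get_tagger
from pie_extended.models.grc.imports import get_iterator_and_processor  # assumed path

gpu_tagger = get_tagger("grc", batch_size=16, device="cuda")
cpu_tagger = get_tagger("grc", batch_size=16, device="cpu")

def tag_chapter(text: str):
    iterator, processor = get_iterator_and_processor()
    try:
        return gpu_tagger.tag_str(text, iterator=iterator, processor=processor)
    except RuntimeError as exc:  # CUDA OOM surfaces as a RuntimeError
        if "out of memory" not in str(exc):
            raise
        # Rebuild the iterator/processor before retrying on CPU
        iterator, processor = get_iterator_and_processor()
        return cpu_tagger.tag_str(text, iterator=iterator, processor=processor)
```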

The pre-tokenized, verticalized text is attached: tlg0057.tlg077.1st1K-grc1.3.2.txt. Reading the code, it seems that a double line break (\n\n) is used as the sentence separator, and no sentence seems too huge.
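(A quick illustrative check of that reading, splitting the attached file on blank lines and reporting the longest block:)

```python
# Illustrative check: longest sentence (in lines/tokens) when splitting
# the attached verticalized file on the blank-line separator.
text = open("tlg0057.tlg077.1st1K-grc1.3.2.txt", encoding="utf-8").read()
print(max(len(block.splitlines()) for block in text.split("\n\n") if block.strip()))
```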

Thanks.

The exception

File "/usr/local/lib/python3.9/dist-packages/pie_extended/tagger.py", line 41, in tag_str return list(self.iter_tag_token(data, iterator, processor=processor, no_tokenizer=no_tokenizer)) File "/usr/local/lib/python3.9/dist-packages/pie_extended/tagger.py", line 58, in iter_tag_token tagged, tasks = self.tag( File "/usr/local/lib/python3.9/dist-packages/pie/tagger.py", line 132, in tag preds = model.predict(inp, *tasks, *kwargs) File "/usr/local/lib/python3.9/dist-packages/pie/models/model.py", line 314, in predict hyps, prob = decoder.predict_max( File "/usr/local/lib/python3.9/dist-packages/pie/models/decoder.py", line 358, in predict_max outs, hidden = self.rnn(emb, hidden) File "/usr/local/lib/python3.9/dist-packages/torch/nn/modules/module.py", line 727, in _call_impl result = self.forward(input, **kwargs) File "/usr/local/lib/python3.9/dist-packages/torch/nn/modules/rnn.py", line 581, in forward result = _VF.lstm(input, hx, self._flat_weights, self.bias, self.num_layers, RuntimeError: CUDA out of memory. Tried to allocate 444.00 MiB (GPU 0; 3.95 GiB total capacity; 1.90 GiB already allocated; 319.44 MiB free; 2.19 GiB reserved in total by PyTorch)

PonteIneptique commented 2 years ago

This issue might lie in the batch_size (the number of sentences tagged at the same time): there may be a few batches where the overall sentence length is too big. Given the amount of memory that is missing, halving the batch_size should suffice :)

If you use get_tagger (https://github.com/hipster-philology/nlp-pie-taggers/blob/735ea9157f81191d5cc75912cc13f09db5636bfb/pie_extended/cli/utils.py#L71), try 8.

If you use ExtensibleTagger directly, it has a different default of 100; I'd try 64 in that case.
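A sketch of where the parameter lives in both cases (assuming ExtensibleTagger accepts batch_size at construction, as its default of 100 suggests; "grc" stands in for whatever model you load):

```python
from pie_extended.cli.utils import get_tagger
from pie_extended.tagger import ExtensibleTagger

# Via the CLI helper: half the default of 16
tagger = get_tagger("grc", batch_size=8, device="cuda")

# Via ExtensibleTagger directly: down from its default of 100
tagger = ExtensibleTagger(device="cuda", batch_size=64)
```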

glorieux-f commented 2 years ago

Confirmed, and reproducible; the issue can be closed. I had 256, which was too much; it works with 64. There is probably a sweet spot after which bigger is not faster, but it could differ across machines, so... 64 is good for me.
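For anyone looking for that sweet spot on their own machine, a throwaway timing loop along these lines would do (names follow the sketches above and are illustrative):

```python
import time
from pie_extended.cli.utils import get_tagger
from pie_extended.models.grc.imports import get_iterator_and_processor  # assumed path

sample = open("tlg0057.tlg077.1st1K-grc1.3.2.txt", encoding="utf-8").read()

for batch_size in (16, 32, 64, 128, 256):
    tagger = get_tagger("grc", batch_size=batch_size, device="cuda")
    iterator, processor = get_iterator_and_processor()
    start = time.perf_counter()
    try:
        tagger.tag_str(sample, iterator=iterator, processor=processor)
        print(batch_size, "->", round(time.perf_counter() - start, 1), "s")
    except RuntimeError:  # CUDA OOM
        print(batch_size, "-> out of memory")
        break
```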