mberr / torch-max-mem

Decorators for maximizing memory utilization with PyTorch & CUDA
https://torch-max-mem.readthedocs.io/en/latest/
MIT License

Extend CUDA OOM Handling #11

Closed mberr closed 1 year ago

mberr commented 1 year ago

This PR treats a few additional errors as `torch.cuda.OutOfMemoryError`, cf. https://github.com/pykeen/pykeen/pull/279

- `cuDNN error: CUDNN_STATUS_NOT_SUPPORTED. This error may appear if you passed in a non-contiguous input.`

  cf. https://discuss.pytorch.org/t/cudnn-status-not-supported-this-error-may-appear-if-you-passed-in-a-non-contiguous-input/

- `CUDA out of memory.`

  (before torch 2.0, OOM errors were raised as plain `RuntimeError`s with this message)

- `nonzero is not supported for tensors with more than INT_MAX elements`

  cf. https://github.com/pytorch/pytorch/issues/51871
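The idea can be sketched as substring matching on the raised error's message. This is a minimal illustration, not the library's actual implementation; the function name `is_oom_error` and the exact fragment list are assumptions for the sketch.

```python
# Hedged sketch: decide whether a caught exception should be treated like a
# CUDA out-of-memory error, by matching known message fragments.
# The fragment list mirrors the cases listed in this PR description.
OOM_MESSAGE_FRAGMENTS = (
    # cuDNN can raise this for (too) large inputs; cf. the discuss.pytorch.org thread
    "CUDNN_STATUS_NOT_SUPPORTED",
    # pre-2.0 PyTorch raised OOM as a plain RuntimeError with this message
    "CUDA out of memory.",
    # nonzero overflows INT_MAX on very large tensors; cf. pytorch/pytorch#51871
    "nonzero is not supported for tensors with more than INT_MAX elements",
)


def is_oom_error(error: BaseException) -> bool:
    """Return whether the given exception should be handled as CUDA OOM."""
    # torch.cuda.OutOfMemoryError subclasses RuntimeError in torch >= 2.0,
    # so checking RuntimeError covers both the old and the new error type.
    return isinstance(error, RuntimeError) and any(
        fragment in str(error) for fragment in OOM_MESSAGE_FRAGMENTS
    )
```

A caller could then catch `RuntimeError`, check `is_oom_error`, and retry with a smaller batch size instead of crashing; unrelated `RuntimeError`s would still propagate.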