potamides / AutomaTikZ

Text-Guided Synthesis of Scientific Vector Graphics with TikZ

Strange CUDA out of memory issue #13

Open GreatGBL opened 4 months ago

GreatGBL commented 4 months ago

Strange CUDA out of memory issue; my previous program used to run, but now it fails both locally and on Colab. Code:

!pip install 'automatikz[pdf] @ git+https://github.com/potamides/AutomaTikZ'
!git clone https://github.com/potamides/AutomaTikZ
!pip install -e AutomaTikZ[webui]

from automatikz.infer import TikzGenerator, load

generate = TikzGenerator(*load("nllg/tikz-clima-7b"), stream=True)

OutOfMemoryError: CUDA out of memory. Tried to allocate 32.00 MiB. GPU
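For reference, a quick way to check how much GPU memory is actually free before loading the model (a minimal diagnostic sketch, not part of the original report):

import torch

# Free and total device memory in bytes, queried from the CUDA driver
free, total = torch.cuda.mem_get_info()
print(f"free: {free / 2**30:.2f} GiB / total: {total / 2**30:.2f} GiB")

# Detailed breakdown of PyTorch's caching allocator
print(torch.cuda.memory_summary())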

potamides commented 4 months ago

If it worked previously, then it is indeed strange. Can you try downgrading transformers to 4.28 and/or torch to 2.0?
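Pinning the suggested versions is straightforward in a Colab cell (a sketch; the exact patch releases may differ):

!pip install 'transformers==4.28.*' 'torch==2.0.*'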

GreatGBL commented 4 months ago

Thank you for your response. I have tested this, and it's not a version issue, as I successfully ran the 7B model directly.

Here's the comparison between the new versions (transformers 4.41.2 and torch 2.3.1) and the old versions (transformers 4.28 and torch 2.0):

New versions: torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 136.00 MiB. GPU

Old versions: torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 50.00 MiB (GPU 0; 15.99 GiB total capacity; 5.22 GiB already allocated; 9.48 GiB free; 5.22 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
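The hint at the end of the old trace can be tried directly. A minimal sketch of setting the allocator option before any CUDA memory is allocated (the 128 MiB split size is an arbitrary starting point, not a value from this thread):

import os

# Must be set before the first CUDA allocation, ideally before importing torch
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

from automatikz.infer import TikzGenerator, load
generate = TikzGenerator(*load("nllg/tikz-clima-7b"), stream=True)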

The 13B model is about 40 GB, so on a non-A100 GPU large parts of the model have to stay in CPU memory and be moved to the GPU in chunks. I suspect this step might be where the problem occurs.
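One way to make that split explicit is to cap how much the GPU may hold and let accelerate offload the rest. A sketch using plain transformers (the 13B model id is inferred from the 7B one, whether the CLiMA checkpoints load through AutoModelForCausalLM is an assumption, and the memory caps are illustrative):

import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "nllg/tikz-clima-13b",                      # assumed id, by analogy with nllg/tikz-clima-7b
    torch_dtype=torch.float16,                  # halve the footprint of an fp32 checkpoint
    device_map="auto",                          # let accelerate place layers on GPU and CPU
    max_memory={0: "14GiB", "cpu": "30GiB"},    # illustrative caps for a 16 GiB card
    offload_folder="offload",                   # spill weights that fit nowhere else to disk
)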

Previously, I was able to run the 13B model on a 16GB GPU. Could it have just been luck?

potamides commented 4 months ago

> Previously, I was able to run the 13B model on a 16GB GPU. Could it have just been luck?

Did you maybe load the model with device_map="auto"? That's the way we load it in the web ui.
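For what it's worth, with device_map="auto" accelerate records where each layer ended up, which makes it easy to confirm whether parts of the model were offloaded. A sketch in plain transformers (illustrative; not necessarily the exact call the web UI makes):

from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("nllg/tikz-clima-7b", device_map="auto")
# Mapping from module names to devices, e.g. {"model.embed_tokens": 0, ..., "lm_head": "cpu"}
print(model.hf_device_map)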

GreatGBL commented 4 months ago

Thank you for your reply. I have a question about the device_map="auto" setting. Specifically, should this argument be forcibly added to the relevant method of class TextGenerationPipeline(Pipeline) in myenv\lib\python3.10\site-packages\transformers\pipelines\text_generation.py?
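For a standard transformers pipeline there should be no need to edit text_generation.py in site-packages: device_map can be passed when the pipeline is constructed and is forwarded to the model's from_pretrained call. A sketch (assuming a plain text-generation pipeline, which may differ from how AutomaTikZ builds its generator):

from transformers import pipeline

# device_map is forwarded to the underlying from_pretrained call
pipe = pipeline(
    "text-generation",
    model="nllg/tikz-clima-7b",
    device_map="auto",
)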