oobabooga / text-generation-webui

A Gradio web UI for Large Language Models.
GNU Affero General Public License v3.0

RuntimeError: CUDA driver error: invalid argument #184

Closed ImpossibleExchange closed 1 year ago

ImpossibleExchange commented 1 year ago

So, it was working fine for a bit yesterday, then I perhaps upgraded to this latest push, and here is the big string of errors. Please note it might be the models I am trying to run:

```
Traceback (most recent call last):
  File "/home/user/.local/lib/python3.10/site-packages/gradio/routes.py", line 374, in run_predict
    output = await app.get_blocks().process_api(
  File "/home/user/.local/lib/python3.10/site-packages/gradio/blocks.py", line 1017, in process_api
    result = await self.call_function(
  File "/home/user/.local/lib/python3.10/site-packages/gradio/blocks.py", line 849, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "/usr/lib/python3.10/site-packages/anyio/to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/usr/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "/usr/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "/home/user/.local/lib/python3.10/site-packages/gradio/utils.py", line 453, in async_iteration
    return next(iterator)
  File "/run/media/user/Main_nvme/text-generation-webui/modules/text_generation.py", line 188, in generate_reply
    output = eval(f"shared.model.generate({', '.join(generate_params)}){cuda}")[0]
  File "<string>", line 1, in <module>
  File "/home/user/.local/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/user/.local/lib/python3.10/site-packages/transformers/generation/utils.py", line 1452, in generate
    return self.sample(
  File "/home/user/.local/lib/python3.10/site-packages/transformers/generation/utils.py", line 2521, in sample
    next_tokens.tile(eos_token_id_tensor.shape[0], 1).ne(eos_token_id_tensor.unsqueeze(1)).prod(dim=0)
RuntimeError: CUDA driver error: invalid argument

  0%|          | 0/26 [00:00<?, ?it/s]

Traceback (most recent call last):
  File "/home/user/.local/lib/python3.10/site-packages/gradio/routes.py", line 374, in run_predict
    output = await app.get_blocks().process_api(
  File "/home/user/.local/lib/python3.10/site-packages/gradio/blocks.py", line 1017, in process_api
    result = await self.call_function(
  File "/home/user/.local/lib/python3.10/site-packages/gradio/blocks.py", line 849, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "/usr/lib/python3.10/site-packages/anyio/to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/usr/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "/usr/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "/home/user/.local/lib/python3.10/site-packages/gradio/utils.py", line 453, in async_iteration
    return next(iterator)
  File "/run/media/user/Main_nvme/text-generation-webui/modules/text_generation.py", line 188, in generate_reply
    output = eval(f"shared.model.generate({', '.join(generate_params)}){cuda}")[0]
  File "<string>", line 1, in <module>
  File "/home/user/.local/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/user/.local/lib/python3.10/site-packages/transformers/generation/utils.py", line 1406, in generate
    return self.greedy_search(
  File "/home/user/.local/lib/python3.10/site-packages/transformers/generation/utils.py", line 2252, in greedy_search
    next_tokens.tile(eos_token_id_tensor.shape[0], 1).ne(eos_token_id_tensor.unsqueeze(1)).prod(dim=0)
RuntimeError: CUDA driver error: invalid argument
```
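For anyone puzzling over the failing line: both tracebacks die in transformers' end-of-sequence check, which tiles the newly sampled tokens, compares them against every EOS id, and AND-reduces with `prod(dim=0)`. A pure-Python sketch of that logic (illustrative only; the function name is made up here, and the real code does this with tensor ops on the GPU, which is why a CUDA driver mismatch surfaces at this exact spot):

```python
# Sketch of the unfinished-sequence check from transformers' sampling loop.
# tile/ne/prod over tensors reduces to: "does this token match no EOS id?"
def still_unfinished(next_tokens, eos_token_ids):
    # prod(dim=0) over the ne() matrix is an AND across all EOS ids
    return [all(tok != eos for eos in eos_token_ids) for tok in next_tokens]

print(still_unfinished([5, 2, 7], eos_token_ids=[2]))  # [True, False, True]
```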

ImpossibleExchange commented 1 year ago

Trying a fresh install to see if this clears up the issue. Will close once my models finish downloading and I can test.

ImpossibleExchange commented 1 year ago

Fresh install, new conda environment, same error occurring.

ImpossibleExchange commented 1 year ago

Okay so, will write a bit here in case anyone else runs into this issue:

Here was the problem, as far as I was able to suss it out:

System is a dual Ampere GPU system running Manjaro on the 5.15 LTS kernel (the last kernel with somewhat stable NVIDIA support).

My CUDA version was 12; it may have been upgraded without my noticing during rolling updates. This caused the text-generation web UI to stop working and throw those runtime errors.

Solution: I installed the CUDA Toolkit (11.8) and was able to get some models running again.

So, for anyone else running on a Linux system: run nvidia-smi and check your CUDA version. I noticed the mismatch, decided to try the 11.8 toolkit, and it was the solution to my problem.
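To make the check above concrete: the version you want is the "CUDA Version" field in the first banner of `nvidia-smi` output. A small helper that extracts it from that banner (the regex assumes the header format of recent driver releases; the sample line below is illustrative, not from my machine):

```python
import re

def driver_cuda_version(smi_header: str) -> str:
    """Pull the 'CUDA Version' field out of the nvidia-smi banner line."""
    m = re.search(r"CUDA Version: (\d+\.\d+)", smi_header)
    return m.group(1) if m else "unknown"

# Example banner line as nvidia-smi prints it (values are made up):
sample = "| NVIDIA-SMI 525.85.05  Driver Version: 525.85.05  CUDA Version: 12.0 |"
print(driver_cuda_version(sample))  # 12.0
```

If the version printed there is newer than what your PyTorch build targets, that mismatch is a candidate for errors like the one in this thread.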

oobabooga commented 1 year ago

Thank you for the update, @ImpossibleExchange. I had not seen this error before but I am glad that you found a solution.

ImpossibleExchange commented 1 year ago

@oobabooga No worries, thank you for all your hard work on making something nice for us to use. I might start making threads for some of the other issues I run into and find solutions for, just glad I could (maybe) help you and others out. ^__^

JasonCZH4 commented 6 months ago

Same issue. I solved it by simply upgrading my transformers version from 4.32 to 4.40. My CUDA version is 12.2 and my GPU is an A800. Hope this helps anyone who sees this issue.
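A quick way to tell whether your install predates the version that worked for me, without guessing from the changelog (the 4.40 threshold is just the version from my comment above; the helper name is made up, and a real project would use `packaging.version.parse` instead of this minimal tuple compare):

```python
# Minimal version gate: does the installed transformers predate 4.40?
def needs_upgrade(installed: str, required: str = "4.40") -> bool:
    as_tuple = lambda v: tuple(int(part) for part in v.split("."))
    return as_tuple(installed) < as_tuple(required)

print(needs_upgrade("4.32"))  # True  -> upgrade, e.g. pip install -U transformers
print(needs_upgrade("4.40"))  # False
```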