Trying a fresh install to see if this clears up the issue. Will close once my models download and I can test.
Fresh install, new conda environment, same error occurring.
Okay, so I will write a bit here in case anyone else runs into this issue.
Here was the problem, as far as I was able to suss it out:
The system is a dual Ampere GPU machine running Manjaro on the 5.15 LTS kernel (the last kernel with somewhat stable NVIDIA support).
My CUDA version was 12; it may have been upgraded without me noticing during rolling updates. This caused text-generation-webui to stop working and throw those runtime errors.
Solution: I installed the CUDA Toolkit (11.8) and was able to get some models running again.
So, anyone else running a Linux system: run nvidia-smi and check your CUDA version. I noticed the version mismatch, decided to try the 11.8 toolkit, and that turned out to be the solution to my problem.
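For a quick sanity check from inside the Python environment, here is a minimal sketch (assuming a standard PyTorch install) that prints the CUDA runtime PyTorch was built against, so you can compare it with the "CUDA Version" shown by nvidia-smi:

```python
# Minimal sketch: show which CUDA runtime this PyTorch build expects and
# whether the driver currently exposes a usable GPU. Compare the printed
# CUDA value against what `nvidia-smi` reports.
import torch

print("PyTorch version:          ", torch.__version__)
print("Built against CUDA:       ", torch.version.cuda)
print("torch.cuda.is_available():", torch.cuda.is_available())

if torch.cuda.is_available():
    for i in range(torch.cuda.device_count()):
        print(f"GPU {i}: {torch.cuda.get_device_name(i)}")
```

If the runtime PyTorch was built for and the driver's reported CUDA version disagree badly, that is the kind of mismatch that bit me here.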
Thank you for the update, @ImpossibleExchange. I had not seen this error before but I am glad that you found a solution.
@oobabooga No worries, and thank you for all your hard work on making something nice for us to use. I might start making threads for some of the other issues I run into and find solutions for; just glad I could (maybe) help you and others out. ^__^
Same issue. I solved it by simply upgrading my transformers version from 4.32 to 4.40. My CUDA version is 12.2 and my GPU is an A800. Hope this helps anyone who sees this issue.
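If you want to check the installed version before upgrading, here is a minimal sketch (assuming `packaging` is importable, which it normally is as a regular transformers dependency):

```python
# Minimal sketch: warn if the installed transformers is older than 4.40,
# the version that resolved the error in my setup.
from packaging import version
import transformers

installed = version.parse(transformers.__version__)
if installed < version.parse("4.40.0"):
    print(f"transformers {transformers.__version__} is older than 4.40; "
          "consider: pip install --upgrade 'transformers>=4.40'")
else:
    print(f"transformers {transformers.__version__} should be recent enough")
```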
So, it was working fine for a bit yesterday; then, perhaps after upgrading to this latest push, I get this big string of errors. Please note it might be the models I am trying to run:
Traceback (most recent call last):
File "/home/user/.local/lib/python3.10/site-packages/gradio/routes.py", line 374, in run_predict
output = await app.get_blocks().process_api(
File "/home/user/.local/lib/python3.10/site-packages/gradio/blocks.py", line 1017, in process_api
result = await self.call_function(
File "/home/user/.local/lib/python3.10/site-packages/gradio/blocks.py", line 849, in call_function
prediction = await anyio.to_thread.run_sync(
File "/usr/lib/python3.10/site-packages/anyio/to_thread.py", line 31, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "/usr/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
return await future
File "/usr/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 867, in run
result = context.run(func, *args)
File "/home/user/.local/lib/python3.10/site-packages/gradio/utils.py", line 453, in async_iteration
return next(iterator)
File "/run/media/user/Main_nvme/text-generation-webui/modules/text_generation.py", line 188, in generate_reply
output = eval(f"shared.model.generate({', '.join(generate_params)}){cuda}")[0]
File "<string>", line 1, in <module>
File "/home/user/.local/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/home/user/.local/lib/python3.10/site-packages/transformers/generation/utils.py", line 1452, in generate
return self.sample(
File "/home/user/.local/lib/python3.10/site-packages/transformers/generation/utils.py", line 2521, in sample
next_tokens.tile(eos_token_id_tensor.shape[0], 1).ne(eos_token_id_tensor.unsqueeze(1)).prod(dim=0)
RuntimeError: CUDA driver error: invalid argument
0%| | 0/26 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/home/user/.local/lib/python3.10/site-packages/gradio/routes.py", line 374, in run_predict
output = await app.get_blocks().process_api(
File "/home/user/.local/lib/python3.10/site-packages/gradio/blocks.py", line 1017, in process_api
result = await self.call_function(
File "/home/user/.local/lib/python3.10/site-packages/gradio/blocks.py", line 849, in call_function
prediction = await anyio.to_thread.run_sync(
File "/usr/lib/python3.10/site-packages/anyio/to_thread.py", line 31, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "/usr/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
return await future
File "/usr/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 867, in run
result = context.run(func, *args)
File "/home/user/.local/lib/python3.10/site-packages/gradio/utils.py", line 453, in async_iteration
return next(iterator)
File "/run/media/user/Main_nvme/text-generation-webui/modules/text_generation.py", line 188, in generate_reply
output = eval(f"shared.model.generate({', '.join(generate_params)}){cuda}")[0]
File "", line 1, in
File "/home/user/.local/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func( args, kwargs)
File "/home/user/.local/lib/python3.10/site-packages/transformers/generation/utils.py", line 1406, in generate
return self.greedy_search(
File "/home/user/.local/lib/python3.10/site-packages/transformers/generation/utils.py", line 2252, in greedy_search
next_tokens.tile(eos_token_id_tensor.shape[0], 1).ne(eos_token_id_tensor.unsqueeze(1)).prod(dim=0)
RuntimeError: CUDA driver error: invalid argument
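For context, the line that blows up in both tracebacks is the end-of-sequence check inside transformers' generation loop, not anything model-specific. Here is a minimal CPU-only sketch of what that tensor expression computes, using toy values (the real code runs it on the GPU every step, which is where the CUDA driver error surfaces):

```python
# Minimal CPU-only sketch of the failing expression from
# transformers/generation/utils.py: for each sequence in the batch, check
# whether the newly generated token matches any EOS token id. Toy values.
import torch

next_tokens = torch.tensor([5, 2, 9])        # one freshly sampled token per sequence
eos_token_id_tensor = torch.tensor([2, 7])   # the possible EOS token ids

# tile -> (num_eos_ids, batch), compare against each EOS id, then prod(dim=0)
# collapses to 0 wherever the new token equals any EOS id.
still_unfinished = (
    next_tokens.tile(eos_token_id_tensor.shape[0], 1)
    .ne(eos_token_id_tensor.unsqueeze(1))
    .prod(dim=0)
)
print(still_unfinished)  # 0/False at index 1: that sequence just produced an EOS token
```

The math itself is trivial, which fits the thread's conclusion that the environment (CUDA toolkit/driver mismatch, or an old transformers build) was the real problem rather than the models.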