This error is due to an overflow in Softmax.
You could first check whether the downloaded model is complete. If this error still occurs, consider doing a greedy search during generation or lowering the temperature.
Doing a greedy search: https://github.com/PKU-YuanGroup/Chat-UniVi/blob/910a9f60ef839c959dbd46b044c33726f21b32da/ChatUniVi/demo.py#L99
do_sample=False
or lowering the temperature: https://github.com/PKU-YuanGroup/Chat-UniVi/blob/910a9f60ef839c959dbd46b044c33726f21b32da/ChatUniVi/demo.py#L87
temperature = 0.1
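For reference, here is a minimal sketch of what those two settings look like in a Hugging Face-style generate() call. The names model and input_ids are assumed to be whatever demo.py already builds, and the real call also passes the image tensors:

```python
# Sketch only: generic transformers-style generation arguments, assuming
# `model` and `input_ids` are already built as in ChatUniVi/demo.py.

# Option 1: greedy search. With do_sample=False, generation takes the argmax
# token and never calls torch.multinomial, so this error cannot be raised there.
output_ids = model.generate(input_ids, do_sample=False, max_new_tokens=512)

# Option 2: keep sampling, but with the lower temperature suggested above.
output_ids = model.generate(input_ids, do_sample=True, temperature=0.1, max_new_tokens=512)
```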
Overflow is also related to the model precision (the float16 weights). If the above methods do not work, you could consider loading the model in bfloat16.
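As a sketch of that last point, assuming the checkpoint is loaded through a transformers-style from_pretrained (ChatUniVi loads its weights through its own builder, so the exact place to change the dtype may differ; the path below is a placeholder):

```python
import torch
from transformers import AutoModelForCausalLM

# Sketch: load the weights in bfloat16 instead of float16. bfloat16 has the
# same exponent range as float32, so intermediate activations (including the
# softmax inputs) are much less likely to overflow to inf than in float16.
model = AutoModelForCausalLM.from_pretrained(
    "path/to/Chat-UniVi-7B",           # placeholder: your local checkpoint path
    torch_dtype=torch.bfloat16,        # instead of torch.float16
    low_cpu_mem_usage=True,
).cuda()
```

Note that bfloat16 needs hardware support (e.g. NVIDIA Ampere or newer GPUs) to run efficiently.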
Thanks, the above methods work for me.
uvicorn main_demo_7B:app --host 0.0.0.0 --port 9999
Runtime environment:
python -c 'import torch; print(torch.version.cuda)'
12.1

Traceback (most recent call last):
  File "/data/miniconda3/envs/chatunivi/lib/python3.10/site-packages/gradio/routes.py", line 437, in run_predict
    output = await app.get_blocks().process_api(
  File "/data/miniconda3/envs/chatunivi/lib/python3.10/site-packages/gradio/blocks.py", line 1352, in process_api
    result = await self.call_function(
  File "/data/miniconda3/envs/chatunivi/lib/python3.10/site-packages/gradio/blocks.py", line 1077, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "/data/miniconda3/envs/chatunivi/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "/data/miniconda3/envs/chatunivi/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2177, in run_sync_in_worker_thread
    return await future
  File "/data/miniconda3/envs/chatunivi/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 859, in run
    result = context.run(func, *args)
  File "/data/Chat-UniVi/main_demo_7B.py", line 76, in generate
    text_en_out, state_ = handler.generate(images_tensor, text_en_in, first_run=first_run, state=state_)
  File "/data/miniconda3/envs/chatunivi/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/data/Chat-UniVi/ChatUniVi/demo.py", line 96, in generate
    output_ids = model.generate(
  File "/data/miniconda3/envs/chatunivi/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/data/miniconda3/envs/chatunivi/lib/python3.10/site-packages/transformers/generation/utils.py", line 1588, in generate
    return self.sample(
  File "/data/miniconda3/envs/chatunivi/lib/python3.10/site-packages/transformers/generation/utils.py", line 2678, in sample
    next_tokens = torch.multinomial(probs, num_samples=1).squeeze(1)
RuntimeError: probability tensor contains either `inf`, `nan` or element < 0
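For what it's worth, this traceback is consistent with the softmax-overflow explanation above: once a value in the float16 forward pass overflows to inf, the resulting probabilities become NaN and torch.multinomial raises exactly this error. A tiny standalone illustration (not ChatUniVi code):

```python
import torch

# A row of logits in which one entry has overflowed to inf,
# as can happen in a float16 forward pass.
logits = torch.tensor([[1.0, float("inf"), 2.0]])

# Softmax over a row containing inf yields NaN probabilities.
probs = torch.softmax(logits, dim=-1)
print(probs)  # tensor([[nan, nan, nan]])

# Sampling from them raises:
#   RuntimeError: probability tensor contains either `inf`, `nan` or element < 0
torch.multinomial(probs, num_samples=1)
```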