haotian-liu / LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
https://llava.hliu.cc
Apache License 2.0
20.24k stars, 2.24k forks

NETWORK ERROR DUE TO HIGH TRAFFIC. PLEASE REGENERATE OR REFRESH THIS PAGE. #464

Open 1106301825 opened 1 year ago

1106301825 commented 1 year ago

| ERROR | stderr | RuntimeError: CUDA error: device-side assert triggered
| ERROR | stderr | CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
| ERROR | stderr | For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
| ERROR | stderr | Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
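The log itself suggests `CUDA_LAUNCH_BLOCKING=1`. A minimal sketch of setting it from Python, assuming the variable is set before the process makes its first CUDA call; the helper name `enable_sync_cuda_errors` is made up for illustration and is not part of LLaVA or PyTorch:

```python
import os


def enable_sync_cuda_errors(env=None):
    """Set CUDA_LAUNCH_BLOCKING=1 in the given mapping (defaults to os.environ).

    Hypothetical helper: with this variable set, CUDA kernels run synchronously,
    so the Python stack trace should point at the call that actually failed.
    """
    target = os.environ if env is None else env
    target["CUDA_LAUNCH_BLOCKING"] = "1"
    return target


# Call this at the very top of the entry script, before CUDA is initialized.
enable_sync_cuda_errors()
# import torch  # import and use torch only after the variable is set
```

The same effect can be had by exporting the variable in the shell before launching the server; the Python form is only convenient when the launch command is hard to change.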

TikaToka commented 1 year ago

I have the same issue while handling #417. My error looks like this:

../aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [232,0,0], thread: [31,0,0] Assertion srcIndex < srcSelectDimSize failed.
'''
2023-09-30 13:06:31 | ERROR | stderr | Traceback (most recent call last):
2023-09-30 13:06:31 | ERROR | stderr |   File "/root/anaconda3/envs/llava/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
2023-09-30 13:06:31 | ERROR | stderr |     self.run()
2023-09-30 13:06:31 | ERROR | stderr |   File "/root/anaconda3/envs/llava/lib/python3.10/threading.py", line 953, in run
2023-09-30 13:06:31 | ERROR | stderr |     self._target(*self._args, **self._kwargs)
2023-09-30 13:06:31 | ERROR | stderr |   File "/root/anaconda3/envs/llava/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
2023-09-30 13:06:31 | ERROR | stderr |     return func(*args, **kwargs)
2023-09-30 13:06:31 | ERROR | stderr |   File "/root/anaconda3/envs/llava/lib/python3.10/site-packages/transformers/generation/utils.py", line 1588, in generate
2023-09-30 13:06:31 | ERROR | stderr |     return self.sample(
2023-09-30 13:06:31 | ERROR | stderr |   File "/root/anaconda3/envs/llava/lib/python3.10/site-packages/transformers/generation/utils.py", line 2642, in sample
2023-09-30 13:06:31 | ERROR | stderr |     outputs = self(
2023-09-30 13:06:31 | ERROR | stderr |   File "/root/anaconda3/envs/llava/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
2023-09-30 13:06:31 | ERROR | stderr |     return forward_call(*args, **kwargs)
2023-09-30 13:06:31 | ERROR | stderr |   File "/root/anaconda3/envs/llava/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
2023-09-30 13:06:31 | ERROR | stderr |     output = old_forward(*args, **kwargs)
2023-09-30 13:06:31 | ERROR | stderr |   File "/home/vln_workspace/LLaVA/llava/model/language_model/llava_llama.py", line 78, in forward
2023-09-30 13:06:31 | ERROR | stderr |     outputs = self.model(
2023-09-30 13:06:31 | ERROR | stderr |   File "/root/anaconda3/envs/llava/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
2023-09-30 13:06:31 | ERROR | stderr |     return forward_call(*args, **kwargs)
2023-09-30 13:06:31 | ERROR | stderr |   File "/root/anaconda3/envs/llava/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 646, in forward
2023-09-30 13:06:31 | ERROR | stderr |     inputs_embeds = self.embed_tokens(input_ids)
2023-09-30 13:06:31 | ERROR | stderr |   File "/root/anaconda3/envs/llava/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
2023-09-30 13:06:31 | ERROR | stderr |     return forward_call(*args, **kwargs)
2023-09-30 13:06:31 | ERROR | stderr |   File "/root/anaconda3/envs/llava/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
2023-09-30 13:06:31 | ERROR | stderr |     output = old_forward(*args, **kwargs)
2023-09-30 13:06:31 | ERROR | stderr |   File "/root/anaconda3/envs/llava/lib/python3.10/site-packages/torch/nn/modules/sparse.py", line 162, in forward
2023-09-30 13:06:31 | ERROR | stderr |     return F.embedding(
2023-09-30 13:06:31 | ERROR | stderr |   File "/root/anaconda3/envs/llava/lib/python3.10/site-packages/torch/nn/functional.py", line 2210, in embedding
2023-09-30 13:06:31 | ERROR | stderr |     return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
2023-09-30 13:06:31 | ERROR | stderr | RuntimeError: CUDA error: device-side assert triggered
2023-09-30 13:06:31 | ERROR | stderr | Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
'''
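The assertion `srcIndex < srcSelectDimSize` inside `torch.embedding` is an index bounds check, so one plausible reading of this trace is that some token id in `input_ids` falls outside the embedding table. A rough sketch of checking that on the CPU side before the forward pass; the function name and the sample values are illustrative assumptions, not taken from LLaVA:

```python
def find_out_of_range_ids(input_ids, vocab_size):
    """Return (position, token_id) pairs that a table with vocab_size rows cannot index."""
    return [
        (pos, tok)
        for pos, tok in enumerate(input_ids)
        if not 0 <= tok < vocab_size
    ]


# Hypothetical values: a 32000-row embedding table and a prompt
# containing one id just past the end of the table.
bad = find_out_of_range_ids([1, 319, 32000, 29889], vocab_size=32000)
print(bad)  # [(2, 32000)]
```

If a check like this reports any positions, comparing the tokenizer's vocabulary size with the number of rows in the model's embedding matrix may be a reasonable next step.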

zephirusgit commented 1 year ago

I have the same problem, but there is something worth analyzing. I first tried to run it on Google Colab, and the first Colab notebooks I tried all gave me that error. Then, with the Llava_7b_8bit_colab.ipynb version, I got it working last night while the models were downloading to my PC.

Locally, I could not get it running at first until I found another repository, https://github.com/natlamir/LLaVA-Windows. Following those steps I saw what has to be started in 3 separate PowerShell windows, and I managed to bring up the 3 servers, but, surprise, I got the same error that some of the Google Colab notebooks gave me. (Another thing: with the setup from that other repository alone it did not run, but after adding some things mentioned in the haotian-liu GitHub repository it partly worked. Now I have this problem, and my guess is that when I start the 3rd server, which I can see fills the whole VRAM, for some reason it does not connect to the other 2 servers?)

OMG, now it is working! While I was writing this I clicked regenerate to see whether any error came up, and it started writing. Maybe you have to give it enough time to start up. I am using a Ryzen 5 3600X with 48GB of RAM and an RTX 2060 with 12GB.
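If the error really is a startup-timing problem, as the comment above guesses, one way to rule it out is to wait until each server accepts TCP connections before starting the next one. A minimal sketch using only the standard library; `wait_for_port` is a hypothetical helper, and the host and port you pass must be whatever you gave your own servers:

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=0.5):
    """Poll until a TCP connection to (host, port) succeeds.

    Returns True as soon as a connection is accepted, False if the
    timeout (in seconds) expires first.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not listening yet (or unreachable); wait and retry.
            time.sleep(interval)
    return False
```

A successful connection only shows that the process is listening, not that the model has finished loading, so this is a lower bound on readiness rather than a full health check.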

zephirusgit commented 1 year ago

It works.

zephirusgit commented 1 year ago

Oh yes, now it is working! I'm going to share my pip list so you can check your versions.

zephirusgit commented 1 year ago

Package                   Version      Editable project location
------------------------- ------------ -------------------------
accelerate                0.21.0
aiofiles                  23.2.1
aiohttp                   3.8.6
aiosignal                 1.3.1
altair                    5.1.2
anyio                     3.7.1
appdirs                   1.4.4
async-timeout             4.0.3
attrs                     23.1.0
bitsandbytes              0.37.5
bitsandbytes-cuda111      0.26.0.post2
Brotli                    1.0.9
certifi                   2023.7.22
cffi                      1.15.1
chardet                   5.2.0
charset-normalizer        2.0.4
click                     8.1.7
colorama                  0.4.6
contourpy                 1.1.1
cryptography              41.0.3
cycler                    0.12.1
docker-pycreds            0.4.0
einops                    0.6.1
einops-exts               0.0.4
exceptiongroup            1.1.3
fastapi                   0.104.0
ffmpy                     0.3.1
filelock                  3.12.4
fonttools                 4.43.1
frozenlist                1.4.0
fsspec                    2023.10.0
gitdb                     4.0.11
GitPython                 3.1.40
gradio                    3.35.2
gradio_client             0.2.9
h11                       0.14.0
httpcore                  0.17.3
httpx                     0.24.0
huggingface-hub           0.18.0
idna                      3.4
Jinja2                    3.1.2
joblib                    1.3.2
jsonschema                4.19.1
jsonschema-specifications 2023.7.1
kiwisolver                1.4.5
linkify-it-py             2.0.2
llava                     1.1.3        H:\ia\llava
markdown-it-py            2.2.0
markdown2                 2.4.10
MarkupSafe                2.1.1
matplotlib                3.8.0
mdit-py-plugins           0.3.3
mdurl                     0.1.2
mkl-fft                   1.3.8
mkl-random                1.2.4
mkl-service               2.4.0
mpmath                    1.3.0
multidict                 6.0.4
networkx                  3.1
ninja                     1.11.1.1
numpy                     1.26.0
orjson                    3.9.10
packaging                 23.2
pandas                    2.1.2
pathtools                 0.1.2
peft                      0.4.0
Pillow                    10.0.1
pip                       23.3
protobuf                  4.24.4
psutil                    5.9.6
pycparser                 2.21
pydantic                  1.10.9
pydub                     0.25.1
Pygments                  2.16.1
pyOpenSSL                 23.2.0
pyparsing                 3.1.1
PySocks                   1.7.1
python-dateutil           2.8.2
python-multipart          0.0.6
pytz                      2023.3.post1
PyYAML                    6.0.1
referencing               0.30.2
regex                     2023.10.3
requests                  2.31.0
rpds-py                   0.10.6
safetensors               0.4.0
scikit-learn              1.2.2
scipy                     1.11.3
semantic-version          2.10.0
sentencepiece             0.1.99
sentry-sdk                1.32.0
setproctitle              1.3.3
setuptools                68.0.0
shortuuid                 1.0.11
six                       1.16.0
smmap                     5.0.1
sniffio                   1.3.0
starlette                 0.27.0
svgwrite                  1.4.3
sympy                     1.11.1
threadpoolctl             3.2.0
timm                      0.6.13
tokenizers                0.13.3
toolz                     0.12.0
torch                     2.0.1
torchaudio                2.1.0
torchvision               0.15.2
tqdm                      4.66.1
transformers              4.31.0
typing_extensions         4.8.0
tzdata                    2023.3
uc-micro-py               1.0.2
urllib3                   1.26.18
uvicorn                   0.23.2
wandb                     0.15.12
wavedrom                  2.0.3.post3
websockets                12.0
wheel                     0.41.2
win-inet-pton             1.1.0
yarl                      1.9.2
youtube-dl                2021.12.17
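To compare a local environment against this list without reading it line by line, something like the following could help. It is only a sketch: `installed_version` and `diff_versions` are hypothetical helper names, and the `reference` dict holds just a few pins copied from the list above, not an official requirement set.

```python
from importlib import metadata


def installed_version(name):
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def diff_versions(reference, lookup=installed_version):
    """Return {package: (reference_version, local_version)} for every mismatch."""
    mismatches = {}
    for name, wanted in reference.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


# A few pins copied from the pip list in this comment.
reference = {
    "torch": "2.0.1",
    "transformers": "4.31.0",
    "accelerate": "0.21.0",
    "gradio": "3.35.2",
}

for name, (wanted, found) in diff_versions(reference).items():
    print(f"{name}: list has {wanted}, local is {found}")
```

A mismatch does not by itself explain the error; it only narrows down which packages differ from an environment that was reported to work.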

zephirusgit commented 1 year ago

test