deep-floyd / IF


Kernel crash on loading model in Ubuntu 22.04 #49

Open vinaysingh8866 opened 1 year ago

vinaysingh8866 commented 1 year ago

Hey, I'm trying to load the model onto a GPU with 24 GB of VRAM.

This is my code:

    from diffusers import DiffusionPipeline
    from diffusers.utils import pt_to_pil
    import torch

    stage_1 = DiffusionPipeline.from_pretrained("DeepFloyd/IF-I-XL-v1.0", torch_dtype=torch.float16)
    stage_1.enable_xformers_memory_efficient_attention()
    stage_1.enable_model_cpu_offload()

The kernel crashes while loading the model into memory. I also tried loading it through deepfloyd_if, and the kernel crashes the same way while running the following code:

    from deepfloyd_if.modules import IFStageI, IFStageII, StableStageIII
    from deepfloyd_if.modules.t5 import T5Embedder

    device = 'cuda:0'
    if_I = IFStageI('IF-I-XL-v1.0', device=device)
    if_II = IFStageII('IF-II-L-v1.0', device=device)
    if_III = StableStageIII('stable-diffusion-x4-upscaler', device=device)
    t5 = T5Embedder(device="cpu")

This is the error shown in the notebook:

    Canceled future for execute_request message before replies were done
    The Kernel crashed while executing code in the the current cell or a previous cell. Please review the code in the cell(s) to identify a possible cause of the failure. Click here for more info. View Jupyter log for further details.

I tracked memory usage, and it never passes the 14 GB mark. How do I resolve this?
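The post does not say whether the 14 GB figure refers to host RAM or to VRAM. A minimal sketch for logging both right before and after the loading call might look like the following. The helper names (`to_gib`, `peak_host_rss_bytes`, `memory_report`) are made up for illustration, the `resource` module is Unix-only, and the GPU part only runs if PyTorch with CUDA is importable in the same environment.

```python
import resource
import sys


def to_gib(num_bytes: int) -> str:
    """Format a byte count as GiB with two decimals."""
    return f"{num_bytes / 2**30:.2f} GiB"


def peak_host_rss_bytes() -> int:
    """Peak resident set size of this process (host RAM, not VRAM)."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    return peak if sys.platform == "darwin" else peak * 1024


def memory_report() -> dict:
    """Collect host figures and, if PyTorch with CUDA is available, GPU figures."""
    report = {"host_peak_rss": to_gib(peak_host_rss_bytes())}
    try:
        import torch
    except ImportError:
        return report  # PyTorch is not installed: host figures only
    if torch.cuda.is_available():
        free, total = torch.cuda.mem_get_info()
        report["gpu_free"] = to_gib(free)
        report["gpu_total"] = to_gib(total)
        report["gpu_peak_allocated"] = to_gib(torch.cuda.max_memory_allocated())
    return report


print(memory_report())
```

Calling `memory_report()` once before `from_pretrained` and once after each stage is loaded shows which of the two budgets is growing.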

brycedrennan commented 1 year ago

Please provide the "more info" referenced in the error message.

vinaysingh8866 commented 1 year ago

The log is pasted below:

info 18:00:08.165: Process Execution: > ~/floyd/.conda/bin/python -c "import ipykernel; print(ipykernel.__version__); print("5dc3a68c-e34e-4080-9c3e-2a532b2ccb4d"); print(ipykernel.__file__)"
~/floyd/.conda/bin/python -c "import ipykernel; print(ipykernel.__version__); print("5dc3a68c-e34e-4080-9c3e-2a532b2ccb4d"); print(ipykernel.__file__)"
info 18:00:08.202: Process Execution: > ~/floyd/.conda/bin/python -m ipykernel_launcher --ip=127.0.0.1 --stdin=9003 --control=9001 --hb=9000 --Session.signature_scheme="hmac-sha256" --Session.key=b"1c81a2d0-ca31-4e1d-aac0-968179c8dbcb" --shell=9002 --transport="tcp" --iopub=9004 --f=/home/vinay/.local/share/jupyter/runtime/kernel-v2-7335xL6PqB389I5C.json
~/floyd/.conda/bin/python -m ipykernel_launcher --ip=127.0.0.1 --stdin=9003 --control=9001 --hb=9000 --Session.signature_scheme="hmac-sha256" --Session.key=b"1c81a2d0-ca31-4e1d-aac0-968179c8dbcb" --shell=9002 --transport="tcp" --iopub=9004 --f=/home/vinay/.local/share/jupyter/runtime/kernel-v2-7335xL6PqB389I5C.json
info 18:00:08.202: Process Execution: cwd: ~/floyd
cwd: ~/floyd
info 18:00:08.503: ipykernel version & path 6.15.0, ~/floyd/.conda/lib/python3.10/site-packages/ipykernel/__init__.py for /home/vinay/floyd/.conda/bin/python
info 18:00:09.281: ZMQ loaded via fallback mechanism.
info 18:00:09.322: Got new session 2335260b-cb88-4d60-a3df-8541ee408777
info 18:00:09.322: Started new restart session
error 18:02:54.121: Disposing session as kernel process died ExitCode: undefined, Reason: /home/vinay/floyd/.conda/lib/python3.10/site-packages/traitlets/traitlets.py:2548: FutureWarning: Supporting extra quotes around strings is deprecated in traitlets 5.0. You can use 'hmac-sha256' instead of '"hmac-sha256"' if you require traitlets >=5.
  warn(
/home/vinay/floyd/.conda/lib/python3.10/site-packages/traitlets/traitlets.py:2499: FutureWarning: Supporting extra quotes around Bytes is deprecated in traitlets 5.0. Use '1c81a2d0-ca31-4e1d-aac0-968179c8dbcb' instead of 'b"1c81a2d0-ca31-4e1d-aac0-968179c8dbcb"'.
  warn(

info 18:02:54.259: Dispose Kernel process 8755.
error 18:02:54.308: Raw kernel process exited code: undefined
error 18:02:55.244: Error in waiting for cell to complete Error: Canceled future for execute_request message before replies were done
    at t.KernelShellFutureHandler.dispose (/home/vinay/.vscode/extensions/ms-toolsai.jupyter-2023.3.1201040234/out/extension.node.js:2:32419)
    at /home/vinay/.vscode/extensions/ms-toolsai.jupyter-2023.3.1201040234/out/extension.node.js:2:51471
    at Map.forEach (<anonymous>)
    at y._clearKernelState (/home/vinay/.vscode/extensions/ms-toolsai.jupyter-2023.3.1201040234/out/extension.node.js:2:51456)
    at y.dispose (/home/vinay/.vscode/extensions/ms-toolsai.jupyter-2023.3.1201040234/out/extension.node.js:2:44938)
    at /home/vinay/.vscode/extensions/ms-toolsai.jupyter-2023.3.1201040234/out/extension.node.js:17:96826
    at ee (/home/vinay/.vscode/extensions/ms-toolsai.jupyter-2023.3.1201040234/out/extension.node.js:2:1589492)
    at jh.dispose (/home/vinay/.vscode/extensions/ms-toolsai.jupyter-2023.3.1201040234/out/extension.node.js:17:96802)
    at Lh.dispose (/home/vinay/.vscode/extensions/ms-toolsai.jupyter-2023.3.1201040234/out/extension.node.js:17:104079)
    at processTicksAndRejections (node:internal/process/task_queues:96:5)
warn 18:02:55.333: Cell completed with errors { message: 'Canceled future for execute_request message before replies were done' }
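The log reports the exit code as undefined, so it does not show how the kernel process ended. One way to get that information is to run the same loading code as a plain script in a child process and decode its return code. The following is a sketch under assumptions: `describe_exit` and `run_script` are hypothetical helpers, `load_model.py` is a hypothetical file holding the loading code from the first post, and the signal interpretation applies to POSIX systems such as Ubuntu.

```python
import signal
import subprocess
import sys


def describe_exit(returncode: int) -> str:
    """Turn a subprocess return code into a readable description."""
    if returncode == 0:
        return "exited normally"
    if returncode < 0:
        # On POSIX, a negative code means the child was killed by that signal.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = f"signal {-returncode}"
        return f"killed by {name}"
    return f"exited with status {returncode}"


def run_script(path: str) -> str:
    """Run a script in a child interpreter and describe how it ended."""
    # -X faulthandler makes Python dump a traceback on fatal signals such as SIGSEGV.
    result = subprocess.run([sys.executable, "-X", "faulthandler", path])
    return describe_exit(result.returncode)


# Example call (hypothetical file name):
# print(run_script("load_model.py"))
```

A result such as "killed by SIGKILL" or "killed by SIGSEGV" narrows the search more than the notebook's generic "Kernel crashed" message does.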

These are all the packages:

accelerate==0.15.0
antlr4-python3-runtime==4.9.3
asttokens @ file:///home/conda/feedstock_root/build_artifacts/asttokens_1670263926556/work
backcall @ file:///home/conda/feedstock_root/build_artifacts/backcall_1592338393461/work
backports.functools-lru-cache @ file:///home/conda/feedstock_root/build_artifacts/backports.functools_lru_cache_1618230623929/work
beautifulsoup4==4.11.2
certifi==2022.12.7
charset-normalizer==3.1.0
clip @ git+https://github.com/openai/CLIP.git@a9b1bf5920416aaeaec965c25dd9e8f98c864f16
cmake==3.26.3
contourpy==1.0.7
cycler==0.11.0
debugpy @ file:///home/builder/ci_310/debugpy_1640789504635/work
decorator @ file:///home/conda/feedstock_root/build_artifacts/decorator_1641555617451/work
deepfloyd-if==1.0.1
diffusers==0.16.1
entrypoints @ file:///home/conda/feedstock_root/build_artifacts/entrypoints_1643888246732/work
executing @ file:///home/conda/feedstock_root/build_artifacts/executing_1667317341051/work
filelock==3.12.0
fonttools==4.39.3
fsspec==2023.4.0
ftfy==6.1.1
huggingface-hub==0.14.1
idna==3.4
importlib-metadata==6.6.0
ipykernel @ file:///home/conda/feedstock_root/build_artifacts/ipykernel_1655369107642/work
ipython @ file:///home/conda/feedstock_root/build_artifacts/ipython_1682709228762/work
ipywidgets==8.0.6
jedi @ file:///home/conda/feedstock_root/build_artifacts/jedi_1669134318875/work
jupyter-client @ file:///home/conda/feedstock_root/build_artifacts/jupyter_client_1654730843242/work
jupyter_core @ file:///home/conda/feedstock_root/build_artifacts/jupyter_core_1678994169527/work
jupyterlab-widgets==3.0.7
kiwisolver==1.4.4
lit==16.0.2
matplotlib==3.7.1
matplotlib-inline @ file:///home/conda/feedstock_root/build_artifacts/matplotlib-inline_1660814786464/work
mypy-extensions==1.0.0
nest-asyncio @ file:///home/conda/feedstock_root/build_artifacts/nest-asyncio_1664684991461/work
numpy==1.24.3
nvidia-cublas-cu11==11.10.3.66
nvidia-cuda-nvrtc-cu11==11.7.99
nvidia-cuda-runtime-cu11==11.7.99
nvidia-cudnn-cu11==8.5.0.96
omegaconf==2.3.0
packaging @ file:///home/conda/feedstock_root/build_artifacts/packaging_1681337016113/work
parso @ file:///home/conda/feedstock_root/build_artifacts/parso_1638334955874/work
pexpect @ file:///home/conda/feedstock_root/build_artifacts/pexpect_1667297516076/work
pickleshare @ file:///home/conda/feedstock_root/build_artifacts/pickleshare_1602536217715/work
Pillow==9.5.0
platformdirs @ file:///home/conda/feedstock_root/build_artifacts/platformdirs_1682644429438/work
prompt-toolkit @ file:///home/conda/feedstock_root/build_artifacts/prompt-toolkit_1677600924538/work
protobuf==3.20.0
psutil @ file:///opt/conda/conda-bld/psutil_1656431268089/work
ptyprocess @ file:///home/conda/feedstock_root/build_artifacts/ptyprocess_1609419310487/work/dist/ptyprocess-0.7.0-py2.py3-none-any.whl
pure-eval @ file:///home/conda/feedstock_root/build_artifacts/pure_eval_1642875951954/work
Pygments @ file:///home/conda/feedstock_root/build_artifacts/pygments_1681904169130/work
pyparsing==3.0.9
pyre-extensions==0.0.23
python-dateutil @ file:///home/conda/feedstock_root/build_artifacts/python-dateutil_1626286286081/work
PyYAML==6.0
pyzmq @ file:///croot/pyzmq_1682697643292/work
regex==2023.3.23
requests==2.29.0
sentencepiece==0.1.98
six @ file:///home/conda/feedstock_root/build_artifacts/six_1620240208055/work
soupsieve==2.4.1
stack-data @ file:///home/conda/feedstock_root/build_artifacts/stack_data_1669632077133/work
tokenizers==0.13.3
torch==1.13.1
torchvision==0.14.1
tornado @ file:///home/conda/feedstock_root/build_artifacts/tornado_1648827254365/work
tqdm==4.65.0
traitlets @ file:///home/conda/feedstock_root/build_artifacts/traitlets_1675110562325/work
transformers==4.25.1
triton==2.0.0.post1
typing-inspect==0.8.0
typing_extensions @ file:///home/conda/feedstock_root/build_artifacts/typing_extensions_1678559861143/work
urllib3==1.26.15
wcwidth @ file:///home/conda/feedstock_root/build_artifacts/wcwidth_1673864653149/work
widgetsnbextension==4.0.7
xformers==0.0.16
zipp==3.15.0

Note: you may need to restart the kernel to use updated packages.
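For a report like this, a shorter listing of only the packages involved in the failing code is often easier to compare across environments. A minimal sketch, assuming the selection in `KEY_PACKAGES` (picked here for illustration from the list above) and a hypothetical helper `installed_versions`:

```python
from importlib import metadata

# Distribution names taken from the package list above; the selection is an assumption.
KEY_PACKAGES = ["torch", "diffusers", "transformers", "accelerate", "xformers", "deepfloyd-if"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions(KEY_PACKAGES).items():
    print(f"{name}: {version or 'not installed'}")
```

Running this inside the notebook kernel, instead of in a separate terminal, reports the environment the kernel itself is using.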

vinaysingh8866 commented 1 year ago

I tried running the Colab notebook locally, and it throws:

A: torch.Size([77, 4096]), B: torch.Size([4096, 4096]), C: (77, 4096); (lda, ldb, ldc): (c_int(2464), c_int(131072), c_int(2464)); (m, n, k): (c_int(77), c_int(4096), c_int(4096))
cuBLAS API failed with status 15
error detected
Output exceeds the size limit. Open the full output data in a text editor
---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
Cell In[10], line 1
----> 1 prompt_embeds, negative_embeds = pipe.encode_prompt(prompt)

File ~/floyd/.conda/lib/python3.10/site-packages/torch/utils/_contextlib.py:115, in context_decorator.<locals>.decorate_context(*args, **kwargs)
    112 @functools.wraps(func)
    113 def decorate_context(*args, **kwargs):
    114     with ctx_factory():
--> 115         return func(*args, **kwargs)

File ~/floyd/.conda/lib/python3.10/site-packages/diffusers/pipelines/deepfloyd_if/pipeline_if.py:324, in IFPipeline.encode_prompt(self, prompt, do_classifier_free_guidance, num_images_per_prompt, device, negative_prompt, prompt_embeds, negative_prompt_embeds, clean_caption)
    317     logger.warning(
    318         "The following part of your input was truncated because CLIP can only handle sequences up to"
    319         f" {max_length} tokens: {removed_text}"
    320     )
    322 attention_mask = text_inputs.attention_mask.to(device)
--> 324 prompt_embeds = self.text_encoder(
    325     text_input_ids.to(device),
    326     attention_mask=attention_mask,
    327 )
    328 prompt_embeds = prompt_embeds[0]
    330 if self.text_encoder is not None:

File ~/floyd/.conda/lib/python3.10/site-packages/torch/nn/modules/module.py:1501, in Module._call_impl(self, *args, **kwargs)
...
-> 1436     raise Exception('cublasLt ran into an error!')
   1438 torch.cuda.set_device(prev_device)
   1440 return out, Sout

Exception: cublasLt ran into an error!

This happens while running prompt_embeds, negative_embeds = pipe.encode_prompt(prompt).