NVIDIA / Stable-Diffusion-WebUI-TensorRT

TensorRT Extension for Stable Diffusion Web UI
MIT License
1.9k stars, 144 forks

Error installing in Automatic1111 #12

Closed: Jonseed closed this issue 10 months ago

Jonseed commented 11 months ago

Here is the error in the console:

Error running install.py for extension D:\repos\stable-diffusion-webui\extensions\Stable-Diffusion-WebUI-TensorRT.
*** Command: "d:\repos\stable-diffusion-webui\venv\Scripts\python.exe" "D:\repos\stable-diffusion-webui\extensions\Stable-Diffusion-WebUI-TensorRT\install.py"
*** Error code: 1
*** stdout: Looking in indexes: https://pypi.org/simple, https://pypi.nvidia.com
*** Collecting tensorrt==9.0.1.post11.dev4
***   Downloading https://pypi.nvidia.com/tensorrt/tensorrt-9.0.1.post11.dev4.tar.gz (18 kB)
***   Preparing metadata (setup.py): started
***   Preparing metadata (setup.py): finished with status 'done'
*** Building wheels for collected packages: tensorrt
***   Building wheel for tensorrt (setup.py): started
***   Building wheel for tensorrt (setup.py): still running...
***   Building wheel for tensorrt (setup.py): finished with status 'done'
***   Created wheel for tensorrt: filename=tensorrt-9.0.1.post11.dev4-py2.py3-none-any.whl size=17618 sha256=e059e2b3b7dd7ecf4c805ab6f2b4589ddb43b0959bfa66178fa0d01559ba1ef8
***   Stored in directory: c:\users\X\appdata\local\pip\cache\wheels\d1\6d\71\f679d0d23a60523f9a05445e269bfd0bcd1c5272097fa931df
*** Successfully built tensorrt
*** Installing collected packages: tensorrt
*** Successfully installed tensorrt-9.0.1.post11.dev4
*** Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
*** Collecting polygraphy
***   Downloading polygraphy-0.49.0-py2.py3-none-any.whl (327 kB)
***      -------------------------------------- 327.9/327.9 kB 4.1 MB/s eta 0:00:00
*** Installing collected packages: polygraphy
*** Successfully installed polygraphy-0.49.0
*** Collecting protobuf==3.20.2
***   Downloading protobuf-3.20.2-cp310-cp310-win_amd64.whl (904 kB)
***      -------------------------------------- 904.0/904.0 kB 4.4 MB/s eta 0:00:00
*** Installing collected packages: protobuf
***   Attempting uninstall: protobuf
***     Found existing installation: protobuf 3.20.0
***     Uninstalling protobuf-3.20.0:
***       Successfully uninstalled protobuf-3.20.0
*** TensorRT is not installed! Installing...
*** Installing nvidia-cudnn-cu11
*** Installing tensorrt
*** removing nvidia-cudnn-cu11
*** Polygraphy is not installed! Installing...
*** Installing polygraphy
*** GS is not installed! Installing...
*** Installing protobuf
***
*** stderr: A matching Triton is not available, some optimizations will not be enabled.
*** Error caught was: No module named 'triton'
*** d:\repos\stable-diffusion-webui\venv\lib\site-packages\pytorch_lightning\utilities\distributed.py:258: LightningDeprecationWarning: `pytorch_lightning.utilities.distributed.rank_zero_only` has been deprecated in v1.8.1 and will be removed in v2.0.0. You can import it from `pytorch_lightning.utilities` instead.
***   rank_zero_deprecation(
***
*** [notice] A new release of pip available: 22.2.1 -> 23.2.1
*** [notice] To update, run: d:\repos\stable-diffusion-webui\venv\Scripts\python.exe -m pip install --upgrade pip
***
*** [notice] A new release of pip available: 22.2.1 -> 23.2.1
*** [notice] To update, run: d:\repos\stable-diffusion-webui\venv\Scripts\python.exe -m pip install --upgrade pip
*** ERROR: Could not install packages due to an OSError: [WinError 5] Access is denied: 'D:\\repos\\stable-diffusion-webui\\venv\\Lib\\site-packages\\google\\~rotobuf\\internal\\_api_implementation.cp310-win_amd64.pyd'
*** Check the permissions.
***
***
*** [notice] A new release of pip available: 22.2.1 -> 23.2.1
*** [notice] To update, run: d:\repos\stable-diffusion-webui\venv\Scripts\python.exe -m pip install --upgrade pip
*** Traceback (most recent call last):
***   File "D:\repos\stable-diffusion-webui\extensions\Stable-Diffusion-WebUI-TensorRT\install.py", line 30, in <module>
***     install()
***   File "D:\repos\stable-diffusion-webui\extensions\Stable-Diffusion-WebUI-TensorRT\install.py", line 19, in install
***     launch.run_pip("install protobuf==3.20.2", "protobuf", live=True)
***   File "d:\repos\stable-diffusion-webui\modules\launch_utils.py", line 138, in run_pip
***     return run(f'"{python}" -m pip {command} --prefer-binary{index_url_line}', desc=f"Installing {desc}", errdesc=f"Couldn't install {desc}", live=live)
***   File "d:\repos\stable-diffusion-webui\modules\launch_utils.py", line 115, in run
***     raise RuntimeError("\n".join(error_bits))
*** RuntimeError: Couldn't install protobuf.
*** Command: "d:\repos\stable-diffusion-webui\venv\Scripts\python.exe" -m pip install protobuf==3.20.2 --prefer-binary
*** Error code: 1

And then when I restarted the webui, I got these popups:

[Screenshots: 2023-10-17 102416, 102706, 102950, 103000]

What does that mean?

4lt3r3go commented 11 months ago

same issue here

Jonseed commented 11 months ago

And then when I try to export the default engine, I get this error in the console:

{'sample': [(1, 4, 64, 64), (2, 4, 64, 64), (8, 4, 96, 96)], 'timesteps': [(1,), (2,), (8,)], 'encoder_hidden_states': [(1, 77, 768), (2, 77, 768), (8, 154, 768)]}
Disabling attention optimization
============= Diagnostic Run torch.onnx.export version 2.0.1+cu118 =============
verbose: False, log level: Level.ERROR
======================= 0 NONE 0 NOTE 0 WARNING 0 ERROR ========================

ERROR:root:Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument mat1 in method wrapper_CUDA_addmm)
Traceback (most recent call last):
  File "D:\repos\stable-diffusion-webui\extensions\Stable-Diffusion-WebUI-TensorRT\exporter.py", line 84, in export_onnx
    torch.onnx.export(
  File "d:\repos\stable-diffusion-webui\venv\lib\site-packages\torch\onnx\utils.py", line 506, in export
    _export(
  File "d:\repos\stable-diffusion-webui\venv\lib\site-packages\torch\onnx\utils.py", line 1548, in _export
    graph, params_dict, torch_out = _model_to_graph(
  File "d:\repos\stable-diffusion-webui\venv\lib\site-packages\torch\onnx\utils.py", line 1113, in _model_to_graph
    graph, params, torch_out, module = _create_jit_graph(model, args)
  File "d:\repos\stable-diffusion-webui\venv\lib\site-packages\torch\onnx\utils.py", line 989, in _create_jit_graph
    graph, torch_out = _trace_and_get_graph_from_model(model, args)
  File "d:\repos\stable-diffusion-webui\venv\lib\site-packages\torch\onnx\utils.py", line 893, in _trace_and_get_graph_from_model
    trace_graph, torch_out, inputs_states = torch.jit._get_trace_graph(
  File "d:\repos\stable-diffusion-webui\venv\lib\site-packages\torch\jit\_trace.py", line 1268, in _get_trace_graph
    outs = ONNXTracedModule(f, strict, _force_outplace, return_inputs, _return_inputs_states)(*args, **kwargs)
  File "d:\repos\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "d:\repos\stable-diffusion-webui\venv\lib\site-packages\torch\jit\_trace.py", line 127, in forward
    graph, out = torch._C._create_graph_by_tracing(
  File "d:\repos\stable-diffusion-webui\venv\lib\site-packages\torch\jit\_trace.py", line 118, in wrapper
    outs.append(self.inner(*trace_inputs))
  File "d:\repos\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "d:\repos\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1488, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "D:\repos\stable-diffusion-webui\modules\sd_unet.py", line 91, in UNetModel_forward
    return ldm.modules.diffusionmodules.openaimodel.copy_of_UNetModel_forward_for_webui(self, x, timesteps, context, *args, **kwargs)
  File "D:\repos\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\modules\diffusionmodules\openaimodel.py", line 789, in forward
    emb = self.time_embed(t_emb)
  File "d:\repos\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "d:\repos\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1488, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "d:\repos\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\container.py", line 217, in forward
    input = module(input)
  File "d:\repos\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "d:\repos\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1488, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "D:\repos\stable-diffusion-webui\extensions-builtin\Lora\networks.py", line 429, in network_Linear_forward
    return originals.Linear_forward(self, input)
  File "d:\repos\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\linear.py", line 114, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument mat1 in method wrapper_CUDA_addmm)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "d:\repos\stable-diffusion-webui\venv\lib\site-packages\gradio\routes.py", line 488, in run_predict
    output = await app.get_blocks().process_api(
  File "d:\repos\stable-diffusion-webui\venv\lib\site-packages\gradio\blocks.py", line 1431, in process_api
    result = await self.call_function(
  File "d:\repos\stable-diffusion-webui\venv\lib\site-packages\gradio\blocks.py", line 1117, in call_function
    prediction = await utils.async_iteration(iterator)
  File "d:\repos\stable-diffusion-webui\venv\lib\site-packages\gradio\utils.py", line 350, in async_iteration
    return await iterator.__anext__()
  File "d:\repos\stable-diffusion-webui\venv\lib\site-packages\gradio\utils.py", line 343, in __anext__
    return await anyio.to_thread.run_sync(
  File "d:\repos\stable-diffusion-webui\venv\lib\site-packages\anyio\to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "d:\repos\stable-diffusion-webui\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "d:\repos\stable-diffusion-webui\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "d:\repos\stable-diffusion-webui\venv\lib\site-packages\gradio\utils.py", line 326, in run_sync_iterator_async
    return next(iterator)
  File "d:\repos\stable-diffusion-webui\venv\lib\site-packages\gradio\utils.py", line 695, in gen_wrapper
    yield from f(*args, **kwargs)
  File "D:\repos\stable-diffusion-webui\extensions\Stable-Diffusion-WebUI-TensorRT\ui_trt.py", line 154, in export_unet_to_trt
    export_onnx(
  File "D:\repos\stable-diffusion-webui\extensions\Stable-Diffusion-WebUI-TensorRT\exporter.py", line 129, in export_onnx
    exit()
  File "C:\Users\X\AppData\Local\Programs\Python\Python310\lib\_sitebuiltins.py", line 26, in __call__
    raise SystemExit(code)
SystemExit: None

MorkTheOrk commented 11 months ago

Hello!

What packages have you installed? Anything updated from auto1111 default packages? Your 2nd error might be due to the failed installation.

Jonseed commented 11 months ago

Here are my other installed extensions. Pretty standard stuff. Otherwise, it is all default. (I have also already updated my NVIDIA drivers to 537.58. I have an RTX 3060.)

[Screenshot: 2023-10-17 104608]

Jonseed commented 11 months ago

I should mention I'm on the latest version of Auto1111: version: v1.6.0, python: 3.10.6, torch: 2.0.1+cu118, xformers: 0.0.20, gradio: 3.41.2, checkpoint: 1240e811e2

MorkTheOrk commented 11 months ago

We also used the latest version of Auto1111.

Does the first error occur again when you restart the webui? Have you tried manually installing protobuf?

Jonseed commented 11 months ago

If I restart the webui, I get the popups again, and this in the console:

Requirement already satisfied: protobuf==3.20.2 in .\venv\lib\site-packages (3.20.2)
Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Requirement already satisfied: onnx-graphsurgeon in .\venv\lib\site-packages (0.3.27)
Requirement already satisfied: onnx in .\venv\lib\site-packages (from onnx-graphsurgeon) (1.14.1)
Requirement already satisfied: numpy in .\venv\lib\site-packages (from onnx-graphsurgeon) (1.23.5)
Requirement already satisfied: protobuf>=3.20.2 in .\venv\lib\site-packages (from onnx->onnx-graphsurgeon) (3.20.2)
Requirement already satisfied: typing-extensions>=3.6.2.1 in .\venv\lib\site-packages (from onnx->onnx-graphsurgeon) (4.5.0)
GS is not installed! Installing...
Installing protobuf
Installing onnx-graphsurgeon
UI Config not initialized

I have not tried manually installing protobuf.

MorkTheOrk commented 11 months ago

You can see the packages that are required, and how to install them, in the install.py in the root dir.

The popup errors occur because PyTorch ships its own cudnn libraries, but TensorRT requires them as well and installs them too. We remove that package after installing the tensorrt wheel. See here: https://github.com/NVIDIA/Stable-Diffusion-WebUI-TensorRT/blob/d8eee382158cd7373e44fe7ac9265aa985a8fd29/install.py#L5C5-L10C1

Jonseed commented 11 months ago

I shouldn't have to install anything manually... Why are there separate cudnn libraries? They seem to be conflicting... Or did TensorRT fail to remove the package after installing the tensorrt wheel?

Jonseed commented 11 months ago

Should I delete the extension and try to install it again?

MorkTheOrk commented 11 months ago

No, you shouldn't; that's why we have the install.py that auto1111 runs. It might have failed to uninstall... Can you also send me a pip freeze?

MorkTheOrk commented 11 months ago

> Should I delete the extension and try to install it again?

You can, maybe it makes a difference.

Jonseed commented 11 months ago

ok, figured out the pip freeze.

absl-py==1.4.0
accelerate==0.21.0
addict==2.4.0
aenum==3.1.12
aiofiles==23.2.1
aiohttp==3.8.4
aiosignal==1.3.1
altair==5.0.0
antlr4-python3-runtime==4.9.3
anyio==3.6.2
async-timeout==4.0.2
attrs==23.1.0
basicsr==1.4.2
beautifulsoup4==4.12.2
blendmodes==2022
blis==0.7.9
boltons==23.0.0
Brotli==1.1.0
cachetools==5.3.0
catalogue==2.0.8
certifi==2023.5.7
cffi==1.15.1
chardet==4.0.0
charset-normalizer==3.1.0
clean-fid==0.1.35
click==8.1.7
clip @ git+https://github.com/openai/CLIP.git@d50d76daa670286dd6cacf3bcd80b5e4823fc8e1
colorama==0.4.6
confection==0.0.4
contourpy==1.0.7
cssselect2==0.7.0
cycler==0.11.0
cymem==2.0.7
deprecation==2.1.0
duckduckgo-search==3.9.3
dynamicprompts==0.29.0
einops==0.4.1
facexlib==0.3.0
fastapi==0.94.0
ffmpy==0.3.0
filelock==3.12.0
filterpy==1.4.5
flatbuffers==23.5.9
font-roboto==0.0.1
fonts==0.0.3
fonttools==4.39.4
freetype-py==2.3.0
frozenlist==1.3.3
fsspec==2023.5.0
ftfy==6.1.1
future==0.18.3
fvcore==0.1.5.post20221221
gdown==4.7.1
gfpgan==1.3.8
gitdb==4.0.10
GitPython==3.1.32
google-auth==2.18.1
google-auth-oauthlib==1.0.0
gradio==3.41.2
gradio_client==0.5.0
grpcio==1.54.2
h11==0.12.0
h2==4.1.0
hpack==4.0.0
httpcore==0.15.0
httpx==0.24.1
huggingface-hub==0.14.1
hyperframe==6.0.1
idna==2.10
imageio==2.28.1
importlib-resources==6.0.1
inflection==0.5.1
iopath==0.1.9
Jinja2==3.1.2
jsonmerge==1.8.0
jsonschema==4.17.3
kiwisolver==1.4.4
kornia==0.6.7
langcodes==3.3.0
lark==1.1.2
lazy_loader==0.2
lightning-utilities==0.8.0
linkify-it-py==2.0.2
llvmlite==0.40.0
lmdb==1.4.1
lpips==0.1.4
lxml==4.9.3
Markdown==3.4.3
markdown-it-py==2.2.0
MarkupSafe==2.1.2
matplotlib==3.7.1
mdit-py-plugins==0.3.3
mdurl==0.1.2
mediapipe==0.10.5
mpmath==1.3.0
multidict==6.0.4
murmurhash==1.0.9
networkx==3.1
numba==0.57.0
numpy==1.23.5
nvidia-cublas-cu11==11.11.3.6
nvidia-cuda-nvrtc-cu11==11.8.89
nvidia-cuda-runtime-cu11==11.8.89
nvidia-cudnn-cu11==8.9.4.25
oauthlib==3.2.2
omegaconf==2.2.3
onnx==1.14.1
onnx-graphsurgeon==0.3.27
open-clip-torch==2.20.0
opencv-contrib-python==4.7.0.72
opencv-python==4.8.0.76
orjson==3.8.12
packaging==23.1
pandas==2.0.1
pathy==0.10.1
piexif==1.1.3
Pillow==9.5.0
polygraphy==0.49.0
portalocker==2.7.0
preshed==3.0.8
protobuf==3.20.2
psutil==5.9.5
py-cpuinfo==9.0.0
pyasn1==0.5.0
pyasn1-modules==0.3.0
pycairo==1.23.0
pycparser==2.21
pydantic==1.10.7
pydub==0.25.1
Pygments==2.15.1
pyparsing==3.0.9
pyrsistent==0.19.3
PySocks==1.7.1
python-dateutil==2.8.2
python-multipart==0.0.6
pytorch-lightning==1.9.4
pytz==2023.3
PyWavelets==1.4.1
pywin32==306
PyYAML==6.0
realesrgan==0.3.0
regex==2023.5.5
reportlab==4.0.0
requests==2.31.0
requests-oauthlib==1.3.1
resize-right==0.0.2
rich==13.6.0
rlPyCairo==0.2.0
rsa==4.9
safetensors==0.3.1
scikit-image==0.21.0
scipy==1.10.1
seaborn==0.13.0
semantic-version==2.10.0
Send2Trash==1.8.0
sentencepiece==0.1.99
six==1.16.0
smart-open==6.3.0
smmap==5.0.0
sniffio==1.3.0
socksio==1.0.0
sounddevice==0.4.6
soupsieve==2.4.1
spacy==3.5.3
spacy-legacy==3.0.12
spacy-loggers==1.0.4
srsly==2.4.6
starlette==0.26.1
svglib==1.5.1
sympy==1.12
tabulate==0.9.0
tb-nightly==2.14.0a20230520
tensorboard-data-server==0.7.0
tensorrt==9.0.1.post11.dev4
tensorrt-bindings==9.0.1.post11.dev4
tensorrt-libs==9.0.1.post11.dev4
termcolor==2.3.0
thinc==8.1.10
thop==0.1.1.post2209072238
tifffile==2023.4.12
timm==0.9.2
tinycss2==1.2.1
tokenizers==0.13.3
tomesd==0.1.3
tomli==2.0.1
toolz==0.12.0
torch==2.0.1+cu118
torchdiffeq==0.2.3
torchmetrics==0.11.4
torchsde==0.2.5
torchvision==0.15.2+cu118
tqdm==4.65.0
trampoline==0.1.2
transformers==4.30.2
typer==0.7.0
typing_extensions==4.5.0
tzdata==2023.3
uc-micro-py==1.0.2
ultralytics==8.0.195
urllib3==1.26.15
uvicorn==0.22.0
wasabi==1.1.1
wcwidth==0.2.6
webencodings==0.5.1
websockets==11.0.3
Werkzeug==2.3.4
xformers==0.0.20
yacs==0.1.8
yapf==0.33.0
yarl==1.9.2

Altrue commented 11 months ago

I have the same problem. Same error messages upon restarting, same console. I installed automatic 1.52 since it says 1.5 is supported, not 1.6. RTX 3090. EDIT: Turns out Automatic 1.5 instead of 1.6 is not relevant.

Jonseed commented 11 months ago

Ok, I deleted the extension folder, and restarted webui. Then I tried reinstalling the extension. I did not get the console error messages this time about protobuf (probably because it is already installed). Just this:

Requirement already satisfied: protobuf==3.20.2 in .\venv\lib\site-packages (3.20.2)
Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Requirement already satisfied: onnx-graphsurgeon in .\venv\lib\site-packages (0.3.27)
Requirement already satisfied: numpy in .\venv\lib\site-packages (from onnx-graphsurgeon) (1.23.5)
Requirement already satisfied: onnx in .\venv\lib\site-packages (from onnx-graphsurgeon) (1.14.1)
Requirement already satisfied: typing-extensions>=3.6.2.1 in .\venv\lib\site-packages (from onnx->onnx-graphsurgeon) (4.5.0)
Requirement already satisfied: protobuf>=3.20.2 in .\venv\lib\site-packages (from onnx->onnx-graphsurgeon) (3.20.2)
GS is not installed! Installing...
Installing protobuf
Installing onnx-graphsurgeon
UI Config not initialized

Then when I restarted the webui, I got the popup errors again, the same console messages above, and when trying to export the default engine I get the same console error: RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0!

MorkTheOrk commented 11 months ago

I can see that the packages our extension needs are installed, but cudnn is still installed. The extension won't uninstall it because it only checks whether tensorrt is installed, and if it is, it skips uninstalling cudnn... This might need a code change. You can try running the following in the venv of automatic1111:

python -m pip uninstall -y nvidia-cudnn-cu11

Jonseed commented 11 months ago

Tried running that, and got this:

WARNING: Ignoring invalid distribution -rotobuf (d:\repos\stable-diffusion-webui\venv\lib\site-packages)
WARNING: Ignoring invalid distribution -rotobuf (d:\repos\stable-diffusion-webui\venv\lib\site-packages)
Found existing installation: nvidia-cudnn-cu11 8.9.4.25
Uninstalling nvidia-cudnn-cu11-8.9.4.25:
  Successfully uninstalled nvidia-cudnn-cu11-8.9.4.25
ERROR: Exception:
Traceback (most recent call last):
  File "D:\repos\stable-diffusion-webui\venv\lib\site-packages\pip\_internal\cli\base_command.py", line 167, in exc_logging_wrapper
    status = run_func(*args)
  File "D:\repos\stable-diffusion-webui\venv\lib\site-packages\pip\_internal\commands\uninstall.py", line 103, in run
    uninstall_pathset.commit()
  File "D:\repos\stable-diffusion-webui\venv\lib\site-packages\pip\_internal\req\req_uninstall.py", line 424, in commit
    self._moved_paths.commit()
  File "D:\repos\stable-diffusion-webui\venv\lib\site-packages\pip\_internal\req\req_uninstall.py", line 277, in commit
    save_dir.cleanup()
  File "D:\repos\stable-diffusion-webui\venv\lib\site-packages\pip\_internal\utils\temp_dir.py", line 173, in cleanup
    rmtree(self._path)
  File "D:\repos\stable-diffusion-webui\venv\lib\site-packages\pip\_vendor\tenacity\__init__.py", line 326, in wrapped_f
    return self(f, *args, **kw)
  File "D:\repos\stable-diffusion-webui\venv\lib\site-packages\pip\_vendor\tenacity\__init__.py", line 406, in __call__
    do = self.iter(retry_state=retry_state)
  File "D:\repos\stable-diffusion-webui\venv\lib\site-packages\pip\_vendor\tenacity\__init__.py", line 362, in iter
    raise retry_exc.reraise()
  File "D:\repos\stable-diffusion-webui\venv\lib\site-packages\pip\_vendor\tenacity\__init__.py", line 195, in reraise
    raise self.last_attempt.result()
  File "C:\Users\X\AppData\Local\Programs\Python\Python310\lib\concurrent\futures\_base.py", line 451, in result
    return self.__get_result()
  File "C:\Users\X\AppData\Local\Programs\Python\Python310\lib\concurrent\futures\_base.py", line 403, in __get_result
    raise self._exception
  File "D:\repos\stable-diffusion-webui\venv\lib\site-packages\pip\_vendor\tenacity\__init__.py", line 409, in __call__
    result = fn(*args, **kwargs)
  File "D:\repos\stable-diffusion-webui\venv\lib\site-packages\pip\_internal\utils\misc.py", line 128, in rmtree
    shutil.rmtree(dir, ignore_errors=ignore_errors, onerror=rmtree_errorhandler)
  File "C:\Users\X\AppData\Local\Programs\Python\Python310\lib\shutil.py", line 749, in rmtree
    return _rmtree_unsafe(path, onerror)
  File "C:\Users\X\AppData\Local\Programs\Python\Python310\lib\shutil.py", line 614, in _rmtree_unsafe
    _rmtree_unsafe(fullname, onerror)
  File "C:\Users\X\AppData\Local\Programs\Python\Python310\lib\shutil.py", line 619, in _rmtree_unsafe
    onerror(os.unlink, fullname, sys.exc_info())
  File "C:\Users\X\AppData\Local\Programs\Python\Python310\lib\shutil.py", line 617, in _rmtree_unsafe
    os.unlink(fullname)
PermissionError: [WinError 5] Access is denied: 'D:\\repos\\stable-diffusion-webui\\venv\\Lib\\site-packages\\nvidia\\~udnn\\bin\\cudnn64_8.dll'

Jonseed commented 11 months ago

Restarted webui, and no more popup errors, but still get this when trying to export default engine: RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument mat1 in method wrapper_CUDA_addmm)

Jonseed commented 11 months ago

If I do a pip freeze now, I see this in the console:

WARNING: Ignoring invalid distribution -vidia-cudnn-cu11 (d:\repos\stable-diffusion-webui\venv\lib\site-packages)
WARNING: Ignoring invalid distribution -rotobuf (d:\repos\stable-diffusion-webui\venv\lib\site-packages)

And here is the current state of the pip freeze:

absl-py==1.4.0
accelerate==0.21.0
addict==2.4.0
aenum==3.1.12
aiofiles==23.2.1
aiohttp==3.8.4
aiosignal==1.3.1
altair==5.0.0
antlr4-python3-runtime==4.9.3
anyio==3.6.2
async-timeout==4.0.2
attrs==23.1.0
basicsr==1.4.2
beautifulsoup4==4.12.2
blendmodes==2022
blis==0.7.9
boltons==23.0.0
Brotli==1.1.0
cachetools==5.3.0
catalogue==2.0.8
certifi==2023.5.7
cffi==1.15.1
chardet==4.0.0
charset-normalizer==3.1.0
clean-fid==0.1.35
click==8.1.7
clip @ git+https://github.com/openai/CLIP.git@d50d76daa670286dd6cacf3bcd80b5e4823fc8e1
colorama==0.4.6
confection==0.0.4
contourpy==1.0.7
cssselect2==0.7.0
cycler==0.11.0
cymem==2.0.7
deprecation==2.1.0
duckduckgo-search==3.9.3
dynamicprompts==0.29.0
einops==0.4.1
facexlib==0.3.0
fastapi==0.94.0
ffmpy==0.3.0
filelock==3.12.0
filterpy==1.4.5
flatbuffers==23.5.9
font-roboto==0.0.1
fonts==0.0.3
fonttools==4.39.4
freetype-py==2.3.0
frozenlist==1.3.3
fsspec==2023.5.0
ftfy==6.1.1
future==0.18.3
fvcore==0.1.5.post20221221
gdown==4.7.1
gfpgan==1.3.8
gitdb==4.0.10
GitPython==3.1.32
google-auth==2.18.1
google-auth-oauthlib==1.0.0
gradio==3.41.2
gradio_client==0.5.0
grpcio==1.54.2
h11==0.12.0
h2==4.1.0
hpack==4.0.0
httpcore==0.15.0
httpx==0.24.1
huggingface-hub==0.14.1
hyperframe==6.0.1
idna==2.10
imageio==2.28.1
importlib-resources==6.0.1
inflection==0.5.1
iopath==0.1.9
Jinja2==3.1.2
jsonmerge==1.8.0
jsonschema==4.17.3
kiwisolver==1.4.4
kornia==0.6.7
langcodes==3.3.0
lark==1.1.2
lazy_loader==0.2
lightning-utilities==0.8.0
linkify-it-py==2.0.2
llvmlite==0.40.0
lmdb==1.4.1
lpips==0.1.4
lxml==4.9.3
Markdown==3.4.3
markdown-it-py==2.2.0
MarkupSafe==2.1.2
matplotlib==3.7.1
mdit-py-plugins==0.3.3
mdurl==0.1.2
mediapipe==0.10.5
mpmath==1.3.0
multidict==6.0.4
murmurhash==1.0.9
networkx==3.1
numba==0.57.0
numpy==1.23.5
nvidia-cublas-cu11==11.11.3.6
nvidia-cuda-nvrtc-cu11==11.8.89
nvidia-cuda-runtime-cu11==11.8.89
oauthlib==3.2.2
omegaconf==2.2.3
onnx==1.14.1
onnx-graphsurgeon==0.3.27
open-clip-torch==2.20.0
opencv-contrib-python==4.7.0.72
opencv-python==4.8.0.76
orjson==3.8.12
packaging==23.1
pandas==2.0.1
pathy==0.10.1
piexif==1.1.3
Pillow==9.5.0
polygraphy==0.49.0
portalocker==2.7.0
preshed==3.0.8
protobuf==3.20.2
psutil==5.9.5
py-cpuinfo==9.0.0
pyasn1==0.5.0
pyasn1-modules==0.3.0
pycairo==1.23.0
pycparser==2.21
pydantic==1.10.7
pydub==0.25.1
Pygments==2.15.1
pyparsing==3.0.9
pyrsistent==0.19.3
PySocks==1.7.1
python-dateutil==2.8.2
python-multipart==0.0.6
pytorch-lightning==1.9.4
pytz==2023.3
PyWavelets==1.4.1
pywin32==306
PyYAML==6.0
realesrgan==0.3.0
regex==2023.5.5
reportlab==4.0.0
requests==2.31.0
requests-oauthlib==1.3.1
resize-right==0.0.2
rich==13.6.0
rlPyCairo==0.2.0
rsa==4.9
safetensors==0.3.1
scikit-image==0.21.0
scipy==1.10.1
seaborn==0.13.0
semantic-version==2.10.0
Send2Trash==1.8.0
sentencepiece==0.1.99
six==1.16.0
smart-open==6.3.0
smmap==5.0.0
sniffio==1.3.0
socksio==1.0.0
sounddevice==0.4.6
soupsieve==2.4.1
spacy==3.5.3
spacy-legacy==3.0.12
spacy-loggers==1.0.4
srsly==2.4.6
starlette==0.26.1
svglib==1.5.1
sympy==1.12
tabulate==0.9.0
tb-nightly==2.14.0a20230520
tensorboard-data-server==0.7.0
tensorrt==9.0.1.post11.dev4
tensorrt-bindings==9.0.1.post11.dev4
tensorrt-libs==9.0.1.post11.dev4
termcolor==2.3.0
thinc==8.1.10
thop==0.1.1.post2209072238
tifffile==2023.4.12
timm==0.9.2
tinycss2==1.2.1
tokenizers==0.13.3
tomesd==0.1.3
tomli==2.0.1
toolz==0.12.0
torch==2.0.1+cu118
torchdiffeq==0.2.3
torchmetrics==0.11.4
torchsde==0.2.5
torchvision==0.15.2+cu118
tqdm==4.65.0
trampoline==0.1.2
transformers==4.30.2
typer==0.7.0
typing_extensions==4.5.0
tzdata==2023.3
uc-micro-py==1.0.2
ultralytics==8.0.195
urllib3==1.26.15
uvicorn==0.22.0
wasabi==1.1.1
wcwidth==0.2.6
webencodings==0.5.1
websockets==11.0.3
Werkzeug==2.3.4
xformers==0.0.20
yacs==0.1.8
yapf==0.33.0
yarl==1.9.2

What is that about "-vidia-cudnn-cu11" being an invalid distribution?

Altrue commented 11 months ago

> What is that about "-vidia-cudnn-cu11" being an invalid distribution?

From my understanding, when pip installs or uninstalls a package, it temporarily renames the existing package folder, replacing the first letter of its name with "~". When the operation completes, this temporary folder is removed. When the operation crashes, the temporary folder isn't removed and is left behind as a fake package that pip ignores. It's fine to leave it, but you can also navigate to venv\Lib\site-packages to remove it.

(I had a "-rotobuf" warning myself :p)
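To see which of these leftovers exist before deleting anything by hand, a small read-only scan can help. This is a minimal sketch, not part of pip or the extension: the helper name find_pip_leftovers is hypothetical, and it assumes the leftovers sit either directly in site-packages or one level down (as with google\~rotobuf and nvidia\~udnn in the logs above).

```python
from pathlib import Path


def find_pip_leftovers(site_packages: Path) -> list[Path]:
    """Return entries whose names start with '~' (hypothetical helper).

    Looks at the top level of site-packages and one level below it,
    because the logs in this thread show leftovers in both places.
    Nothing is deleted; the caller decides what to do with the result.
    """
    top_level = site_packages.glob("~*")
    nested = site_packages.glob("*/~*")
    return sorted([*top_level, *nested])


if __name__ == "__main__":
    import sysconfig

    # "purelib" is the site-packages directory of the running interpreter,
    # so run this with the venv's python.exe to inspect the webui venv.
    for leftover in find_pip_leftovers(Path(sysconfig.get_paths()["purelib"])):
        print(leftover)
```

Run it with the venv's interpreter (venv\Scripts\python.exe) so that it inspects the same site-packages folder the webui uses.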

andreacostigliolo commented 11 months ago

With python -m pip uninstall -y nvidia-cudnn-cu11, I also no longer get the error on startup. But I still get RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0!

MorkTheOrk commented 11 months ago

That looks like a different error. I will figure something out so that cudnn gets uninstalled even if tensorrt is already installed.
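One way such a check could look is sketched below. This is not the extension's actual install.py: the helper names are hypothetical, and the only things taken from the thread are the package name nvidia-cudnn-cu11 and the pip uninstall command. The idea is to test for the cudnn wheel on its own instead of tying the cleanup to whether tensorrt is installed.

```python
import subprocess
import sys
from importlib import metadata

# Package name as it appears in the pip freeze output in this thread.
CUDNN_PACKAGE = "nvidia-cudnn-cu11"


def is_installed(package: str) -> bool:
    """True if pip metadata for the package exists in this environment."""
    try:
        metadata.version(package)
        return True
    except metadata.PackageNotFoundError:
        return False


def uninstall_command(package: str, python: str = sys.executable) -> list[str]:
    """Build the same pip command that was suggested earlier in the thread."""
    return [python, "-m", "pip", "uninstall", "-y", package]


def remove_cudnn_if_present() -> bool:
    """Uninstall the cudnn wheel whenever it is found (hypothetical helper).

    The check does not look at tensorrt at all, so a leftover cudnn wheel
    is removed even when tensorrt was installed by an earlier run.
    Returns True if an uninstall was attempted and succeeded.
    """
    if not is_installed(CUDNN_PACKAGE):
        return False
    subprocess.run(uninstall_command(CUDNN_PACKAGE), check=True)
    return True
```

Because check=True is set, a failed uninstall (such as the Access is denied error shown above) raises subprocess.CalledProcessError instead of passing silently.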

andreacostigliolo commented 11 months ago

It's the same error as Jonseed's.

MorkTheOrk commented 11 months ago

I mean this error seems like a different one than the cudnn pop ups:

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0!

Jonseed commented 11 months ago

So it appears like I no longer have nvidia-cudnn-cu11 installed, per the pip freeze... Is that what we want?

niffelheim87 commented 11 months ago

Popup errors solved with python -m pip uninstall -y nvidia-cudnn-cu11, but I still get RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument mat1 in method wrapper_CUDA_addmm) when trying to generate images.

DennLie commented 11 months ago

I have struggled with this error myself and tried various things. This is what I have found out so far:

delete and rebuild venv - no change
deactivate all extensions and reinstall tensorrt extension - no change
deactivate all COMMANDLINE_ARGS - error is gone and it built the TensorRT engine

These were my COMMANDLINE_ARGS before: --api --xformers --medvram-sdxl --opt-sdp-attention --medvram --disable-safe-unpickle

So I guess that one of them was causing the problem. I will now add them back one by one and check which one caused the error.

Jonseed commented 11 months ago

I can still generate images without errors. It is only when I try to export the default engine that I get the error about expecting all tensors to be on the same device...

MorkTheOrk commented 11 months ago

> So it appears like I no longer have nvidia-cudnn-cu11 installed, per the pip freeze... Is that what we want?

Yes

Jonseed commented 11 months ago

my commandline args are: --autolaunch --update-check --xformers --no-half-vae --medvram --api

MorkTheOrk commented 11 months ago

Can you try without the xformers arg?

andreacostigliolo commented 11 months ago

The problem is the medvram arg. Solved!

ByblosHex commented 11 months ago

Same errors as OP.

Altrue commented 11 months ago

For my part, the python -m pip uninstall -y nvidia-cudnn-cu11 didn't seem to work as the extension was "not installed".

So instead I went to venv\Lib\site-packages and I removed the cudnn.dist-info & the cudnn folder in the nvidia folder. It seems to be working fine. At least, I can start without errors, and I can start generating the engine. It seems to be generating without issues now.

Edit: I can confirm that after doing this fix, everything works for me.

MorkTheOrk commented 11 months ago

It might be that the VRAM-limiting options fall back to CPU tensors, which therefore do not work with the Torch ONNX export.
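To check a launch configuration for such options before exporting an engine, something like the following could be used. It is only a sketch: suspect_flags is a hypothetical helper, --medvram is the flag reported as the culprit in this thread, and listing --medvram-sdxl and --lowvram next to it is an assumption that they behave similarly, which has not been confirmed here.

```python
import shlex

# --medvram is the flag reported as the cause in this thread. The other two
# are included as an assumption that they behave similarly; verify this
# before relying on it.
SUSPECT_FLAGS = ("--medvram", "--medvram-sdxl", "--lowvram")


def suspect_flags(commandline_args: str) -> list[str]:
    """Return the VRAM-saving flags found, in the order they appear.

    Tokens are compared exactly, so "--medvram-sdxl" is not mistaken
    for "--medvram".
    """
    tokens = shlex.split(commandline_args)
    return [token for token in tokens if token in SUSPECT_FLAGS]


if __name__ == "__main__":
    # COMMANDLINE_ARGS value quoted earlier in this thread.
    args = "--autolaunch --update-check --xformers --no-half-vae --medvram --api"
    print(suspect_flags(args))
```

If the list is not empty, removing those flags for the export step is the workaround that was reported to work above.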

Jonseed commented 11 months ago

I removed xformers arg and when exporting default engine still get error: RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument mat1 in method wrapper_CUDA_addmm)

Will try removing medvram arg...

ByblosHex commented 11 months ago

Nothing under SD Unet after generating engine completed?

MorkTheOrk commented 11 months ago

> For my part, the python -m pip uninstall -y nvidia-cudnn-cu11 didn't seem to work as the extension was "not installed".
>
> So instead I went to venv\Lib\site-packages and I removed the cudnn.dist-info & the cudnn folder in the nvidia folder. It seems to be working fine. At least, I can start without errors, and I can start generating the engine. It seems to be generating without issues now.

Have you exported the engine and enabled it? See here: https://github.com/NVIDIA/Stable-Diffusion-WebUI-TensorRT#how-to-use

andreacostigliolo commented 11 months ago

but after exporting should I use medvram again?

niffelheim87 commented 11 months ago

Am I the only one who gets this error when generating images with an engine already selected? image

Aeo2000 commented 11 months ago

Am I the only one who gets this error when generating images with an engine already selected? image

I'm getting the same errors. I tried deleting the venv folder and the extensions folder, and getting rid of any arguments in the web batch file. Now the tab for TensorRT doesn't even appear.

ByblosHex commented 11 months ago

Following the steps in the discussion, I was able to build an engine, but it errors out at the saving step. image

Altrue commented 11 months ago

For my part, the python -m pip uninstall -y nvidia-cudnn-cu11 didn't seem to work as the extension was "not installed". So instead I went to venv\Lib\site-packages and I removed the cudnn.dist-info & the cudnn folder in the nvidia folder. It seems to be working fine. At least, I can start without errors, and I can start generating the engine. It seems to be generating without issues now.

Have you exported the engine and enabled it? See here: https://github.com/NVIDIA/Stable-Diffusion-WebUI-TensorRT#how-to-use

Yes, I was currently following the instructions actually, to see if "seems to be working" could be converted into "is working". I can confirm it works flawlessly! I was able to export an engine and confirm that it increases my generation speed by ~40%!

In summary, for me the issue was that during the installation, somehow it didn't uninstall cudnn, and cudnn was taking priority over the dlls in tensorrt. After removing cudnn manually, my problem was solved.
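Before deleting anything by hand, it may help to list what is actually left over. The sketch below only reports paths and removes nothing. The folder patterns (`nvidia_cudnn*.dist-info` and `nvidia/cudnn`) are an assumption about how the pip package is laid out on disk, and the `venv/Lib/site-packages` path is the Windows layout mentioned above, relative to the webui folder.

```python
from pathlib import Path


def find_cudnn_leftovers(site_packages: Path) -> list[Path]:
    """Return cuDNN-related folders found under a site-packages directory."""
    # Assumed naming: metadata folder such as nvidia_cudnn_cu11-<version>.dist-info
    found = list(site_packages.glob("nvidia_cudnn*.dist-info"))
    # Assumed location of the DLLs shipped by the pip package
    cudnn_dir = site_packages / "nvidia" / "cudnn"
    if cudnn_dir.is_dir():
        found.append(cudnn_dir)
    return sorted(found)


# Run from the webui folder; prints nothing if no leftovers are found.
for path in find_cudnn_leftovers(Path("venv") / "Lib" / "site-packages"):
    print("leftover:", path)
```

If this prints nothing, cuDNN is probably not what is shadowing the TensorRT DLLs in that venv.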

Jonseed commented 11 months ago

Ok, so I removed medvram, and that did seem to be causing the problem with the tensors being on the same device. But now I get a bunch of other errors when trying to export default engine:

{'sample': [(1, 4, 64, 64), (2, 4, 64, 64), (8, 4, 96, 96)], 'timesteps': [(1,), (2,), (8,)], 'encoder_hidden_states': [(1, 77, 768), (2, 77, 768), (8, 154, 768)]}
Disabling attention optimization
D:\repos\stable-diffusion-webui\venv\lib\site-packages\einops\einops.py:314: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  known = {axis for axis in composite_axis if axis_name2known_length[axis] != _unknown_axis_length}
D:\repos\stable-diffusion-webui\venv\lib\site-packages\einops\einops.py:315: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  unknown = {axis for axis in composite_axis if axis_name2known_length[axis] == _unknown_axis_length}
D:\repos\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\modules\diffusionmodules\openaimodel.py:158: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  assert x.shape[1] == self.channels
D:\repos\stable-diffusion-webui\modules\sd_hijack_unet.py:26: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if a.shape[-2:] != b.shape[-2:]:
D:\repos\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\modules\diffusionmodules\openaimodel.py:109: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  assert x.shape[1] == self.channels
ERROR:asyncio:Exception in callback H11Protocol.timeout_keep_alive_handler()
handle: <TimerHandle when=7159.015 H11Protocol.timeout_keep_alive_handler()>
Traceback (most recent call last):
  File "D:\repos\stable-diffusion-webui\venv\lib\site-packages\h11\_state.py", line 249, in _fire_event_triggered_transitions
    new_state = EVENT_TRIGGERED_TRANSITIONS[role][state][event_type]
KeyError: <class 'h11._events.ConnectionClosed'>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\X\AppData\Local\Programs\Python\Python310\lib\asyncio\events.py", line 80, in _run
    self._context.run(self._callback, *self._args)
  File "D:\repos\stable-diffusion-webui\venv\lib\site-packages\uvicorn\protocols\http\h11_impl.py", line 383, in timeout_keep_alive_handler
    self.conn.send(event)
  File "D:\repos\stable-diffusion-webui\venv\lib\site-packages\h11\_connection.py", line 468, in send
    data_list = self.send_with_data_passthrough(event)
  File "D:\repos\stable-diffusion-webui\venv\lib\site-packages\h11\_connection.py", line 493, in send_with_data_passthrough
    self._process_event(self.our_role, event)
  File "D:\repos\stable-diffusion-webui\venv\lib\site-packages\h11\_connection.py", line 242, in _process_event
    self._cstate.process_event(role, type(event), server_switch_event)
  File "D:\repos\stable-diffusion-webui\venv\lib\site-packages\h11\_state.py", line 238, in process_event
    self._fire_event_triggered_transitions(role, event_type)
  File "D:\repos\stable-diffusion-webui\venv\lib\site-packages\h11\_state.py", line 251, in _fire_event_triggered_transitions
    raise LocalProtocolError(
h11._util.LocalProtocolError: can't handle event type ConnectionClosed when role=SERVER and state=SEND_RESPONSE
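The dictionary printed at the top of the log above looks like a TensorRT shape profile. As a rough reading aid, the sketch below treats the three entries per input as minimum, optimal and maximum shapes, and converts latent sizes to pixels with a factor of 8. Both points are assumptions about how the extension builds the profile, not something the log states.

```python
profile = {
    "sample": [(1, 4, 64, 64), (2, 4, 64, 64), (8, 4, 96, 96)],
    "timesteps": [(1,), (2,), (8,)],
    "encoder_hidden_states": [(1, 77, 768), (2, 77, 768), (8, 154, 768)],
}

LATENT_SCALE = 8  # assumption: one latent cell covers 8x8 pixels


def describe(profile):
    """Summarise each profile entry as batch, pixel size and token count."""
    summary = {}
    labels = ("min", "opt", "max")  # assumed order of the three entries
    pairs = zip(labels, profile["sample"], profile["encoder_hidden_states"])
    for label, sample, text in pairs:
        batch, _channels, height, width = sample
        summary[label] = {
            "batch": batch,
            "height_px": height * LATENT_SCALE,
            "width_px": width * LATENT_SCALE,
            "prompt_tokens": text[1],
        }
    return summary


for label, info in describe(profile).items():
    print(label, info)
```

Read this way, the engine would cover images from 512x512 up to 768x768 and prompts of up to 154 tokens.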

I thought it might be the model I had selected, so I switched to the SD 1.5 model, and then got this error:

Loading model v1-5-pruned-emaonly.safetensors [6ce0161689] (2 out of 3)
Loading weights [6ce0161689] from D:\repos\stable-diffusion-webui\models\Stable-diffusion\v1-5-pruned-emaonly.safetensors
Creating model from config: D:\repos\stable-diffusion-webui\configs\v1-inference.yaml
============= Diagnostic Run torch.onnx.export version 2.0.1+cu118 =============
verbose: False, log level: Level.ERROR
======================= 0 NONE 0 NOTE 0 WARNING 0 ERROR ========================

[W] 'colored' module is not installed, will not use colors when logging. To enable colors, please install the 'colored' module: python3 -m pip install colored
[E] ONNX-Runtime is not installed, so constant folding may be suboptimal or not work at all.
    Consider installing ONNX-Runtime: D:\repos\stable-diffusion-webui\venv\Scripts\python.exe -m pip install onnxruntime
[I] Folding Constants | Pass 1
[!] Module: 'onnxruntime.tools.symbolic_shape_infer' is required but could not be imported.
    Note: Error was: No module named 'onnxruntime'
    You can set POLYGRAPHY_AUTOINSTALL_DEPS=1 in your environment variables to allow Polygraphy to automatically install missing modules.
[W] Falling back to `onnx.shape_inference` because `onnxruntime.tools.symbolic_shape_infer` either could not be loaded or did not run successfully.
    Note that using ONNX-Runtime for shape inference may be faster and require less memory.
    Consider installing ONNX-Runtime or setting POLYGRAPHY_AUTOINSTALL_DEPS=1 in your environment variables to allow Polygraphy to do so automatically.
[W] Attempting to run shape inference on a large model (1641.0 MiB). This may require a large amount of memory.
    If memory consumption becomes too high, the process may be killed. You may want to try disabling shape inference in that case.
[I]     Total Nodes | Original:  8992, After Folding:  6216 |  2776 Nodes Folded
[I] Folding Constants | Pass 2
[W] colored module is not installed, will not use colors when logging. To enable colors, please install the colored module: python3 -m pip install colored
[W] Inference failed. You may want to try enabling partitioning to see better results. Note: Error was:
No module named 'onnxruntime'
[!] Module: 'onnxruntime.tools.symbolic_shape_infer' is required but could not be imported.
    Note: Error was: No module named 'onnxruntime'
    You can set POLYGRAPHY_AUTOINSTALL_DEPS=1 in your environment variables to allow Polygraphy to automatically install missing modules.
[W] Attempting to run shape inference on a large model (1642.0 MiB). This may require a large amount of memory.
    If memory consumption becomes too high, the process may be killed. You may want to try disabling shape inference in that case.
[I]     Total Nodes | Original:  6216, After Folding:  6152 |    64 Nodes Folded
[I] Folding Constants | Pass 3
[W] colored module is not installed, will not use colors when logging. To enable colors, please install the colored module: python3 -m pip install colored
[W] Inference failed. You may want to try enabling partitioning to see better results. Note: Error was:
No module named 'onnxruntime'
[!] Module: 'onnxruntime.tools.symbolic_shape_infer' is required but could not be imported.
    Note: Error was: No module named 'onnxruntime'
    You can set POLYGRAPHY_AUTOINSTALL_DEPS=1 in your environment variables to allow Polygraphy to automatically install missing modules.
[I]     Total Nodes | Original:  6152, After Folding:  6152 |     0 Nodes Folded
*** API error: POST: http://127.0.0.1:7860/api/predict {'error': 'LocalProtocolError', 'detail': '', 'body': '', 'errors': "Can't send data when our state is ERROR"}
    Traceback (most recent call last):
      File "D:\repos\stable-diffusion-webui\venv\lib\site-packages\starlette\middleware\errors.py", line 162, in __call__
        await self.app(scope, receive, _send)
      File "D:\repos\stable-diffusion-webui\venv\lib\site-packages\starlette\middleware\base.py", line 109, in __call__
        await response(scope, receive, send)
      File "D:\repos\stable-diffusion-webui\venv\lib\site-packages\starlette\responses.py", line 270, in __call__
        async with anyio.create_task_group() as task_group:
      File "D:\repos\stable-diffusion-webui\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 662, in __aexit__
        raise exceptions[0]
      File "D:\repos\stable-diffusion-webui\venv\lib\site-packages\starlette\responses.py", line 273, in wrap
        await func()
      File "D:\repos\stable-diffusion-webui\venv\lib\site-packages\starlette\middleware\base.py", line 134, in stream_response
        return await super().stream_response(send)
      File "D:\repos\stable-diffusion-webui\venv\lib\site-packages\starlette\responses.py", line 255, in stream_response
        await send(
      File "D:\repos\stable-diffusion-webui\venv\lib\site-packages\starlette\middleware\errors.py", line 159, in _send
        await send(message)
      File "D:\repos\stable-diffusion-webui\venv\lib\site-packages\uvicorn\protocols\http\h11_impl.py", line 512, in send
        output = self.conn.send(event)
      File "D:\repos\stable-diffusion-webui\venv\lib\site-packages\h11\_connection.py", line 468, in send
        data_list = self.send_with_data_passthrough(event)
      File "D:\repos\stable-diffusion-webui\venv\lib\site-packages\h11\_connection.py", line 483, in send_with_data_passthrough
        raise LocalProtocolError("Can't send data when our state is ERROR")
    h11._util.LocalProtocolError: Can't send data when our state is ERROR

---
ERROR:    Exception in ASGI application
Traceback (most recent call last):
  File "D:\repos\stable-diffusion-webui\venv\lib\site-packages\uvicorn\protocols\http\h11_impl.py", line 428, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
  File "D:\repos\stable-diffusion-webui\venv\lib\site-packages\uvicorn\middleware\proxy_headers.py", line 78, in __call__
    return await self.app(scope, receive, send)
  File "D:\repos\stable-diffusion-webui\venv\lib\site-packages\fastapi\applications.py", line 273, in __call__
    await super().__call__(scope, receive, send)
  File "D:\repos\stable-diffusion-webui\venv\lib\site-packages\starlette\applications.py", line 122, in __call__
    await self.middleware_stack(scope, receive, send)
  File "D:\repos\stable-diffusion-webui\venv\lib\site-packages\starlette\middleware\errors.py", line 184, in __call__
    raise exc
  File "D:\repos\stable-diffusion-webui\venv\lib\site-packages\starlette\middleware\errors.py", line 162, in __call__
    await self.app(scope, receive, _send)
  File "D:\repos\stable-diffusion-webui\venv\lib\site-packages\starlette\middleware\base.py", line 109, in __call__
    await response(scope, receive, send)
  File "D:\repos\stable-diffusion-webui\venv\lib\site-packages\starlette\responses.py", line 270, in __call__
    async with anyio.create_task_group() as task_group:
  File "D:\repos\stable-diffusion-webui\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 662, in __aexit__
    raise exceptions[0]
  File "D:\repos\stable-diffusion-webui\venv\lib\site-packages\starlette\responses.py", line 273, in wrap
    await func()
  File "D:\repos\stable-diffusion-webui\venv\lib\site-packages\starlette\middleware\base.py", line 134, in stream_response
    return await super().stream_response(send)
  File "D:\repos\stable-diffusion-webui\venv\lib\site-packages\starlette\responses.py", line 255, in stream_response
    await send(
  File "D:\repos\stable-diffusion-webui\venv\lib\site-packages\starlette\middleware\errors.py", line 159, in _send
    await send(message)
  File "D:\repos\stable-diffusion-webui\venv\lib\site-packages\uvicorn\protocols\http\h11_impl.py", line 512, in send
    output = self.conn.send(event)
  File "D:\repos\stable-diffusion-webui\venv\lib\site-packages\h11\_connection.py", line 468, in send
    data_list = self.send_with_data_passthrough(event)
  File "D:\repos\stable-diffusion-webui\venv\lib\site-packages\h11\_connection.py", line 483, in send_with_data_passthrough
    raise LocalProtocolError("Can't send data when our state is ERROR")
h11._util.LocalProtocolError: Can't send data when our state is ERROR
'AsyncRequest' object has no attribute '_json_response_data'
Applying attention optimization: sdp-no-mem... done.
Model loaded in 112.6s (load weights from disk: 15.3s, create model: 0.5s, apply weights to model: 96.5s).

ByblosHex commented 11 months ago

For my part, the python -m pip uninstall -y nvidia-cudnn-cu11 didn't seem to work as the extension was "not installed". So instead I went to venv\Lib\site-packages and I removed the cudnn.dist-info & the cudnn folder in the nvidia folder. It seems to be working fine. At least, I can start without errors, and I can start generating the engine. It seems to be generating without issues now.

Have you exported the engine and enabled it? See here: https://github.com/NVIDIA/Stable-Diffusion-WebUI-TensorRT#how-to-use

Yes, I was currently following the instructions actually, to see if "seems to be working" could be converted into "is working". I can confirm it works flawlessly! I was able to export an engine and confirm that it increases my generation speed by ~40%!

In summary, for me the issue was that during the installation, somehow it didn't uninstall cudnn, and cudnn was taking priority over the dlls in tensorrt. After removing cudnn manually, my problem was solved.

What did you remove?

Jonseed commented 11 months ago

I tried stopping and restarting the webui server. I made sure I am on the SD 1.5 model. If I try to export the default engine I get the same full error that I did above, ending in 'AsyncRequest' object has no attribute '_json_response_data'. I don't think it was the model I was on, as it is giving the same error with SD 1.5.

andreacostigliolo commented 11 months ago

So now I can export the model (with some out-of-memory errors on 12GB), but when I try to use it I get this error again:
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument mat1 in method wrapper_CUDA_addmm)

I got this error before when exporting with medvram, but I no longer have medvram in my startup args.

Jonseed commented 11 months ago

I thought I might try uninstalling cudnn manually as @Altrue did, but when I go to venv\Lib\site-packages I don't see the cudnn.dist-info folder or a cudnn folder in the nvidia folder. Here is what I have in the nvidia folder:

Screenshot 2023-10-17 120928

Jonseed commented 11 months ago

It seems my error now might be this: No module named 'onnxruntime'

Jonseed commented 11 months ago

Should I try manually installing onnxruntime in the venv per python.exe -m pip install onnxruntime?
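A quick way to confirm what is missing before installing anything is to ask the venv's own interpreter whether it can find the modules the log complains about. This is a minimal sketch using only the standard library. The helper name `missing_modules` is made up, and it only handles top-level module names.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# The two modules the Polygraphy log reports as not installed.
print(missing_modules(["onnxruntime", "colored"]))
```

Run it with the same venv\Scripts\python.exe that appears in the log, so that it checks the environment the webui actually uses. Anything it prints could then be installed with that interpreter's `-m pip install`.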