NVIDIA / Stable-Diffusion-WebUI-TensorRT

TensorRT Extension for Stable Diffusion Web UI

Error installing in Automatic1111 #12

Closed: Jonseed closed this issue 1 year ago

Jonseed commented 1 year ago

Here is the error in the console:

Error running install.py for extension D:\repos\stable-diffusion-webui\extensions\Stable-Diffusion-WebUI-TensorRT.
*** Command: "d:\repos\stable-diffusion-webui\venv\Scripts\python.exe" "D:\repos\stable-diffusion-webui\extensions\Stable-Diffusion-WebUI-TensorRT\install.py"
*** Error code: 1
*** stdout: Looking in indexes: https://pypi.org/simple, https://pypi.nvidia.com
*** Collecting tensorrt==9.0.1.post11.dev4
***   Downloading https://pypi.nvidia.com/tensorrt/tensorrt-9.0.1.post11.dev4.tar.gz (18 kB)
***   Preparing metadata (setup.py): started
***   Preparing metadata (setup.py): finished with status 'done'
*** Building wheels for collected packages: tensorrt
***   Building wheel for tensorrt (setup.py): started
***   Building wheel for tensorrt (setup.py): still running...
***   Building wheel for tensorrt (setup.py): finished with status 'done'
***   Created wheel for tensorrt: filename=tensorrt-9.0.1.post11.dev4-py2.py3-none-any.whl size=17618 sha256=e059e2b3b7dd7ecf4c805ab6f2b4589ddb43b0959bfa66178fa0d01559ba1ef8
***   Stored in directory: c:\users\X\appdata\local\pip\cache\wheels\d1\6d\71\f679d0d23a60523f9a05445e269bfd0bcd1c5272097fa931df
*** Successfully built tensorrt
*** Installing collected packages: tensorrt
*** Successfully installed tensorrt-9.0.1.post11.dev4
*** Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
*** Collecting polygraphy
***   Downloading polygraphy-0.49.0-py2.py3-none-any.whl (327 kB)
***      -------------------------------------- 327.9/327.9 kB 4.1 MB/s eta 0:00:00
*** Installing collected packages: polygraphy
*** Successfully installed polygraphy-0.49.0
*** Collecting protobuf==3.20.2
***   Downloading protobuf-3.20.2-cp310-cp310-win_amd64.whl (904 kB)
***      -------------------------------------- 904.0/904.0 kB 4.4 MB/s eta 0:00:00
*** Installing collected packages: protobuf
***   Attempting uninstall: protobuf
***     Found existing installation: protobuf 3.20.0
***     Uninstalling protobuf-3.20.0:
***       Successfully uninstalled protobuf-3.20.0
*** TensorRT is not installed! Installing...
*** Installing nvidia-cudnn-cu11
*** Installing tensorrt
*** removing nvidia-cudnn-cu11
*** Polygraphy is not installed! Installing...
*** Installing polygraphy
*** GS is not installed! Installing...
*** Installing protobuf
***
*** stderr: A matching Triton is not available, some optimizations will not be enabled.
*** Error caught was: No module named 'triton'
*** d:\repos\stable-diffusion-webui\venv\lib\site-packages\pytorch_lightning\utilities\distributed.py:258: LightningDeprecationWarning: `pytorch_lightning.utilities.distributed.rank_zero_only` has been deprecated in v1.8.1 and will be removed in v2.0.0. You can import it from `pytorch_lightning.utilities` instead.
***   rank_zero_deprecation(
***
*** [notice] A new release of pip available: 22.2.1 -> 23.2.1
*** [notice] To update, run: d:\repos\stable-diffusion-webui\venv\Scripts\python.exe -m pip install --upgrade pip
***
*** [notice] A new release of pip available: 22.2.1 -> 23.2.1
*** [notice] To update, run: d:\repos\stable-diffusion-webui\venv\Scripts\python.exe -m pip install --upgrade pip
*** ERROR: Could not install packages due to an OSError: [WinError 5] Access is denied: 'D:\\repos\\stable-diffusion-webui\\venv\\Lib\\site-packages\\google\\~rotobuf\\internal\\_api_implementation.cp310-win_amd64.pyd'
*** Check the permissions.
***
***
*** [notice] A new release of pip available: 22.2.1 -> 23.2.1
*** [notice] To update, run: d:\repos\stable-diffusion-webui\venv\Scripts\python.exe -m pip install --upgrade pip
*** Traceback (most recent call last):
***   File "D:\repos\stable-diffusion-webui\extensions\Stable-Diffusion-WebUI-TensorRT\install.py", line 30, in <module>***     install()
***   File "D:\repos\stable-diffusion-webui\extensions\Stable-Diffusion-WebUI-TensorRT\install.py", line 19, in install
***     launch.run_pip("install protobuf==3.20.2", "protobuf", live=True)
***   File "d:\repos\stable-diffusion-webui\modules\launch_utils.py", line 138, in run_pip
***     return run(f'"{python}" -m pip {command} --prefer-binary{index_url_line}', desc=f"Installing {desc}", errdesc=f"Couldn't install {desc}", live=live)
***   File "d:\repos\stable-diffusion-webui\modules\launch_utils.py", line 115, in run
***     raise RuntimeError("\n".join(error_bits))
*** RuntimeError: Couldn't install protobuf.
*** Command: "d:\repos\stable-diffusion-webui\venv\Scripts\python.exe" -m pip install protobuf==3.20.2 --prefer-binary
*** Error code: 1

And then when I restarted the webui, I got these popups:

(four screenshots of the error popups, 2023-10-17)

What does that mean?

aria1th commented 1 year ago

onnxruntime-gpu, AFAIK, but it should be added in install.py if that's the case.

Jonseed commented 1 year ago

I already have onnx and onnx-graphsurgeon installed. Do I need onnxruntime or onnxruntime-gpu?

Jonseed commented 1 year ago

After all that, I just restarted the webui, and tried exporting the default engine again, and now it seems to be exporting... I did not install onnxruntime or onnxruntime-gpu...

It still output this error to the console, even though it continued with exporting this time.

ERROR:asyncio:Exception in callback H11Protocol.timeout_keep_alive_handler()
handle: <TimerHandle when=8572.921 H11Protocol.timeout_keep_alive_handler()>
Traceback (most recent call last):
  File "d:\repos\stable-diffusion-webui\venv\lib\site-packages\h11\_state.py", line 249, in _fire_event_triggered_transitions
    new_state = EVENT_TRIGGERED_TRANSITIONS[role][state][event_type]
KeyError: <class 'h11._events.ConnectionClosed'>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\X\AppData\Local\Programs\Python\Python310\lib\asyncio\events.py", line 80, in _run
    self._context.run(self._callback, *self._args)
  File "d:\repos\stable-diffusion-webui\venv\lib\site-packages\uvicorn\protocols\http\h11_impl.py", line 383, in timeout_keep_alive_handler
    self.conn.send(event)
  File "d:\repos\stable-diffusion-webui\venv\lib\site-packages\h11\_connection.py", line 468, in send
    data_list = self.send_with_data_passthrough(event)
  File "d:\repos\stable-diffusion-webui\venv\lib\site-packages\h11\_connection.py", line 493, in send_with_data_passthrough
    self._process_event(self.our_role, event)
  File "d:\repos\stable-diffusion-webui\venv\lib\site-packages\h11\_connection.py", line 242, in _process_event
    self._cstate.process_event(role, type(event), server_switch_event)
  File "d:\repos\stable-diffusion-webui\venv\lib\site-packages\h11\_state.py", line 238, in process_event
    self._fire_event_triggered_transitions(role, event_type)
  File "d:\repos\stable-diffusion-webui\venv\lib\site-packages\h11\_state.py", line 251, in _fire_event_triggered_transitions
    raise LocalProtocolError(
h11._util.LocalProtocolError: can't handle event type ConnectionClosed when role=SERVER and state=SEND_RESPONSE

Maybe this error is from another extension I have active that is using the API (the Photoshop plugin)?

4lt3r3go commented 1 year ago

Am I the only one who gets this error when generating images with an already-selected engine? (screenshot)

I'm getting the same errors. Tried deleting the venv folder and the extensions folder, and getting rid of any arguments in the web batch file. Now the tab for TensorRT doesn't even appear.

Same. The tab disappeared; no matter what I do (reinstall, upgrade requirements), nothing works.

Terroosh commented 1 year ago

I was having the "procedure entry point" error popups as well, but deleting the cudnn folder in venv\Lib\site-packages\nvidia fixed it. I'm able to generate images now.
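
If you'd rather have pip remove it than delete the folder by hand, the equivalent command from the webui's venv would be:

    python -m pip uninstall -y nvidia-cudnn-cu11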

Tonic0 commented 1 year ago

Am I the only one who gets this error when generating images with an already-selected engine? (screenshot)

I'm getting the same errors. Tried deleting the venv folder and the extensions folder, and getting rid of any arguments in the web batch file. Now the tab for TensorRT doesn't even appear.

Same issue here. Removed the TensorRT extension, deleted the venv folder. Started the WebUI, the venv reinstalled, reinstalled the TensorRT extension, and now it doesn't appear in the UI anymore. What's the procedure to fully uninstall/reinstall TensorRT without deleting the A1111 WebUI completely?

Error:

Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Requirement already satisfied: onnx-graphsurgeon in d:\scratch\stable-diffusion-webui\venv\lib\site-packages (0.3.27)
Requirement already satisfied: numpy in d:\scratch\stable-diffusion-webui\venv\lib\site-packages (from onnx-graphsurgeon) (1.23.5)
Requirement already satisfied: onnx in d:\scratch\stable-diffusion-webui\venv\lib\site-packages (from onnx-graphsurgeon) (1.14.1)
Requirement already satisfied: protobuf>=3.20.2 in d:\scratch\stable-diffusion-webui\venv\lib\site-packages (from onnx->onnx-graphsurgeon) (3.20.2)
Requirement already satisfied: typing-extensions>=3.6.2.1 in d:\scratch\stable-diffusion-webui\venv\lib\site-packages (from onnx->onnx-graphsurgeon) (4.8.0)
GS is not installed! Installing...
Installing protobuf
Installing onnx-graphsurgeon
UI Config not initialized
Launching Web UI with arguments:
No module 'xformers'. Proceeding without it.
*** Error loading script: trt.py
    Traceback (most recent call last):
      File "D:\Scratch\stable-diffusion-webui\modules\scripts.py", line 382, in load_scripts
        script_module = script_loading.load_module(scriptfile.path)
      File "D:\Scratch\stable-diffusion-webui\modules\script_loading.py", line 10, in load_module
        module_spec.loader.exec_module(module)
      File "<frozen importlib._bootstrap_external>", line 883, in exec_module
      File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
      File "D:\Scratch\stable-diffusion-webui\extensions\Stable-Diffusion-WebUI-TensorRT\scripts\trt.py", line 10, in <module>
        import ui_trt
      File "D:\Scratch\stable-diffusion-webui\extensions\Stable-Diffusion-WebUI-TensorRT\ui_trt.py", line 10, in <module>
        from exporter import export_onnx, export_trt
      File "D:\Scratch\stable-diffusion-webui\extensions\Stable-Diffusion-WebUI-TensorRT\exporter.py", line 10, in <module>
        from utilities import Engine
      File "D:\Scratch\stable-diffusion-webui\extensions\Stable-Diffusion-WebUI-TensorRT\utilities.py", line 32, in <module>
        import tensorrt as trt
      File "D:\Scratch\stable-diffusion-webui\venv\lib\site-packages\tensorrt\__init__.py", line 18, in <module>
        from tensorrt_bindings import *
    ModuleNotFoundError: No module named 'tensorrt_bindings'

Aeo2000 commented 1 year ago

Am I the only one who gets this error when generating images with an already-selected engine? (screenshot)

I'm getting the same errors. Tried deleting the venv folder and the extensions folder, and getting rid of any arguments in the web batch file. Now the tab for TensorRT doesn't even appear.

Same. The tab disappeared; no matter what I do (reinstall, upgrade requirements), nothing works.

Same, I've been trying to fix this for the past hour. Anyone find any fixes yet?

4lt3r3go commented 1 year ago

I also found that if I remove the models I converted, then the extension disappears from the UI. Restoring those deleted files makes the extension appear again... wtf lol

racerx2oo3 commented 1 year ago

Am I the only one who gets this error when generating images with an already-selected engine? (screenshot)

I'm getting the same errors. Tried deleting the venv folder and the extensions folder, and getting rid of any arguments in the web batch file. Now the tab for TensorRT doesn't even appear.

Same issue here. Removed the TensorRT extension, deleted the venv folder. Started the WebUI, the venv reinstalled, reinstalled the TensorRT extension, and now it doesn't appear in the UI anymore. What's the procedure to fully uninstall/reinstall TensorRT without deleting the A1111 WebUI completely?

Error:

[quoted install log and tensorrt_bindings traceback, identical to the comment above]

Deleting the stable-diffusion-webui-tensorrt extension in the extensions folder should be all you need to do.

Jonseed commented 1 year ago

So it finally exported the default engine for SD 1.5, and I see it in models\Unet-trt\, but now SD Unet doesn't show it in the dropdown menu. It only shows Automatic or None... Refreshing that list does nothing. I have tried restarting the UI. I will now restart the webui server to see if that helps.

racerx2oo3 commented 1 year ago

I also found that if I remove the models I converted, then the extension disappears from the UI. Restoring those deleted files makes the extension appear again... wtf lol

You removed the generated TRT model from the Unet-trt folder? If so, you need to also remove the reference to that model in the model.json file. Or, if you only have one model generated, just delete the model file and the .json file both.

racerx2oo3 commented 1 year ago

So it finally exported the default engine for SD 1.5, and I see it in models\Unet-trt\, but now SD Unet doesn't show it in the dropdown menu. It only shows Automatic or None... Refreshing that list does nothing. I have tried restarting the UI. I will now restart the webui server to see if that helps.

Do you have a model file in the models/Unet-trt folder? If so, it might be that the model.json file didn't get properly written. I had that happen at one point.

If you open the model.json file and it is empty, then that is the issue.

Jonseed commented 1 year ago

Yes, I have a model file in the models/Unet-trt folder, but the model.json file is empty, just {}. How do I regenerate the model.json file? Do I have to export the default engine again?

racerx2oo3 commented 1 year ago

Yes, I have a model file in the models/Unet-trt folder, but the model.json file is empty, just {}. How do I regenerate the model.json file? Do I have to export the default engine again?

Normally you'd need to rebuild the profile; I don't think there's a clear way to regenerate the model.json file.

You can try deleting everything in the model.json file and pasting in this:

{
  "cc89": {
    "v1-5-pruned-emaonly": [
      {
        "filepath": "v1-5-pruned-emaonly_d7049739_cc89_sample=1x4x64x64+2x4x64x64+8x4x96x96-timesteps=1+2+8-encoder_hidden_states=1x77x768+2x77x768+8x154x768.trt",
        "config": {
          "profile": {
            "sample": [[1, 4, 64, 64], [2, 4, 64, 64], [8, 4, 96, 96]],
            "timesteps": [[1], [2], [8]],
            "encoder_hidden_states": [[1, 77, 768], [2, 77, 768], [8, 154, 768]]
          },
          "static_shapes": false,
          "fp32": false,
          "inpaint": false,
          "refit": true,
          "lora": false,
          "vram": 0,
          "unet_hidden_dim": 4
        }
      }
    ]
  }
}
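
If you want to sanity-check the file before restarting, a small script like this (my own sketch, assuming the default models/Unet-trt location and the structure above) will show whether it parses and whether each referenced engine file actually exists:

    import json
    from pathlib import Path

    # Adjust to your install; assumes the default A1111 layout.
    model_json = Path("models/Unet-trt/model.json")

    data = json.loads(model_json.read_text())
    if not data:
        print("model.json is empty ({}) - no engines will show in the SD Unet dropdown")
    for cc, models in data.items():            # compute capability, e.g. "cc89"
        for name, entries in models.items():   # checkpoint name
            for entry in entries:
                trt_file = model_json.parent / entry["filepath"]
                status = "OK" if trt_file.exists() else "MISSING"
                print(f"[{status}] {cc} / {name}: {entry['filepath']}")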

Jonseed commented 1 year ago

Hmm, I deleted everything in the model.json file and pasted that in, and still nothing in the dropdown. I restarted the UI, still nothing. Will now restart the webui server. After restarting and refreshing the dropdown menu, still nothing. It looks like I may have to export the default engine again? Should I delete everything in the models/Unet-trt folder? What about the files in the models/Unet-onnx folder?

4lt3r3go commented 1 year ago

(screenshot)

What is this now?

racerx2oo3 commented 1 year ago

(screenshot)

What is this now?

Looks like you specified settings in your image generation that aren't supported by an engine you have built. What are you specifying for your image generation parameters?

ByblosHex commented 1 year ago

I have two models in the folder, but the .json is empty as well.

4lt3r3go commented 1 year ago

(screenshot) What is this now?

Looks like you specified settings in your image generation that aren't supported by an engine you have built. What are you specifying for your image generation parameters?

This sounds more like the model I converted, which was a custom size (512x768, batch 1), wasn't working as expected. So now I'm redoing it using default settings. I already used TensorRT a lot in the past, and I remember there was some trial and error in converting models... maybe that's the case here.

Jonseed commented 1 year ago

Restarted the webui server, and got this error:

*** Error loading script: trt.py
    Traceback (most recent call last):
      File "D:\repos\stable-diffusion-webui\modules\scripts.py", line 382, in load_scripts
        script_module = script_loading.load_module(scriptfile.path)
      File "D:\repos\stable-diffusion-webui\modules\script_loading.py", line 10, in load_module
        module_spec.loader.exec_module(module)
      File "<frozen importlib._bootstrap_external>", line 883, in exec_module
      File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
      File "D:\repos\stable-diffusion-webui\extensions\Stable-Diffusion-WebUI-TensorRT\scripts\trt.py", line 10, in <module>
        import ui_trt
      File "D:\repos\stable-diffusion-webui\extensions\Stable-Diffusion-WebUI-TensorRT\ui_trt.py", line 16, in <module>
        from model_manager import modelmanager, cc_major, TRT_MODEL_DIR
      File "D:\repos\stable-diffusion-webui\extensions\Stable-Diffusion-WebUI-TensorRT\model_manager.py", line 219, in <module>
        modelmanager = ModelManager()
      File "D:\repos\stable-diffusion-webui\extensions\Stable-Diffusion-WebUI-TensorRT\model_manager.py", line 35, in __init__
        self.update()
      File "D:\repos\stable-diffusion-webui\extensions\Stable-Diffusion-WebUI-TensorRT\model_manager.py", line 67, in update
        for base_model, models in base_models.items():
    RuntimeError: dictionary changed size during iteration

Could this be because we tried pasting the values into model.json? It looks like everything in the model.json file got deleted when I restarted the webui server...

ByblosHex commented 1 year ago

In my nvidia folder there is a blank __init__.py file.

racerx2oo3 commented 1 year ago

Restarted the webui server, and got this error:

[quoted trt.py traceback, identical to the comment above]

Could this be because we tried pasting the values into model.json? It looks like everything in the model.json file got deleted when I restarted the webui server...

Could be, I wasn't really sure if that would work or not.

Jonseed commented 1 year ago

Ok, I'm going to delete everything in the models/Unet-trt and models/Unet-onnx folders, reboot webui server, and try exporting default engine again. Hopefully this time the model.json file is created correctly.

Jonseed commented 1 year ago

Exporting now... It is still reporting that ONNX-Runtime is not installed, and it looks like Polygraphy uses it, and it is giving a bunch of errors because of that, but the export continues. Do I want to manually install that module?

racerx2oo3 commented 1 year ago

Exporting now... It is still reporting that ONNX-Runtime is not installed, and it looks like Polygraphy uses it, and it is giving a bunch of errors because of that, but the export continues. Do I want to manually install that module?

No, it's fine. That error is "normal".

Jonseed commented 1 year ago

Export finished, and it was properly added to the SD Unet dropdown. I can now use the TensorRT model, and it is indeed much faster on my 3060.

So it seems my errors were mostly caused by --medvram (which I still have not re-enabled), possibly conflicting with another extension (the Photoshop plugin) that uses the --api flag. I had to manually uninstall cudnn via python -m pip uninstall -y nvidia-cudnn-cu11.

After all that, it seems to be working. I will try to re-enable the --medvram and --api command-line args and see if it still works.

andreacostigliolo commented 1 year ago

With 12 GB I get this during exporting:

[W] Requested amount of GPU memory (11922309120 bytes) could not be allocated. There may not be enough free memory for allocation to succeed.
[W] UNSUPPORTED_STATE: Skipping tactic 0 due to insufficient memory on requested size of 11922309120 detected for tactic 0x0000000000000000.

Jonseed commented 1 year ago

With 12 GB I get this during exporting:

[W] Requested amount of GPU memory (11922309120 bytes) could not be allocated. There may not be enough free memory for allocation to succeed.
[W] UNSUPPORTED_STATE: Skipping tactic 0 due to insufficient memory on requested size of 11922309120 detected for tactic 0x0000000000000000.

I also got that error on my 3060 with 12 GB VRAM, but the export continued, and it seems to work.

racerx2oo3 commented 1 year ago

With 12 GB I get this during exporting:

[W] Requested amount of GPU memory (11922309120 bytes) could not be allocated. There may not be enough free memory for allocation to succeed.
[W] UNSUPPORTED_STATE: Skipping tactic 0 due to insufficient memory on requested size of 11922309120 detected for tactic 0x0000000000000000.

If the engine generation continues after getting that message, it's fine. The engine-build process will try different "tactics" to build the TensorRT engine, and some use more memory than others; if a given tactic can't be used, the build should continue with a different one. If the build fails after a memory error, then the specified engine is probably running out of VRAM.
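
For context, the memory these tactics compete for is the TensorRT builder's workspace pool. This isn't something the extension exposes, but in the TensorRT Python API (8.4 and later) you would cap it roughly like this, as a sketch:

    import tensorrt as trt

    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)
    config = builder.create_builder_config()

    # Limit the scratch memory the builder may use while timing tactics.
    # Tactics needing more than this are skipped, which is what produces
    # "Skipping tactic ... insufficient memory" warnings like the one above.
    config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, 4 << 30)  # 4 GiB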

niffelheim87 commented 1 year ago

Tried this time with SD 1.5 instead of SDXL and now it's working. SD 1.5 went from 3-4 it/s to 10-11 it/s at batch size 1.

Jonseed commented 1 year ago

Ok, I re-enabled --medvram and --api, and the TensorRT model I built before seems to still work. I do get this message when I use it:

[W] CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage and speed up TensorRT initialization. See "Lazy Loading" section of CUDA documentation https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#lazy-loading

Should I enable CUDA lazy loading? How would I do that?

With --medvram re-enabled, I cannot export any new engines though. I get this error as I did before:

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument mat1 in method wrapper_CUDA_addmm)

So something with the --medvram argument is conflicting here. I'll have to remove that command-line arg whenever I want to export new engines until this is fixed.

racerx2oo3 commented 1 year ago

CUDA lazy loading can help with memory usage; it's enabled using an environment variable: SET CUDA_MODULE_LOADING=LAZY

racerx2oo3 commented 1 year ago

CUDA lazy loading can help with memory usage; it's enabled using an environment variable: SET CUDA_MODULE_LOADING=LAZY

I'm not sure exactly how much it helps in this case, but you can give it a shot.
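
On Windows with A1111, one place to set it is webui-user.bat, before the launcher is called. A sketch (keep whatever COMMANDLINE_ARGS you already use):

    @echo off
    set CUDA_MODULE_LOADING=LAZY
    set COMMANDLINE_ARGS=--medvram --api
    call webui.bat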

andreacostigliolo commented 1 year ago

Generating images with unet gives me this error : RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument mat1 in method wrapper_CUDA_addmm)

Any idea?

andreacostigliolo commented 1 year ago

If I change the VAE before generating an image, it seems to work. BTW, it's not faster :(

racerx2oo3 commented 1 year ago

Generating images with unet gives me this error : RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument mat1 in method wrapper_CUDA_addmm)

Any idea?

Are you trying to use SDXL?

Jonseed commented 1 year ago

As far as performance goes on my 3060, using SD 1.5 at 512x512 resolution, for a batch size of 1, I'm getting speed increases from about 3.9 it/s to 9 it/s, an increase of 2.3x! That's impressive.

But if I increase the batch size, something strange happens. With TRT set to 'none', the generation takes way longer now. A batch size of 4, without TRT (SD Unet set to 'none'), takes a minute and 40 seconds (1:40), whereas with a batch size of 1, generating 4 images takes only 25 seconds when TRT is off. What is going on here? Something seems wrong. Is the TRT UNET not deactivated correctly when TRT is set to 'none' for batch sizes greater than 1? With TRT active, a batch size of 4 is very fast, generating 4 images in only 8 seconds.

racerx2oo3 commented 1 year ago

As far as performance goes on my 3060, using SD 1.5 at 512x512 resolution, for a batch size of 1, I'm getting speed increases from about 3.9 it/s to 9 it/s, an increase of 2.3x! That's impressive.

But if I increase the batch size, something strange happens. With TRT set to 'none', the generation takes way longer now. A batch size of 4, without TRT (SD Unet set to 'none'), takes a minute and 40 seconds (1:40), whereas with a batch size of 1, generating 4 images takes only 25 seconds when TRT is off. What is going on here? Something seems wrong. Is the TRT UNET not deactivated correctly when TRT is set to 'none' for batch sizes greater than 1? With TRT active, a batch size of 4 is very fast, generating 4 images in only 8 seconds.

You're likely hitting the memory limits of your GPU. Your --medvram and --xformers command line arguments previously were directed at allowing you to use less VRAM for image generation. So without those active, you're probably hitting VRAM limitations that are slowing you down at larger batch sizes.

TensorRT is more memory efficient than vanilla PyTorch so it can generally handle larger batch sizes. If you want to run larger batch sizes without using TensorRT you'll likely need to re-enable your previous memory optimizations.

andreacostigliolo commented 1 year ago

Generating images with unet gives me this error : RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument mat1 in method wrapper_CUDA_addmm) Any idea?

Are you trying to use SDXL?

yes

Akyariss commented 1 year ago

Generating images with unet gives me this error : RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument mat1 in method wrapper_CUDA_addmm) Any idea?

Are you trying to use SDXL?

yes

I have the same issue with SDXL and TRT, works fine without

Jonseed commented 1 year ago

These performance numbers were with --medvram and --xformers enabled (although I'm using sdp-no-mem at the moment, not xformers).

If I disable the TensorRT extension, then my speeds are about 5 it/s with batch size 1 (4 images in 19 seconds), and 2.2 it/s with batch size of 4 (11 seconds). So even with a batch size of 1, the speeds are slower with the TensorRT extension enabled (5 it/s vs 3.9 it/s). So it seems there is a problem here...

racerx2oo3 commented 1 year ago

Generating images with unet gives me this error : RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument mat1 in method wrapper_CUDA_addmm) Any idea?

Are you trying to use SDXL?

yes

The extension supports SDXL, but it relies on functionality that hasn't been implemented in the release branch. If you need to work with SDXL, you'll need to use an Automatic1111 build from the Dev branch at the moment. Note that the Dev branch is not intended for production work and may break other things that you are currently using.

andreacostigliolo commented 1 year ago

Generating images with unet gives me this error : RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument mat1 in method wrapper_CUDA_addmm) Any idea?

Are you trying to use SDXL?

yes

I have the same issue with SDXL and TRT, works fine without

If I change the VAE a few times (enabled in the quick menu) it seems to start. Can't understand it...

alexbofa commented 1 year ago

This error happens because the files broke when downloading. Copy them from Stable-Diffusion\venv\Lib\site-packages\torch\lib to Stable-Diffusion\venv\Lib\site-packages\nvidia\cudnn\bin.
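
A script version of that copy, if you don't want to do it by hand (paths assume a default install, and the cudnn*.dll pattern is my assumption about which files matter):

    import glob, os, shutil

    venv = r"D:\repos\stable-diffusion-webui\venv"  # adjust to your install
    src = os.path.join(venv, r"Lib\site-packages\torch\lib")
    dst = os.path.join(venv, r"Lib\site-packages\nvidia\cudnn\bin")

    os.makedirs(dst, exist_ok=True)
    # Copy the cuDNN DLLs bundled with torch over the broken nvidia-cudnn ones.
    for dll in glob.glob(os.path.join(src, "cudnn*.dll")):
        print("copying", os.path.basename(dll))
        shutil.copy2(dll, dst)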

andreacostigliolo commented 1 year ago

cudnn

I don't have a cudnn folder.

Aeo2000 commented 1 year ago

The TensorRT tab still isn't appearing for me. After I uninstalled it the first time, it no longer appears. Anyone facing the same issue?

I am seeing this in the command terminal; could that have something to do with TensorRT not appearing? (screenshot)

Zw3pter commented 1 year ago

Can anyone tell me what's wrong here? I tried to reinstall it multiple times and even tried to reinstall Automatic1111 multiple times. I got it working for some odd reason on my old version, but I accidentally deleted that.

Launching Web UI with arguments: --xformers
*** Error loading script: trt.py
    Traceback (most recent call last):
      File "E:\AIsoftware\stable-diffusion-webui\modules\scripts.py", line 382, in load_scripts
        script_module = script_loading.load_module(scriptfile.path)
      File "E:\AIsoftware\stable-diffusion-webui\modules\script_loading.py", line 10, in load_module
        module_spec.loader.exec_module(module)
      File "<frozen importlib._bootstrap_external>", line 883, in exec_module
      File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
      File "E:\AIsoftware\stable-diffusion-webui\extensions\Stable-Diffusion-WebUI-TensorRT\scripts\trt.py", line 10, in <module>
        import ui_trt
      File "E:\AIsoftware\stable-diffusion-webui\extensions\Stable-Diffusion-WebUI-TensorRT\ui_trt.py", line 10, in <module>
        from exporter import export_onnx, export_trt
      File "E:\AIsoftware\stable-diffusion-webui\extensions\Stable-Diffusion-WebUI-TensorRT\exporter.py", line 10, in <module>
        from utilities import Engine
      File "E:\AIsoftware\stable-diffusion-webui\extensions\Stable-Diffusion-WebUI-TensorRT\utilities.py", line 32, in <module>
        import tensorrt as trt
      File "E:\AIsoftware\stable-diffusion-webui\venv\lib\site-packages\tensorrt\__init__.py", line 18, in <module>
        from tensorrt_bindings import *
    ModuleNotFoundError: No module named 'tensorrt_bindings'

verifiedsyckozis commented 1 year ago

I got mine to export a model thing, but I get this every time I try to use it: RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument mat1 in method wrapper_CUDA_addmm). I saw up there someone else had the same problem.

racerx2oo3 commented 1 year ago

Can anyone tell me what's wrong here? I tried to reinstall it multiple times and even tried to reinstall Automatic1111 multiple times. I got it working for some odd reason on my old version, but I accidentally deleted that.

[quoted log, identical to the comment above, ending in ModuleNotFoundError: No module named 'tensorrt_bindings']

Are you installing using the "Available" tab within Extensions? If so, that hasn't yet been updated to point to the current version. You'll want to delete the stable-diffusion-webui-tensorrt folder found in the extensions folder, and then install using the "Install from URL" option.
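
If the UI route keeps failing, the manual equivalent (assuming git is on your PATH) is cloning the repo straight into the extensions folder:

    cd stable-diffusion-webui\extensions
    git clone https://github.com/NVIDIA/Stable-Diffusion-WebUI-TensorRT.git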

racerx2oo3 commented 1 year ago

I got mine to export a model thing, but I get this every time I try to use it: RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument mat1 in method wrapper_CUDA_addmm). I saw up there someone else had the same problem.

If you are trying to use SDXL, then this is currently expected. The extension supports SDXL, but it relies on functionality that hasn't been implemented in the release branch of Automatic1111. If you need to work with SDXL, you'll need to use an Automatic1111 build from the Dev branch at the moment. Note that the Dev branch is not intended for production work and may break other things that you are currently using.