lshqqytiger / stable-diffusion-webui-amdgpu

Stable Diffusion web UI
GNU Affero General Public License v3.0

it/s drops by 7-8 after restarting the web UI #460

Closed neonarc4 closed 6 months ago

neonarc4 commented 6 months ago

What happened?

Once the model is converted via ONNX it gives good speed, but when we reopen the web UI through webui-user.bat the it/s drops by about 8, from 27.

Steps to reproduce the problem

  1. Convert the model to ONNX.
  2. Generate output; the performance is correct, as mentioned above.
  3. Close the web UI and restart it shortly afterwards; the it/s is then lower.
  4. Do we need to convert the model every time in order to get that speed?

What should have happened?

Why is it losing it/s?

What browsers do you use to access the UI?

Mozilla Firefox, Google Chrome, Brave, Other

Sysinfo

latest

Console logs

htningDeprecationWarning: `pytorch_lightning.utilities.distributed.rank_zero_only` has been deprecated in v1.8.1 and will be removed in v2.0.0. You can import it from `pytorch_lightning.utilities` instead.
  rank_zero_deprecation(
Launching Web UI with arguments: --use-cpu-torch --update-all-extensions --opt-sub-quad-attention --opt-split-attention --precision autocast
Warning: caught exception 'Torch not compiled with CUDA enabled', memory monitor disabled
ONNX: version=1.17.3 provider=DmlExecutionProvider, available=['DmlExecutionProvider', 'CPUExecutionProvider']
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
Startup time: 13.4s (prepare environment: 19.5s, initialize shared: 1.2s, load scripts: 1.0s, create ui: 0.8s, gradio launch: 0.3s).
X:\Nebula\dml\stable-diffusion-webui-directml\venv\lib\site-packages\huggingface_hub\file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
You have disabled the safety checker for <class 'diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline'> by passing `safety_checker=None`. Ensure that you abide to the conditions of the Stable Diffusion license and do not expose unfiltered results in services or applications open to the public. Both the diffusers team and Hugging Face strongly recommend to keep the safety filter enabled in all public facing circumstances, disabling it only for use-cases that involve analyzing network behavior or auditing its results. For more information, please have a look at https://github.com/huggingface/diffusers/pull/254 .
Applying attention optimization: sdp... done.
WARNING: ONNX implementation works best with SD.Next. Please consider migrating to SD.Next.
Olive implementation is experimental. It contains potentially an issue and is subject to change at any time.
2024-05-12 02:05:40.9688838 [W:onnxruntime:, session_state.cc:1166 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Some nodes were not assigned to the preferred execution providers which may or may not have an negative impact on performance. e.g. ORT explicitly assigns shape related ops to CPU to improve perf.
2024-05-12 02:05:40.9738781 [W:onnxruntime:, session_state.cc:1168 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Rerunning with verbose output on a non-minimal build will show node assignments.
2024-05-12 02:05:42.5783678 [W:onnxruntime:, session_state.cc:1166 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Some nodes were not assigned to the preferred execution providers which may or may not have an negative impact on performance. e.g. ORT explicitly assigns shape related ops to CPU to improve perf.
2024-05-12 02:05:42.5828893 [W:onnxruntime:, session_state.cc:1168 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Rerunning with verbose output on a non-minimal build will show node assignments.
2024-05-12 02:05:42.9718586 [W:onnxruntime:, session_state.cc:1166 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Some nodes were not assigned to the preferred execution providers which may or may not have an negative impact on performance. e.g. ORT explicitly assigns shape related ops to CPU to improve perf.
2024-05-12 02:05:42.9776167 [W:onnxruntime:, session_state.cc:1168 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Rerunning with verbose output on a non-minimal build will show node assignments.
2024-05-12 02:05:43.2171474 [W:onnxruntime:, session_state.cc:1166 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Some nodes were not assigned to the preferred execution providers which may or may not have an negative impact on performance. e.g. ORT explicitly assigns shape related ops to CPU to improve perf.
2024-05-12 02:05:43.2217540 [W:onnxruntime:, session_state.cc:1168 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Rerunning with verbose output on a non-minimal build will show node assignments.
ONNX: processing=StableDiffusionProcessingTxt2Img, pipeline=OnnxStableDiffusionPipeline
100%|██████████████████████████████████████████████████████████████████████████████████| 20/20 [00:01<00:00, 16.00it/s]
ONNX: processing=StableDiffusionProcessingTxt2Img, pipeline=OnnxStableDiffusionPipeline
100%|██████████████████████████████████████████████████████████████████████████████████| 20/20 [00:01<00:00, 19.21it/s]
ONNX: processing=StableDiffusionProcessingTxt2Img, pipeline=OnnxStableDiffusionPipeline
100%|██████████████████████████████████████████████████████████████████████████████████| 20/20 [00:01<00:00, 19.23it/s]
ONNX: processing=StableDiffusionProcessingTxt2Img, pipeline=OnnxStableDiffusionPipeline
100%|██████████████████████████████████████████████████████████████████████████████████| 20/20 [00:01<00:00, 19.25it/s]
ONNX: processing=StableDiffusionProcessingTxt2Img, pipeline=OnnxStableDiffusionPipeline
100%|██████████████████████████████████████████████████████████████████████████████████| 20/20 [00:01<00:00, 19.18it/s]
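The progress lines above can be summarized to put a number on the drop. The following is a minimal sketch, not part of the web UI: the helper names `extract_rates` and `steady_state_rate` are hypothetical, the sample text is a shortened copy of the tqdm lines in this log, and skipping the first run as warm-up is an assumption. The 27 it/s baseline is the figure quoted in the report; it does not appear in this log.

```python
import re
from statistics import mean

# Matches the rate at the end of a tqdm progress line, e.g. "16.00it/s]".
RATE_RE = re.compile(r"(\d+(?:\.\d+)?)it/s\]")


def extract_rates(log_text: str) -> list[float]:
    """Return every it/s value found in the log text, in order of appearance."""
    return [float(value) for value in RATE_RE.findall(log_text)]


def steady_state_rate(rates: list[float], warmup: int = 1) -> float:
    """Average the rates, skipping the first `warmup` runs (an assumption:
    the first generation after startup is often slower)."""
    steady = rates[warmup:] if len(rates) > warmup else rates
    return mean(steady)


# Shortened copy of the progress lines from the console log.
sample_log = """
100%|████| 20/20 [00:01<00:00, 16.00it/s]
100%|████| 20/20 [00:01<00:00, 19.21it/s]
100%|████| 20/20 [00:01<00:00, 19.23it/s]
100%|████| 20/20 [00:01<00:00, 19.25it/s]
100%|████| 20/20 [00:01<00:00, 19.18it/s]
"""

rates = extract_rates(sample_log)
steady_rate = steady_state_rate(rates)

reported_before = 27.0  # speed quoted in the report, not measured here
drop = reported_before - steady_rate

print(f"runs found: {len(rates)}")
print(f"steady-state rate: {steady_rate:.2f} it/s")
print(f"drop versus reported baseline: {drop:.2f} it/s")
```

Running the same script on a log captured right after conversion and on one captured after a restart would give two steady-state figures that can be compared directly.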

Additional information

Is it losing some feature after conversion? The speed was correct right after conversion, but after a restart the speed dropped.

neonarc4 commented 6 months ago

@lshqqytiger https://www.youtube.com/watch?v=osf1VpUBdrg The earlier part is unnecessary to watch, but I wanted to record everything, including the model conversion.

You can watch from here: https://youtu.be/osf1VpUBdrg?t=244 (4:10 to 5:53)

It shows the difference. What is the reason for it?

lshqqytiger commented 6 months ago

Why did you close this issue?