AUTOMATIC1111 / stable-diffusion-webui

Stable Diffusion web UI
GNU Affero General Public License v3.0

[Feature Request]: torch: 2.2.0+cu121 + scaled_dot_product_attention (SDPA now supports FlashAttention-2): more performance with pytorch 2.2.0 and cuda 12.1 #14807

Open TimmekHW opened 7 months ago

TimmekHW commented 7 months ago

Is there an existing issue for this?

What would your feature do?

  1. torch: 2.2.0+cu121
  2. more performance
  3. FP8
  4. slightly less memory consumption
  5. scaled_dot_product_attention (SDPA now supports FlashAttention-2); see the sketch after this list
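
As a quick illustration (a minimal sketch, not webui code; the shapes and dtype are arbitrary assumptions), this is how one can check that torch 2.2's SDPA can actually take the FlashAttention path on the current GPU:

    import torch
    import torch.nn.functional as F

    # arbitrary example shapes: (batch, heads, tokens, head_dim), fp16 on CUDA
    q = torch.randn(1, 8, 1024, 64, device="cuda", dtype=torch.float16)
    k = torch.randn(1, 8, 1024, 64, device="cuda", dtype=torch.float16)
    v = torch.randn(1, 8, 1024, 64, device="cuda", dtype=torch.float16)

    # restrict SDPA to the flash backend only; this raises if the FlashAttention
    # kernel cannot be used for these inputs (torch.backends.cuda.sdp_kernel is
    # the pre-2.3 API; later releases use torch.nn.attention.sdpa_kernel instead)
    with torch.backends.cuda.sdp_kernel(enable_flash=True, enable_math=False, enable_mem_efficient=False):
        out = F.scaled_dot_product_attention(q, k, v)
    print(out.shape)  # torch.Size([1, 8, 1024, 64])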

Proposed workflow

torch: 2.2.0+cu121

  1. open launch_utils.py from stable-diffusion-webui\modules
  2. at line 315, change prepare_environment() so the torch pins point at 2.2.0 / cu121:

    def prepare_environment():
        torch_index_url = os.environ.get('TORCH_INDEX_URL', "https://download.pytorch.org/whl/cu121")
        torch_command = os.environ.get('TORCH_COMMAND', f"pip install torch==2.2.0 torchvision==0.17.0 --extra-index-url {torch_index_url}")
        if args.use_ipex:
            if platform.system() == "Windows":
                # The "Nuullll/intel-extension-for-pytorch" wheels were built from IPEX source for Intel Arc GPU: https://github.com/intel/intel-extension-for-pytorch/tree/xpu-main
                # This is NOT an Intel official release so please use it at your own risk!!
                # See https://github.com/Nuullll/intel-extension-for-pytorch/releases/tag/v2.0.110%2Bxpu-master%2Bdll-bundle for details.
                #
                # Strengths (over official IPEX 2.0.110 windows release):
                #   - AOT build (for Arc GPU only) to eliminate JIT compilation overhead: https://github.com/intel/intel-extension-for-pytorch/issues/399
                #   - Bundles minimal oneAPI 2023.2 dependencies into the python wheels, so users don't need to install oneAPI for the whole system.
                #   - Provides a compatible torchvision wheel: https://github.com/intel/intel-extension-for-pytorch/issues/465
                # Limitation:
                #   - Only works for python 3.10
                url_prefix = "https://github.com/Nuullll/intel-extension-for-pytorch/releases/download/v2.1.10%2Bxpu-master%2Bdll-bundle"
                torch_command = os.environ.get('TORCH_COMMAND', f"pip install {url_prefix}/torch-2.1.0a0+cxx11.abi-cp311-cp311-win_amd64.whl {url_prefix}/torchvision-0.16.0a0+cxx11.abi-cp311-cp311-win_amd64.whl {url_prefix}/intel_extension_for_pytorch-2.1.10+xpu-cp311-cp311-win_amd64.whl")

  3. delete the "venv" folder so the new torch is installed on the next launch
  4. implement scaled_dot_product_attention (a rough sketch of what such an attention forward looks like follows below)
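
For reference, a rough sketch of what an SDPA-based cross-attention forward looks like. The module and attribute names (to_q/to_k/to_v/to_out, heads) are assumptions in the style of the ldm CrossAttention module; this is not the exact code of webui's --opt-sdp-attention optimization:

    import torch.nn.functional as F

    def sdp_attention_forward(self, x, context=None, mask=None):
        # self is assumed to be a CrossAttention-style module with
        # to_q/to_k/to_v/to_out projections and a `heads` attribute
        h = self.heads
        context = x if context is None else context

        q = self.to_q(x)
        k = self.to_k(context)
        v = self.to_v(context)

        # (batch, tokens, heads*dim_head) -> (batch, heads, tokens, dim_head)
        q, k, v = (t.view(t.shape[0], t.shape[1], h, -1).transpose(1, 2) for t in (q, k, v))

        # SDPA dispatches to FlashAttention-2 / memory-efficient kernels when it can
        out = F.scaled_dot_product_attention(q, k, v, attn_mask=mask)

        # back to (batch, tokens, heads*dim_head)
        out = out.transpose(1, 2).reshape(out.shape[0], -1, h * out.shape[-1])
        return self.to_out(out)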

Additional information

Also, in addition to scaled_dot_product_attention, I would like all of the old dependencies to be updated, e.g. requirements.txt: transformers==4.37.2. (A small sketch for checking installed versions against the pins follows the list below.)

requirements_versions.txt

Pillow==10.2.0
accelerate==0.26.1
basicsr==1.4.2
blendmodes==2024.1
clean-fid==0.1.35
einops==0.4.1
fastapi==0.94.0
gfpgan==1.3.8
gradio==3.41.2
httpcore==0.15
inflection==0.5.1
jsonmerge==1.8.0
kornia==0.6.7
lark==1.1.2
numpy==1.26.3
omegaconf==2.3.0
open-clip-torch==2.24.0
piexif==1.1.3
psutil==5.9.5
pytorch_lightning==2.1.3
realesrgan==0.3.0
resize-right==0.0.2
safetensors==0.4.2
scikit-image==0.21.0
timm==0.9.2
tomesd==0.1.3
torch==2.2.0
torchdiffeq==0.2.3
torchsde==0.2.6
transformers==4.37.2
httpx==0.24.1
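
A small helper sketch (not part of webui; the requirements_versions.txt path is an assumption about the working directory) for comparing what is installed in the venv against the pins above:

    from importlib.metadata import version, PackageNotFoundError

    def check_pins(path="requirements_versions.txt"):
        # read each "name==version" pin and compare with the installed package
        with open(path, encoding="utf-8") as f:
            for line in f:
                line = line.strip()
                if not line or line.startswith("#") or "==" not in line:
                    continue
                name, pinned = line.split("==", 1)
                try:
                    installed = version(name)
                except PackageNotFoundError:
                    installed = "not installed"
                marker = "" if installed == pinned else "  <- differs"
                print(f"{name}: pinned {pinned}, installed {installed}{marker}")

    check_pins()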

But for this to work you will need to change "from pytorch_lightning.utilities.distributed import rank_zero_only" to "from pytorch_lightning.utilities.rank_zero import rank_zero_only" in sd_hijack_ddpm_v1.py, ddpm_edit.py and ddpm.py.
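
An alternative sketch to editing each file by hand, assuming only the import location changed between pytorch_lightning versions, is a compatibility import:

    try:
        # newer pytorch_lightning, where the helper moved to utilities.rank_zero
        from pytorch_lightning.utilities.rank_zero import rank_zero_only
    except ImportError:
        # older pytorch_lightning releases
        from pytorch_lightning.utilities.distributed import rank_zero_only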

TimmekHW commented 7 months ago

I've already done this, but I don't know how to run it with scaled_dot_product_attention

https://github.com/pytorch/pytorch/releases/tag/v2.2.0#:~:text=Summary%3A-,scaled_dot_product_attention,-(SDPA)%20now%20supports

and I get these errors:

Already up to date.
venv "G:\stable-diffusion-webui\venv\Scripts\Python.exe"
Python 3.10.6 (tags/v3.10.6:9c7b4bd, Aug  1 2022, 21:53:49) [MSC v.1932 64 bit (AMD64)]
Version: v1.7.0
Commit hash: cf2772fab0af5573da775e7437e6acdca424f26e
Installing requirements
Launching Web UI with arguments: --port 4756 --api --listen
no module 'xformers'. Processing without...
no module 'xformers'. Processing without...
No module 'xformers'. Proceeding without it.
[-] ADetailer initialized. version: 23.11.0, num models: 9
2024-01-31 02:29:59,208 - ControlNet - INFO - ControlNet v1.1.415
ControlNet preprocessor location: G:\stable-diffusion-webui\extensions\sd-webui-controlnet\annotator\downloads
2024-01-31 02:29:59,286 - ControlNet - INFO - ControlNet v1.1.415
Loading weights [d48c2391e0] from G:\stable-diffusion-webui\models\Stable-diffusion\aamXLAnimeMix_v10.safetensors
Creating model from config: G:\stable-diffusion-webui\repositories\generative-models\configs\inference\sd_xl_base.yaml
Running on local URL:  http://0.0.0.0:4756
creating model quickly: OSError
Traceback (most recent call last):
  File "G:\stable-diffusion-webui\venv\lib\site-packages\huggingface_hub\utils\_errors.py", line 286, in hf_raise_for_status
    response.raise_for_status()
  File "G:\stable-diffusion-webui\venv\lib\site-packages\requests\models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url: https://huggingface.co/None/resolve/main/config.json

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "G:\stable-diffusion-webui\venv\lib\site-packages\transformers\utils\hub.py", line 385, in cached_file
    resolved_file = hf_hub_download(
  File "G:\stable-diffusion-webui\venv\lib\site-packages\huggingface_hub\utils\_validators.py", line 118, in _inner_fn
    return fn(*args, **kwargs)
  File "G:\stable-diffusion-webui\venv\lib\site-packages\huggingface_hub\file_download.py", line 1368, in hf_hub_download
    raise head_call_error
  File "G:\stable-diffusion-webui\venv\lib\site-packages\huggingface_hub\file_download.py", line 1238, in hf_hub_download
    metadata = get_hf_file_metadata(
  File "G:\stable-diffusion-webui\venv\lib\site-packages\huggingface_hub\utils\_validators.py", line 118, in _inner_fn
    return fn(*args, **kwargs)
  File "G:\stable-diffusion-webui\venv\lib\site-packages\huggingface_hub\file_download.py", line 1631, in get_hf_file_metadata
    r = _request_wrapper(
  File "G:\stable-diffusion-webui\venv\lib\site-packages\huggingface_hub\file_download.py", line 385, in _request_wrapper
    response = _request_wrapper(
  File "G:\stable-diffusion-webui\venv\lib\site-packages\huggingface_hub\file_download.py", line 409, in _request_wrapper
    hf_raise_for_status(response)
  File "G:\stable-diffusion-webui\venv\lib\site-packages\huggingface_hub\utils\_errors.py", line 323, in hf_raise_for_status
    raise RepositoryNotFoundError(message, response) from e
huggingface_hub.utils._errors.RepositoryNotFoundError: 401 Client Error. (Request ID: Root=1-65b98678-0d2facf40e7fde94188fd284;e6ae69cb-81fd-4cde-a5a7-ceb2b32ddc32)

Repository Not Found for url: https://huggingface.co/None/resolve/main/config.json.
Please make sure you specified the correct `repo_id` and `repo_type`.
If you are trying to access a private or gated repo, make sure you are authenticated.
Invalid username or password.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\Timmek\AppData\Local\Programs\Python\Python310\lib\threading.py", line 973, in _bootstrap
    self._bootstrap_inner()
  File "C:\Users\Timmek\AppData\Local\Programs\Python\Python310\lib\threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "C:\Users\Timmek\AppData\Local\Programs\Python\Python310\lib\threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "G:\stable-diffusion-webui\modules\initialize.py", line 147, in load_model
    shared.sd_model  # noqa: B018
  File "G:\stable-diffusion-webui\modules\shared_items.py", line 128, in sd_model
    return modules.sd_models.model_data.get_sd_model()
  File "G:\stable-diffusion-webui\modules\sd_models.py", line 531, in get_sd_model
    load_model()
  File "G:\stable-diffusion-webui\modules\sd_models.py", line 634, in load_model
    sd_model = instantiate_from_config(sd_config.model)
  File "G:\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\util.py", line 89, in instantiate_from_config
    return get_obj_from_str(config["target"])(**config.get("params", dict()))
  File "G:\stable-diffusion-webui\repositories\generative-models\sgm\models\diffusion.py", line 61, in __init__
    self.conditioner = instantiate_from_config(
  File "G:\stable-diffusion-webui\repositories\generative-models\sgm\util.py", line 175, in instantiate_from_config
    return get_obj_from_str(config["target"])(**config.get("params", dict()))
  File "G:\stable-diffusion-webui\repositories\generative-models\sgm\modules\encoders\modules.py", line 88, in __init__
    embedder = instantiate_from_config(embconfig)
  File "G:\stable-diffusion-webui\repositories\generative-models\sgm\util.py", line 175, in instantiate_from_config
    return get_obj_from_str(config["target"])(**config.get("params", dict()))
  File "G:\stable-diffusion-webui\repositories\generative-models\sgm\modules\encoders\modules.py", line 361, in __init__
    self.transformer = CLIPTextModel.from_pretrained(version)
  File "G:\stable-diffusion-webui\modules\sd_disable_initialization.py", line 68, in CLIPTextModel_from_pretrained
    res = self.CLIPTextModel_from_pretrained(None, *model_args, config=pretrained_model_name_or_path, state_dict={}, **kwargs)
  File "G:\stable-diffusion-webui\venv\lib\site-packages\transformers\modeling_utils.py", line 2926, in from_pretrained
    resolved_config_file = cached_file(
  File "G:\stable-diffusion-webui\venv\lib\site-packages\transformers\utils\hub.py", line 406, in cached_file
    raise EnvironmentError(
OSError: None is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'
If this is a private repository, make sure to pass a token having permission to this repo either by logging in with `huggingface-cli login` or by passing `token=<your_token>`

Failed to create model quickly; will retry using slow method.

To create a public link, set `share=True` in `launch()`.
Startup time: 18.5s (prepare environment: 6.5s, import torch: 2.6s, import gradio: 0.7s, setup paths: 1.3s, initialize shared: 0.2s, other imports: 0.6s, setup codeformer: 0.1s, list SD models: 0.1s, load scripts: 1.6s, create ui: 0.5s, gradio launch: 4.2s).
Loading VAE weights specified in settings: G:\stable-diffusion-webui\models\VAE\1sdxl_vae.safetensors
Applying attention optimization: sdp... done.
Model loaded in 22.0s (load weights from disk: 0.5s, create model: 8.7s, apply weights to model: 11.0s, apply half(): 0.2s, load VAE: 0.7s, move model to device: 0.1s, load textual inversion embeddings: 0.3s, calculate empty prompt: 0.5s).
xddun commented 2 months ago

https://github.com/AUTOMATIC1111/stable-diffusion-webui/pull/8367

The PyTorch version is different. PyTorch 2.2's SDPA already covers this, so if you install PyTorch 2.2 and enable the --opt-sdp-attention option it is used automatically. You won't need to develop it yourself anymore, right?

TimmekHW commented 2 months ago

No. I just want the creator of automatic1111 to update the library versions, in particular PyTorch to 2.3.1, or at least to 2.2.

There is a newer acceleration method than SDPA.

xddun commented 2 months ago

Is there a faster acceleration method? Which method exactly?

xddun commented 2 months ago

Is there a faster acceleration method? What is it? Actually, I'm currently struggling: I'm working on accelerating IP-Adapter in ControlNet, and it's very difficult. If you have any good ideas, could you share some?