light-and-ray / sd-webui-replacer

A tab for sd-webui for replacing objects in pictures or videos using a detection prompt
193 stars, 11 forks

(Amd and Intel) torch_directml #15

Closed: yacinesh closed this 6 months ago

yacinesh commented 7 months ago

My error:

```
[F D:\a_work\1\s\pytorch-directml-plugin\torch_directml\csrc\engine\dml_tensor_desc.cc:135] Check failed: !is_dim_broadcast || non_broadcast_dim_size == 1
```

Is there a way to make it work on an AMD card?

StudioDUzes commented 7 months ago

> My error: [F D:\a_work\1\s\pytorch-directml-plugin\torch_directml\csrc\engine\dml_tensor_desc.cc:135] Check failed: !is_dim_broadcast || non_broadcast_dim_size == 1 Is there a way to make it work on an AMD card?

...or on an Intel Arc A770 16GB...

light-and-ray commented 7 months ago

I don't have these GPUs. But I've heard AMD builds work better on Windows via WSL. Maybe you should try that, or install Linux on an external drive and test it

light-and-ray commented 7 months ago

I added a CPU option. Can you please test whether it works now for AMD and Intel GPUs?

https://github.com/light-and-ray/sd-webui-replacer?tab=readme-ov-file#amd-radeon-and-intel-arc

StudioDUzes commented 7 months ago

> I added a CPU option. Can you please test whether it works now for AMD and Intel GPUs?
>
> https://github.com/light-and-ray/sd-webui-replacer?tab=readme-ov-file#amd-radeon-and-intel-arc

Intel Arc 16GB... `RuntimeError: The GPU device does not support Double (Float64) operations!`

light-and-ray commented 7 months ago

Can you send the full console log, so I can see at which stage it happens?

Maybe it's because of the GroundingDINO model; it doesn't have a CPU mode in continue-revolution's extension

You can also test Segment Anything without DINO, using only dot-prompting in the Segment Anything accordion in the txt2img tab

light-and-ray commented 7 months ago

Also, you can run webui in CPU mode and test whether it works. If it does, I can override the global webui mode only for the detection stage
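For reference, a full-CPU test run can be forced with launch flags; a sketch using flag names from AUTOMATIC1111's webui (your usual `--device-id`, model paths, and other flags are omitted here):

```shell
# Hypothetical test launch: run the whole webui on CPU.
# --use-cpu all routes every module to the CPU; --skip-torch-cuda-test
# stops the launcher from aborting when no CUDA device is found.
./webui.sh --use-cpu all --skip-torch-cuda-test --no-half --precision full
```

On Windows the same flags would go into `COMMANDLINE_ARGS` in `webui-user.bat`.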

light-and-ray commented 7 months ago

> I can override the global webui mode only for the detection stage

It wasn't too hard, I did it. Can you test the new option? @StudioDUzes

light-and-ray commented 7 months ago

Btw, does the regular inpaint tab work on Intel Arc?

light-and-ray commented 7 months ago

(screenshot attached: Screenshot_20240204_130949)

StudioDUzes commented 7 months ago

```
Already up to date.
venv "N:\1.7\stable-diffusion-webui-directml\venv\Scripts\Python.exe"
fatal: No names found, cannot describe anything.
Python 3.10.6 (tags/v3.10.6:9c7b4bd, Aug 1 2022, 21:53:49) [MSC v.1932 64 bit (AMD64)]
Version: 1.7.0
Commit hash: d500e58a65d99bfaa9c7bb0da6c3eb5704fadf25
Launching Web UI with arguments: --skip-torch-cuda-test --use-directml --device-id 1 --port 7861 --medvram --always-batch-cond-uncond --no-half --precision full --no-half-vae --disable-nan-check --use-cpu interrogate codeformer --api --autolaunch --ckpt-dir N:\Models SDXL --lora-dir N:\Lora SDXL --vae-dir N:\VAE --esrgan-models-path N:\ESRGAN --opt-sub-quad-attention --opt-split-attention
no module 'xformers'. Processing without...
no module 'xformers'. Processing without...
No module 'xformers'. Proceeding without it.
Warning: caught exception 'Something went wrong.', memory monitor disabled
*** Extension "sd-webui-lama-cleaner-masked-content" requires "sd-webui-controlnet" which is disabled.
Civitai Helper: Get Custom Model Folder
[-] ADetailer initialized. version: 24.1.2, num models: 9
ControlNet preprocessor location: N:\1.7\stable-diffusion-webui-directml\extensions\sd-webui-controlnet\annotator\downloads
2024-02-04 10:11:31,640 - ControlNet - INFO - ControlNet v1.1.438
2024-02-04 10:11:31,772 - ControlNet - INFO - ControlNet v1.1.438
10:11:32 - ReActor - STATUS - Running v0.6.1-b2 on Device: CPU
Loading weights [0ad6b50e22] from N:\Models SDXL\SDXL\cineroXLPhotomatic_v14RC5Bokelicious.safetensors
Civitai Helper: Settings:
Civitai Helper: max_size_preview: True
Civitai Helper: skip_nsfw_preview: False
Civitai Helper: open_url_with_js: True
Civitai Helper: proxy:
Civitai Helper: use civitai api key: False
2024-02-04 10:11:33,753 - ControlNet - INFO - ControlNet UI callback registered.
Running on local URL: http://127.0.0.1:7861

To create a public link, set share=True in launch().
Creating model from config: N:\1.7\stable-diffusion-webui-directml\repositories\generative-models\configs\inference\sd_xl_base.yaml
Startup time: 16.8s (prepare environment: 3.1s, import torch: 2.8s, import gradio: 1.0s, setup paths: 0.8s, initialize shared: 0.9s, other imports: 0.5s, load scripts: 4.9s, create ui: 1.3s, gradio launch: 0.9s, app_started_callback: 0.2s).
Applying attention optimization: sdp... done.
Model loaded in 14.5s (load weights from disk: 2.2s, create model: 2.7s, apply weights to model: 1.7s, apply float(): 1.7s, calculate empty prompt: 6.2s).
[Replacer] Use CPU for detection
Start SAM Processing
Using local groundingdino.
Running GroundingDINO Inference
Initializing GroundingDINO GroundingDINO_SwinT_OGC (694MB)
final text_encoder_type: bert-base-uncased
[Replacer] Exception: The GPU device does not support Double (Float64) operations!
Error completing request
Arguments: ('task(62zodo82yn9oam9)', 'hairstyle', '', 'photo of blonde girl', 'poor quality, low quality, low res', 0.0, <PIL.Image.Image image mode=RGBA size=500x261 at 0x1C4DAF2EEC0>, None, True, '', '', True, False, '', '', 10, '', -1, 'DPM++ 2M SDE Karras', 20, 0.3, 35, 4, 1280, 'sam_hq_vit_l.pth', 'GroundingDINO_SwinT_OGC (694MB)', 5.5, 1, 40, 0, 512, 1, 512, 1, 0, ['script'], False, False, 'SDXL INPAINT\sdxlInpainting01Official_v01-inpainting.safetensors [6470840731]', 'Random', ['Draw mask'], None, True, ['Draw mask'], None, UiControlNetUnit(enabled=False, module='none', model='None', weight=1.0, image=None, resize_mode=<ResizeMode.INNER_FIT: 'Crop and Resize'>, low_vram=False, processor_res=-1, threshold_a=-1, threshold_b=-1, guidance_start=0.0, guidance_end=1.0, pixel_perfect=False, control_mode=<ControlMode.BALANCED: 'Balanced'>, inpaint_crop_input_image=True, hr_option=<HiResFixOption.BOTH: 'Both'>, save_detected_map=True, advanced_weighting=None), UiControlNetUnit(enabled=False, module='none', model='None', weight=1.0, image=None, resize_mode=<ResizeMode.INNER_FIT: 'Crop and Resize'>, low_vram=False, processor_res=-1, threshold_a=-1, threshold_b=-1, guidance_start=0.0, guidance_end=1.0, pixel_perfect=False, control_mode=<ControlMode.BALANCED: 'Balanced'>, inpaint_crop_input_image=True, hr_option=<HiResFixOption.BOTH: 'Both'>, save_detected_map=True, advanced_weighting=None), UiControlNetUnit(enabled=False, module='none', model='None', weight=1.0, image=None, resize_mode=<ResizeMode.INNER_FIT: 'Crop and Resize'>, low_vram=False, processor_res=-1, threshold_a=-1, threshold_b=-1, guidance_start=0.0, guidance_end=1.0, pixel_perfect=False, control_mode=<ControlMode.BALANCED: 'Balanced'>, inpaint_crop_input_image=True, hr_option=<HiResFixOption.BOTH: 'Both'>, save_detected_map=True, advanced_weighting=None)) {}
Traceback (most recent call last):
  File "N:\1.7\stable-diffusion-webui-directml\modules\call_queue.py", line 57, in f
    res = list(func(*args, **kwargs))
  File "N:\1.7\stable-diffusion-webui-directml\modules\call_queue.py", line 36, in f
    res = func(*args, **kwargs)
  File "N:\1.7\stable-diffusion-webui-directml\extensions\sd-webui-replacer\replacer\generate.py", line 561, in generate_webui
    return generate(*args, **kwargs)
  File "N:\1.7\stable-diffusion-webui-directml\extensions\sd-webui-replacer\replacer\generate.py", line 409, in generate
    processed, extraImages = generateSingle(image, gArgs, saveDir, "", save_to_dirs,
  File "N:\1.7\stable-diffusion-webui-directml\extensions\sd-webui-replacer\replacer\generate.py", line 149, in generateSingle
    masksCreator = MasksCreator(gArgs.detectionPrompt, gArgs.avoidancePrompt, image, gArgs.samModel,
  File "N:\1.7\stable-diffusion-webui-directml\extensions\sd-webui-replacer\replacer\mask_creator.py", line 62, in __init__
    self._createMasks()
  File "N:\1.7\stable-diffusion-webui-directml\extensions\sd-webui-replacer\replacer\mask_creator.py", line 79, in _createMasks
    masks, samLog = sam_predict(self.samModel, imageResized, [], [], True,
  File "N:\1.7\stable-diffusion-webui-directml\extensions\sd-webui-segment-anything\scripts\sam.py", line 200, in sam_predict
    boxes_filt, install_success = dino_predict_internal(input_image, dino_model_name, text_prompt, box_threshold)
  File "N:\1.7\stable-diffusion-webui-directml\extensions\sd-webui-segment-anything\scripts\dino.py", line 176, in dino_predict_internal
    boxes_filt = get_grounding_output(
  File "N:\1.7\stable-diffusion-webui-directml\extensions\sd-webui-segment-anything\scripts\dino.py", line 153, in get_grounding_output
    outputs = model(image[None], captions=[caption])
  File "N:\1.7\stable-diffusion-webui-directml\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "N:\1.7\stable-diffusion-webui-directml\extensions\sd-webui-segment-anything\local_groundingdino\models\GroundingDINO\groundingdino.py", line 253, in forward
    bert_output = self.bert(**tokenized_for_encoder)  # bs, 195, 768
  File "N:\1.7\stable-diffusion-webui-directml\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "N:\1.7\stable-diffusion-webui-directml\extensions\sd-webui-segment-anything\local_groundingdino\models\GroundingDINO\bertwarper.py", line 105, in forward
    extended_attention_mask: torch.Tensor = self.get_extended_attention_mask(
  File "N:\1.7\stable-diffusion-webui-directml\venv\lib\site-packages\transformers\modeling_utils.py", line 912, in get_extended_attention_mask
    extended_attention_mask = (1.0 - extended_attention_mask) * torch.finfo(dtype).min
  File "N:\1.7\stable-diffusion-webui-directml\venv\lib\site-packages\torch\_tensor.py", line 40, in wrapped
    return f(*args, **kwargs)
  File "N:\1.7\stable-diffusion-webui-directml\venv\lib\site-packages\torch\_tensor.py", line 848, in __rsub__
    return _C._VariableFunctions.rsub(self, other)
RuntimeError: The GPU device does not support Double (Float64) operations!
```


light-and-ray commented 7 months ago

Oops, I've mixed them up. Try both checkboxes enabled

light-and-ray commented 7 months ago

Yes, according to your log, it happens at the DINO stage

light-and-ray commented 7 months ago

Unfortunately this override doesn't work, because the Segment Anything extension copies the device name when the application starts. Maybe I can make it work by patching Segment Anything

light-and-ray commented 7 months ago

I have a laptop with a potato GPU where I can test it. And I can make it work on CPU only! But there is a problem: the first run gives an error. The following runs work well!

light-and-ray commented 7 months ago

I will make a PR in the Segment Anything extension

light-and-ray commented 7 months ago

I made it. Hope continue-revolution will merge it.

But you can download it now: https://github.com/continue-revolution/sd-webui-segment-anything/pull/189.diff, and run `git apply <path to diff file>` inside the sd-webui-segment-anything extension directory
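The download-and-apply step can be sketched as follows, assuming the webui root as the working directory and the default extension layout:

```shell
# Fetch the PR's diff and apply it to the extension's working tree.
cd extensions/sd-webui-segment-anything
curl -L -o 189.diff https://github.com/continue-revolution/sd-webui-segment-anything/pull/189.diff
git apply 189.diff
# To undo the patch later (e.g. once the PR is merged upstream):
# git apply -R 189.diff
```

`git apply` refuses to apply a patch that no longer matches and leaves the tree untouched, so this is safe to retry after updating the extension.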

light-and-ray commented 7 months ago

Hm, now it worked well on the first try. Maybe it's the best solution

light-and-ray commented 7 months ago

@yacinesh can you test it for AMD?

yacinesh commented 7 months ago

@light-and-ray I'm still getting this problem:

```
[Replacer] Use CPU for SAM
Start SAM Processing
Using local groundingdino.
Running GroundingDINO Inference
Initializing GroundingDINO GroundingDINO_SwinT_OGC (694MB)
final text_encoder_type: bert-base-uncased
[F D:\a_work\1\s\pytorch-directml-plugin\torch_directml\csrc\engine\dml_tensor_desc.cc:135] Check failed: !is_dim_broadcast || non_broadcast_dim_size == 1
```

light-and-ray commented 7 months ago

I understand. You also need to use the same method; you need to wait for continue-revolution to merge the PR above

yacinesh commented 7 months ago

I didn't get you, what do I need to do? Can you check if I'm doing everything right?

light-and-ray commented 7 months ago

He merged!

You need to:

  1. Update the Segment Anything extension
  2. Enable the two checkboxes in the Replacer options, and click "Apply"
  3. Restart webui completely

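Step 1 can also be done from a terminal; a sketch assuming the default extensions location inside the webui folder:

```shell
# Pull the latest commits (including the merged PR) into the extension.
cd extensions/sd-webui-segment-anything
git pull
```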
yacinesh commented 7 months ago

I get this:

```
[Replacer] Use CPU for detection
[Replacer] Use CPU for SAM
Start SAM Processing
Using local groundingdino.
Running GroundingDINO Inference
Initializing GroundingDINO GroundingDINO_SwinT_OGC (694MB)
final text_encoder_type: bert-base-uncased
Initializing SAM to cpu
[Replacer] Exception: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.
Error completing request
Arguments: ('task(voemi7liklcvxxo)', 'shirt', '', 'red shirt', '', 0.0, <PIL.Image.Image image mode=RGBA size=400x776 at 0x1E1F73CAB30>, None, True, '', '', True, False, '', '', 10, '', -1, 'DPM++ 2M SDE Karras', 20, 0.3, 35, 4, 1024, 'sam_hq_vit_b.pth', 'GroundingDINO_SwinT_OGC (694MB)', 5.5, 1, 20, 0, 512, 1, 512, 1, 0, ['script'], False, False, 'epicphotogasm_z-inpainting.ckpt [6eaca584d8]', 'Random', ['Draw mask'], None, True, ['Draw mask'], None) {}
Traceback (most recent call last):
  File "C:\a1111\stable-diffusion-webui-directml\modules\call_queue.py", line 57, in f
    res = list(func(*args, **kwargs))
  File "C:\a1111\stable-diffusion-webui-directml\modules\call_queue.py", line 36, in f
    res = func(*args, **kwargs)
  File "C:\a1111\stable-diffusion-webui-directml\extensions\sd-webui-replacer\replacer\generate.py", line 574, in generate_webui
    return generate(*args, **kwargs)
  File "C:\a1111\stable-diffusion-webui-directml\extensions\sd-webui-replacer\replacer\generate.py", line 417, in generate
    processed, extraImages = generateSingle(image, gArgs, saveDir, "", save_to_dirs,
  File "C:\a1111\stable-diffusion-webui-directml\extensions\sd-webui-replacer\replacer\generate.py", line 153, in generateSingle
    masksCreator = MasksCreator(gArgs.detectionPrompt, gArgs.avoidancePrompt, image, gArgs.samModel,
  File "C:\a1111\stable-diffusion-webui-directml\extensions\sd-webui-replacer\replacer\mask_creator.py", line 72, in __init__
    self._createMasks()
  File "C:\a1111\stable-diffusion-webui-directml\extensions\sd-webui-replacer\replacer\mask_creator.py", line 92, in _createMasks
    masks, samLog = sam_predict(self.samModel, imageResized, [], [], True,
  File "C:\a1111\stable-diffusion-webui-directml\extensions\sd-webui-segment-anything\scripts\sam.py", line 204, in sam_predict
    sam = init_sam_model(sam_model_name)
  File "C:\a1111\stable-diffusion-webui-directml\extensions\sd-webui-segment-anything\scripts\sam.py", line 129, in init_sam_model
    sam_model_cache[sam_model_name] = load_sam_model(sam_model_name)
  File "C:\a1111\stable-diffusion-webui-directml\extensions\sd-webui-segment-anything\scripts\sam.py", line 80, in load_sam_model
    sam = sam_model_registry[model_type](...)
  File "C:\a1111\stable-diffusion-webui-directml\extensions\sd-webui-segment-anything\sam_hq\build_sam_hq.py", line 39, in build_sam_hq_vit_b
    return _build_sam_hq(
  File "C:\a1111\stable-diffusion-webui-directml\extensions\sd-webui-segment-anything\sam_hq\build_sam_hq.py", line 122, in _build_sam_hq
    return _load_sam_checkpoint(sam, checkpoint)
  File "C:\a1111\stable-diffusion-webui-directml\extensions\sd-webui-segment-anything\sam_hq\build_sam_hq.py", line 67, in _load_sam_checkpoint
    state_dict = torch.load(f)
  File "C:\a1111\stable-diffusion-webui-directml\venv\lib\site-packages\torch\serialization.py", line 809, in load
    return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
  File "C:\a1111\stable-diffusion-webui-directml\venv\lib\site-packages\torch\serialization.py", line 1172, in _load
    result = unpickler.load()
  File "C:\Users\Yacine\AppData\Local\Programs\Python\Python310\lib\pickle.py", line 1213, in load
    dispatch[key[0]](self)
  File "C:\Users\Yacine\AppData\Local\Programs\Python\Python310\lib\pickle.py", line 1254, in load_binpersid
    self.append(self.persistent_load(pid))
  File "C:\a1111\stable-diffusion-webui-directml\venv\lib\site-packages\torch\serialization.py", line 1142, in persistent_load
    typed_storage = load_tensor(dtype, nbytes, key, _maybe_decode_ascii(location))
  File "C:\a1111\stable-diffusion-webui-directml\venv\lib\site-packages\torch\serialization.py", line 1116, in load_tensor
    wrap_storage=restore_location(storage, location),
  File "C:\a1111\stable-diffusion-webui-directml\venv\lib\site-packages\torch\serialization.py", line 217, in default_restore_location
    result = fn(storage, location)
  File "C:\a1111\stable-diffusion-webui-directml\venv\lib\site-packages\torch\serialization.py", line 182, in _cuda_deserialize
    device = validate_cuda_device(location)
  File "C:\a1111\stable-diffusion-webui-directml\venv\lib\site-packages\torch\serialization.py", line 166, in validate_cuda_device
    raise RuntimeError('Attempting to deserialize object on a CUDA '
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.
```

light-and-ray commented 7 months ago

Is it stable? Did you try a few times?

light-and-ray commented 7 months ago

Maybe it's this: https://stackoverflow.com/a/68992197

Can you try other SAM models?
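The linked answer is about this same RuntimeError: a checkpoint saved from a CUDA device must be loaded with `map_location` when CUDA is unavailable. A minimal sketch of the idea (the extension's own checkpoint loader would need the equivalent change; the file path here is illustrative):

```python
import os
import tempfile

import torch

# Save a small checkpoint, then load it while forcing all storages onto
# the CPU. Passing map_location is exactly what the error message asks
# for; without it, a checkpoint saved from a CUDA device tries to
# deserialize back onto CUDA.
path = os.path.join(tempfile.mkdtemp(), "ckpt.pth")
torch.save({"w": torch.ones(2, 2)}, path)

state_dict = torch.load(path, map_location=torch.device("cpu"))
print(state_dict["w"].device)  # cpu
```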

light-and-ray commented 7 months ago

Try a non-HQ model: https://github.com/light-and-ray/sd-webui-replacer?tab=readme-ov-file#sam-models-list

This one: sam_vit_b

light-and-ray commented 7 months ago

Yes, it looks like this bug affects only HQ and Mobile SAM, and it should work with regular SAM. If you confirm it works, I will make another PR into Segment Anything

yacinesh commented 7 months ago

It finally works with the non-HQ model

light-and-ray commented 7 months ago

Nice! Made PR:

yacinesh commented 7 months ago

Do I need to wait until continue-revolution merges the PR?

light-and-ray commented 7 months ago

Yes, I think he will merge it soon.

If you don't want to wait, you can download it: https://github.com/continue-revolution/sd-webui-segment-anything/pull/190.diff Then run `git apply <path to the diff file>` inside the Segment Anything directory

light-and-ray commented 7 months ago

He merged! Can you update and test it? @yacinesh

yacinesh commented 7 months ago

sam_hq_vit_b.pth works perfectly. We're done, I think?

light-and-ray commented 7 months ago

I think @StudioDUzes will test on Intel Arc, and then I will close the issue. @StudioDUzes, can you?

yacinesh commented 7 months ago

Okay, thank you for this incredible project

light-and-ray commented 7 months ago

@StudioDUzes does it work now for Intel?