mcmonkeyprojects / SwarmUI

SwarmUI (formerly StableSwarmUI), A Modular Stable Diffusion Web-User-Interface, with an emphasis on making powertools easily accessible, high performance, and extensibility.
MIT License

Cannot segment anything #379

Closed TheForgotten69 closed 1 week ago

TheForgotten69 commented 1 week ago

Expected Behavior

Being able to use the capability, with or without a specific model

Actual Behavior

ComfyUI execution error: Input image size (352*352) doesn't match model (224*224).

Steps to Reproduce

Just try to use the segment capability with the latest ComfyUI and SwarmUI versions.

Debug Logs

00:43:24.125 [Warning] [ComfyUI-0/STDOUT] Traceback (most recent call last):
00:43:24.125 [Warning] [ComfyUI-0/STDOUT] File "G:\Tools\ML\Visual\StabilityMatrix\Packages\ComfyUI\execution.py", line 323, in execute
00:43:24.125 [Warning] [ComfyUI-0/STDOUT] output_data, output_ui, has_subgraph = get_output_data(obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
00:43:24.125 [Warning] [ComfyUI-0/STDOUT] File "G:\Tools\ML\Visual\StabilityMatrix\Packages\ComfyUI\execution.py", line 198, in get_output_data
00:43:24.125 [Warning] [ComfyUI-0/STDOUT] return_values = _map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
00:43:24.125 [Warning] [ComfyUI-0/STDOUT] File "G:\Tools\ML\Visual\StabilityMatrix\Packages\ComfyUI\execution.py", line 169, in _map_node_over_list
00:43:24.125 [Warning] [ComfyUI-0/STDOUT] process_inputs(input_dict, i)
00:43:24.125 [Warning] [ComfyUI-0/STDOUT] File "G:\Tools\ML\Visual\StabilityMatrix\Packages\ComfyUI\execution.py", line 158, in process_inputs
00:43:24.125 [Warning] [ComfyUI-0/STDOUT] results.append(getattr(obj, func)(**inputs))
00:43:24.126 [Warning] [ComfyUI-0/STDOUT] File "G:\Tools\ML\Visual\StabilityMatrix\Packages\SwarmUI\src\BuiltinExtensions\ComfyUIBackend\ExtraNodes\SwarmComfyCommon\SwarmClipSeg.py", line 78, in seg
00:43:24.126 [Warning] [ComfyUI-0/STDOUT] mask = model(**processor(text=match_text, images=[i], return_tensors="pt", padding=True))[0]
00:43:24.126 [Warning] [ComfyUI-0/STDOUT] File "G:\Tools\ML\Visual\StabilityMatrix\Packages\ComfyUI\venv\lib\site-packages\torch\nn\modules\module.py", line 1736, in _wrapped_call_impl
00:43:24.126 [Warning] [ComfyUI-0/STDOUT] return self._call_impl(*args, **kwargs)
00:43:24.126 [Warning] [ComfyUI-0/STDOUT] File "G:\Tools\ML\Visual\StabilityMatrix\Packages\ComfyUI\venv\lib\site-packages\torch\nn\modules\module.py", line 1747, in _call_impl
00:43:24.126 [Warning] [ComfyUI-0/STDOUT] return forward_call(*args, **kwargs)
00:43:24.126 [Warning] [ComfyUI-0/STDOUT] File "G:\Tools\ML\Visual\StabilityMatrix\Packages\ComfyUI\venv\lib\site-packages\transformers\models\clipseg\modeling_clipseg.py", line 1436, in forward
00:43:24.126 [Warning] [ComfyUI-0/STDOUT] vision_outputs = self.clip.vision_model(
00:43:24.126 [Warning] [ComfyUI-0/STDOUT] File "G:\Tools\ML\Visual\StabilityMatrix\Packages\ComfyUI\venv\lib\site-packages\torch\nn\modules\module.py", line 1736, in _wrapped_call_impl
00:43:24.126 [Warning] [ComfyUI-0/STDOUT] return self._call_impl(*args, **kwargs)
00:43:24.126 [Warning] [ComfyUI-0/STDOUT] File "G:\Tools\ML\Visual\StabilityMatrix\Packages\ComfyUI\venv\lib\site-packages\torch\nn\modules\module.py", line 1747, in _call_impl
00:43:24.126 [Warning] [ComfyUI-0/STDOUT] return forward_call(*args, **kwargs)
00:43:24.126 [Warning] [ComfyUI-0/STDOUT] File "G:\Tools\ML\Visual\StabilityMatrix\Packages\ComfyUI\venv\lib\site-packages\transformers\models\clipseg\modeling_clipseg.py", line 870, in forward
00:43:24.126 [Warning] [ComfyUI-0/STDOUT] hidden_states = self.embeddings(pixel_values, interpolate_pos_encoding=interpolate_pos_encoding)
00:43:24.126 [Warning] [ComfyUI-0/STDOUT] File "G:\Tools\ML\Visual\StabilityMatrix\Packages\ComfyUI\venv\lib\site-packages\torch\nn\modules\module.py", line 1736, in _wrapped_call_impl
00:43:24.127 [Warning] [ComfyUI-0/STDOUT] return self._call_impl(*args, **kwargs)
00:43:24.127 [Warning] [ComfyUI-0/STDOUT] File "G:\Tools\ML\Visual\StabilityMatrix\Packages\ComfyUI\venv\lib\site-packages\torch\nn\modules\module.py", line 1747, in _call_impl
00:43:24.127 [Warning] [ComfyUI-0/STDOUT] return forward_call(*args, **kwargs)
00:43:24.127 [Warning] [ComfyUI-0/STDOUT] File "G:\Tools\ML\Visual\StabilityMatrix\Packages\ComfyUI\venv\lib\site-packages\transformers\models\clipseg\modeling_clipseg.py", line 211, in forward
00:43:24.127 [Warning] [ComfyUI-0/STDOUT] raise ValueError(
00:43:24.127 [Warning] [ComfyUI-0/STDOUT] ValueError: Input image size (352*352) doesn't match model (224*224).
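
For readers tracing the log: the last frames show the CLIPSeg embeddings' forward receiving an interpolate_pos_encoding flag and then raising the ValueError. The sketch below is a hypothetical stand-in for a guard of that shape, not the Transformers source. The function name check_image_size, the model_size default, and the assumption that the check is skipped when interpolation is enabled are all illustrative and are not confirmed by this thread.

```python
def check_image_size(
    height: int,
    width: int,
    model_size: int = 224,
    interpolate_pos_encoding: bool = False,
) -> None:
    """Hypothetical stand-in for the size guard seen in the traceback.

    Assumption: the guard only fires when position-embedding interpolation
    is disabled and the input does not match the model's native size.
    """
    if not interpolate_pos_encoding and (height != model_size or width != model_size):
        raise ValueError(
            f"Input image size ({height}*{width}) doesn't match model "
            f"({model_size}*{model_size})."
        )


if __name__ == "__main__":
    # A 352x352 input against a 224x224 model reproduces the logged message.
    try:
        check_image_size(352, 352)
    except ValueError as err:
        print(err)
```

Under that assumption, the same 352x352 input would pass if the interpolation flag were enabled, which is one plausible reading of why a library-side change could surface this error without any change on the SwarmUI side.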

Other

Raised as well in ComfyUI https://github.com/comfyanonymous/ComfyUI/issues/5402

mcmonkey4eva commented 1 week ago

I'm not sure how to install pip packages manually within Matrix, but you need to pip install transformers==4.45.0 as per my comment on the linked Comfy issue. This is a bug in the current Transformers lib version, so it needs to be rolled back until they push a fix.

In Swarm itself, you'd open a terminal in (Swarm)/dlbackend/comfy and run python_embeded\python.exe -s -m pip install transformers==4.45.0 (though again, the steps are not the same for Matrix).
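
To confirm the pin took effect, a small check along these lines may help. It is a minimal sketch, assuming it is run with the same Python interpreter that ComfyUI uses (for Swarm's bundled install, that would be python_embeded\python.exe). The helper names is_pinned and installed_transformers_version are hypothetical; importlib.metadata is part of the Python standard library.

```python
from importlib import metadata
from typing import Optional

WANTED = "4.45.0"  # the version named in this thread


def is_pinned(installed: str, wanted: str = WANTED) -> bool:
    """Return True if the installed version string equals the wanted pin."""
    return installed.strip() == wanted


def installed_transformers_version() -> Optional[str]:
    """Return the installed transformers version, or None if it is absent."""
    try:
        return metadata.version("transformers")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_transformers_version()
    if found is None:
        print("transformers is not installed in this environment")
    elif is_pinned(found):
        print(f"transformers {found} matches the pin")
    else:
        print(f"transformers {found} found; this thread suggests {WANTED}")
```

The check compares for exact equality because the thread only names one known-good version; which other releases are affected is not stated here.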

TheForgotten69 commented 1 week ago

Thanks @mcmonkey4eva! Works perfectly. In case someone reads this ticket: StabilityMatrix has a Python Dependency Override where the transformers version should be specified as described above.