Open ComplexRobot opened 11 months ago
Hi @ComplexRobot,
Thank you for bringing this to my attention.
A couple of diagnostic questions:
Are you using the --disable-safe-unpickle commandline arg? On my device, this is necessary for loading the fastsam model; I'm not sure if it applies to clip_surgery as well.
Have you tried [txt2mask] outside of the [after] block? e.g. a prompt such as [txt2mask method=clip_surgery]face[/txt2mask]walter white face in img2img inpainting mode
I also released a small patch, v10.1.5, that addresses a few potentially related problems. Feel free to give it a try.
I can confirm that the clip_surgery and fastsam methods are working in this version, although I still find their performance quite disappointing compared to clipseg.
Due diligence
Describe the bug
Using a txt2mask method other than clipseg causes a crash. Same thing with zoomenhance. See: #145
Prompt
[after][txt2mask method=clip_surgery]face[/txt2mask][img2img][/after]a person
Log output
Unprompted version
10.1.4
WebUI version
1.6.0
Other comments
My googling of the error seems to suggest either an incompatible file or an incompatible version of torch. I changed the model file to one I've used with the Segment Anything extension successfully, and I get the same error. Updating the torch version breaks the webui with an error stating torch can't access the GPU and causes other dependency errors. I don't think it's a problem on my end, because I don't have problems with other extensions that use sam.
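Since the error may come down to the installed torch build, a quick report of the torch version and GPU visibility can help narrow it down. The snippet below is a minimal sketch, not part of Unprompted or the WebUI; the function name `torch_environment_report` is made up for illustration, and it should be run with the same Python environment the WebUI uses.

```python
import importlib


def torch_environment_report():
    """Collect the torch details that tend to matter for model-loading errors.

    Hypothetical helper for illustration only; not part of any extension.
    """
    report = {
        "torch_installed": False,
        "torch_version": None,
        "cuda_build": None,       # CUDA version torch was built against
        "cuda_available": None,   # whether torch can actually reach a GPU
    }
    try:
        torch = importlib.import_module("torch")
    except (ImportError, OSError):
        # torch is missing or its native libraries failed to load.
        return report

    report["torch_installed"] = True
    report["torch_version"] = torch.__version__
    report["cuda_build"] = torch.version.cuda  # None on CPU-only builds
    report["cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for key, value in torch_environment_report().items():
        print(f"{key}: {value}")
```

Pasting this output into the issue alongside the traceback makes it easier to tell a version mismatch from a bad model file.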
I'm not too knowledgeable on the subject, but is segment anything designed to take a text prompt as input? In the segment anything extension, it uses groundingdino to take a text prompt, and the output from sam just tries to infer the object within the bounding boxes generated by groundingdino. But, sam itself isn't taking text, it only takes bounding boxes and points as input. Maybe that's just how that extension works?
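To make the two-stage flow described above concrete, here is a minimal sketch of it. Both stages are toy stand-ins written for illustration (`fake_text_detector` and `fake_box_segmenter` are hypothetical and do not call GroundingDINO, sam, or either extension); the point is only that the text prompt reaches the detector, while the segmenter sees nothing but boxes.

```python
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]   # (x0, y0, x1, y1) in pixel coordinates
Mask = List[List[bool]]           # mask[y][x] is True inside the object


def fake_text_detector(text: str, width: int, height: int) -> List[Box]:
    """Hypothetical stand-in for a text-conditioned detector.

    A real detector would locate `text` in the image; this one always
    returns a single box around the image centre.
    """
    return [(width // 4, height // 4, 3 * width // 4, 3 * height // 4)]


def fake_box_segmenter(box: Box, width: int, height: int) -> Mask:
    """Hypothetical stand-in for a box-prompted segmenter.

    A real segmenter would infer the object outline inside the box;
    this one simply fills the box. Note that it takes no text argument.
    """
    x0, y0, x1, y1 = box
    return [
        [x0 <= x < x1 and y0 <= y < y1 for x in range(width)]
        for y in range(height)
    ]


def text_to_masks(
    text: str,
    width: int,
    height: int,
    detect: Callable[[str, int, int], List[Box]] = fake_text_detector,
    segment: Callable[[Box, int, int], Mask] = fake_box_segmenter,
) -> List[Mask]:
    """Chain the two stages: text -> boxes -> one mask per box."""
    boxes = detect(text, width, height)   # the only stage that sees the text
    return [segment(box, width, height) for box in boxes]


if __name__ == "__main__":
    masks = text_to_masks("face", width=8, height=8)
    for row in masks[0]:
        print("".join("#" if inside else "." for inside in row))
```

Under this reading, a txt2mask method built on sam would need some text-aware stage in front of it; whether Unprompted's fastsam or clip_surgery methods work that way is not something this sketch can confirm.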