WASasquatch / was-node-suite-comfyui

An extensive node suite for ComfyUI with over 210 new nodes
MIT License
1.22k stars 177 forks source link

SAM nodes partially broken #61

Closed rvion closed 1 year ago

rvion commented 1 year ago

SAM nodes are partially broken

🔴 downloads in the wrong folder

for vit_b, sam loader node seems to be looking in ComfyUI\ComfyUI\models\sam instead of ComfyUI\models\sams where it downloads it

🔴 tensor size mismatch for vit_b and vit_l

if I fix paths manually for the Vit-B, I get things like

 size mismatch for image_encoder.blocks.10.mlp.lin2.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1280]).
        size mismatch for image_encoder.blocks.11.norm1.weight: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1280]).
        size mismatch for image_encoder.blocks.11.norm1.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1280]).
        size mismatch for image_encoder.blocks.11.attn.rel_pos_h: copying a param with shape torch.Size([127, 64]) from checkpoint, the shape in current model is torch.Size([27, 80]).

same with vit_l

🟢 got vit_h working with manual path fixing

WASasquatch commented 1 year ago

I fixed the path issues, but I am not sure about the tensor mismatch. @moorehousew do you know what this is about? I think it's the tensor to sam compatible tensor conversion, but also not sure why their models are working on different tensors for different models. Seems odd from a usage standpoint.

Only thing I gather from this is the error message suggests that there are some parameter sizes in the image_encoder module that do not match the sizes of the corresponding parameters in the checkpoint model. And we don't really have access to that code without making our own fork? Or?

It does look like it's saying it doesn't support 768 (H?) images, and expects 1280 (H?) in size.

Tengyang-Chen commented 1 year ago

cause in build_sam.py

build_sam = build_sam_vit_h

It always returns vit_h, that's why only vit_h is working. U can add some if ...else... and make model_size to control which one should be returned.

FYI model_size can be found at WAS_Node_Suite.py: line 10685, build_sam.py is in repos/SAM/segment_anything/build_sam.py. As we can see files within SAM is downloaded from another repo, so it's better to modify at WAS_Node_Suite.py: line 10726 rather than in build_sam.py from segment_anything import build_sam_vit_h, build_sam_vit_l, build_sam_vit_s
if ... else ... :)