schananas / grounded_sam_replicate

Implementation of Grounding DINO and Segment Anything that supports prompt-based masking, which is useful for programmatic inpainting.
https://replicate.com/schananas/grounded_sam
MIT License
34 stars, 18 forks

Cannot reshape tensor of 0 elements into shape [0, -1, 256, 256] because the unspecified dimension size -1 can be any value and is ambiguous #1

Open whenmoon opened 12 months ago

whenmoon commented 12 months ago

Hi! Both the web form and API calls to Replicate fail when I run this model: https://replicate.com/schananas/grounded_sam

Running prediction: 3341693a-59dd-4e9d-b89f-2ba89f3ab53c...
Traceback (most recent call last):
  File "/root/.pyenv/versions/3.10.13/lib/python3.10/site-packages/cog/server/worker.py", line 222, in _predict
    for r in result:
  File "/root/.pyenv/versions/3.10.13/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 43, in generator_context
    response = gen.send(None)
  File "/src/predict.py", line 78, in predict
    annotated_picture_mask, neg_annotated_picture_mask, mask, inverted_mask = run_grounding_sam(image,
  File "/src/grounded_sam.py", line 100, in run_grounding_sam
    neg_segmented_frame_masks = segment(image_source, sam_predictor, boxes=neg_detected_boxes)
  File "/src/grounded_sam.py", line 55, in segment
    masks, _, _ = sam_model.predict_torch(
  File "/root/.pyenv/versions/3.10.13/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/src/weights/segment-anything/segment_anything/predictor.py", line 229, in predict_torch
    low_res_masks, iou_predictions = self.model.mask_decoder(
  File "/root/.pyenv/versions/3.10.13/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/src/weights/segment-anything/segment_anything/modeling/mask_decoder.py", line 94, in forward
    masks, iou_pred = self.predict_masks(
  File "/src/weights/segment-anything/segment_anything/modeling/mask_decoder.py", line 144, in predict_masks
    masks = (hyper_in @ upscaled_embedding.view(b, c, h * w)).view(b, -1, h, w)
RuntimeError: cannot reshape tensor of 0 elements into shape [0, -1, 256, 256] because the unspecified dimension size -1 can be any value and is ambiguous

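
For readers unfamiliar with this error, here is a minimal pure-Python sketch of why a `-1` dimension cannot be inferred when another dimension is 0. The helper `resolve_view_shape` is hypothetical and only imitates the arithmetic a reshape has to do; it is not PyTorch's implementation. The channel count 4 in the first call is an arbitrary illustrative value.

```python
from math import prod


def resolve_view_shape(num_elements, shape):
    """Fill in a single -1 in `shape`, or raise if no unique value exists.

    Hypothetical helper that mimics the arithmetic behind a reshape;
    it is not part of PyTorch or of this repository.
    """
    # Product of all explicitly given dimensions (the -1 is skipped).
    known = prod(d for d in shape if d != -1)
    if -1 not in shape:
        if known != num_elements:
            raise ValueError("shape does not match the number of elements")
        return tuple(shape)
    if known == 0:
        # 0 * x == 0 holds for every x, so there is no unique answer for -1.
        raise ValueError(
            f"cannot reshape {num_elements} elements into {list(shape)}: "
            "the -1 dimension is ambiguous"
        )
    if num_elements % known != 0:
        raise ValueError("shape does not match the number of elements")
    return tuple(num_elements // known if d == -1 else d for d in shape)


# Batch of one box: the -1 resolves to a unique value.
print(resolve_view_shape(1 * 4 * 256 * 256, [1, -1, 256, 256]))  # (1, 4, 256, 256)

# Batch of zero boxes: same situation as the shape in the error message.
try:
    resolve_view_shape(0, [0, -1, 256, 256])
except ValueError as err:
    print(err)
```

In the traceback the leading dimension `b` is 0, which suggests (but does not prove) that the mask decoder received an empty batch of boxes.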
whenmoon commented 12 months ago

Having tested this again, it works only with the default photo on Replicate; with any other photo, it fails. I've tried different formats, color profiles, sizes, and compression levels, but I always hit this error, whether via the Replicate web form or the Replicate API. I guess it has something to do with the source image, but I can't figure out what. Any help much appreciated!

schananas commented 12 months ago

Thanks for reporting, I'll check it in the following days.

whenmoon commented 11 months ago

> Thanks for reporting, I'll check it in the following days.

Ok thanks! 👌🏽

smyja commented 11 months ago

I just tried it, same problem.

smyja commented 11 months ago

> Having tested this again, it works only with the default photo on Replicate; with any other photo, it fails. I've tried different formats, color profiles, sizes, and compression levels, but I always hit this error, whether via the Replicate web form or the Replicate API. I guess it has something to do with the source image, but I can't figure out what. Any help much appreciated!

I think I have figured out the problem: you have to define the correct negative prompt (the items you don't want masked).

whenmoon commented 11 months ago

> > Having tested this again, it works only with the default photo on Replicate; with any other photo, it fails. I've tried different formats, color profiles, sizes, and compression levels, but I always hit this error, whether via the Replicate web form or the Replicate API. I guess it has something to do with the source image, but I can't figure out what. Any help much appreciated!

> I think I have figured out the problem: you have to define the correct negative prompt (the items you don't want masked).

Hi, I'm not sure, because the script has a default negative prompt, and running it on Replicate doesn't fail if you remove it.
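
One reading of the traceback that would fit both observations: the failing call is `segment(..., boxes=neg_detected_boxes)`, and the reshape error reports a batch size of 0, so the negative prompt may simply have matched nothing in the image. That is an assumption, not something confirmed in this thread. If it holds, a guard along the following lines would avoid the crash. `segment_if_any` and `fake_segment` are hypothetical names for illustration, not functions from this repository.

```python
def segment_if_any(boxes, segment_fn):
    """Call segment_fn(boxes) only when at least one box was detected.

    Hypothetical guard: returns None for an empty detection result instead
    of forwarding an empty batch to the segmentation model.
    len() works for lists here and would also work for a tensor of boxes.
    """
    if len(boxes) == 0:
        return None
    return segment_fn(boxes)


def fake_segment(boxes):
    # Stand-in for the real segmentation call; returns one label per box.
    return [f"mask for {box}" for box in boxes]


# Negative prompt matched nothing: segmentation is skipped.
print(segment_if_any([], fake_segment))  # None

# Negative prompt matched one box: segmentation runs as usual.
print(segment_if_any([(10, 20, 110, 220)], fake_segment))
```

The caller would then need to treat `None` as "nothing to subtract from the positive mask".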