luca-medeiros / lang-segment-anything

SAM with text prompt
Apache License 2.0
1.53k stars, 167 forks

TypeError: predict() got an unexpected keyword argument 'remove_combined' #64

Open kulkarnikeerti opened 2 months ago

kulkarnikeerti commented 2 months ago

Hello,

First of all, thanks for the great work. I ran python3 running_test.py and got the following error: TypeError: predict() got an unexpected keyword argument 'remove_combined'

Could you please suggest a possible solution? Thanks in advance.

@healthonrails @kauevestena @mutusfa @siddharthksah @dolhasz

Update: The script works fine without this argument. Could you please let me know how this argument influences the output of the model?

kauevestena commented 2 months ago

Thanks for the interest!

But I couldn't find this argument in the code; in which line does that appear?

kulkarnikeerti commented 2 months ago

https://github.com/luca-medeiros/lang-segment-anything/blob/831cdc10906a23aabe4591a4be06f4c989e4ee30/lang_sam/lang_sam.py#L98

Please find the line here

kauevestena commented 2 months ago

The function signature sets its default to False, so it doesn't seem to make a difference:

https://github.com/IDEA-Research/GroundingDINO/blob/df5b48a3efbaa64288d8d0ad09b748ac86f22671/groundingdino/util/inference.py#L53

luca-medeiros commented 2 months ago

You may need to upgrade GroundingDINO (gdino) to the latest version to solve the issue.

bryanbocao commented 1 month ago

@kulkarnikeerti

I got the same error but have not upgraded GroundingDINO yet. A quick workaround is to comment out the remove_combined argument (I have not tested for side effects):

    def predict_dino(self, image_pil, text_prompt, box_threshold, text_threshold):
        image_trans = transform_image(image_pil)
        boxes, logits, phrases = predict(model=self.groundingdino,
                                         image=image_trans,
                                         caption=text_prompt,
                                         box_threshold=box_threshold,
                                         text_threshold=text_threshold,
                                         # remove_combined=self.return_prompts,
                                         device=self.device)
        W, H = image_pil.size
        boxes = box_ops.box_cxcywh_to_xyxy(boxes) * torch.Tensor([W, H, W, H])

        return boxes, logits, phrases
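
If editing the call by hand is not an option, another possible approach is to check at runtime whether the installed predict() declares remove_combined and to pass it only when it does. The sketch below is only an illustration of that idea: call_with_supported_kwargs, predict_old and predict_new are hypothetical names made up here, and the two stand-in functions merely mimic an older and a newer signature; they are not GroundingDINO's actual code.

```python
import inspect


def call_with_supported_kwargs(func, **kwargs):
    """Call func, dropping any keyword arguments its signature does not declare."""
    params = inspect.signature(func).parameters
    # If func itself takes **kwargs, every keyword can be passed through unchanged.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return func(**kwargs)
    supported = {name: value for name, value in kwargs.items() if name in params}
    return func(**supported)


# Hypothetical stand-in for an older predict() without remove_combined.
def predict_old(caption, device="cpu"):
    return ("old", caption, device)


# Hypothetical stand-in for a newer predict() that accepts remove_combined.
def predict_new(caption, remove_combined=False, device="cpu"):
    return ("new", caption, remove_combined, device)


# The same call works against both signatures; the unsupported keyword is dropped.
old_result = call_with_supported_kwargs(predict_old, caption="cat", remove_combined=True)
new_result = call_with_supported_kwargs(predict_new, caption="cat", remove_combined=True)
```

Note that silently dropping an argument hides the version mismatch rather than fixing it, so upgrading GroundingDINO as suggested above is probably still the cleaner fix.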