luca-medeiros / lang-segment-anything

SAM with text prompt
Apache License 2.0
1.53k stars 167 forks

Fix possible pytorch exception #60

Open StevePotter opened 3 months ago

StevePotter commented 3 months ago

Describe your changes and approach used

I did not file an issue, and instead just made a PR when I encountered the problem.

I am using this great model in conjunction with a few other PyTorch models. Running LangSAM first and then the other model worked fine. But when I ran LangSAM after the other model, it failed with the following stack trace:

    masks, _, phrases, _ = self.langSam.predict(rbg_image, segment_prompt)
  File "/opt/conda/envs/my/lib/python3.8/site-packages/lang_sam/lang_sam.py", line 119, in predict
    boxes, logits, phrases = self.predict_dino(image_pil, text_prompt, box_threshold, text_threshold)
  File "/opt/conda/envs/my/lib/python3.8/site-packages/lang_sam/lang_sam.py", line 102, in predict_dino
    boxes = box_ops.box_cxcywh_to_xyxy(boxes) * torch.Tensor([W, H, W, H])
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

I did some investigation. The problem is that the boxes returned by the GroundingDINO predict function are on the CPU, while the tensor created by the torch.Tensor([W, H, W, H]) call ended up on CUDA, so the multiplication mixes devices. The fix is simply to use the device from boxes in the torch.Tensor call.
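A minimal sketch of the idea, assuming the scaling step looks like the line in the stack trace. Here `box_cxcywh_to_xyxy` is a local stand-in for the GroundingDINO `box_ops` helper, `scale_boxes` is a hypothetical name used only for illustration, and `torch.tensor` with an explicit `device` is used in place of the `torch.Tensor` constructor. The actual diff in this PR may differ.

```python
import torch


def box_cxcywh_to_xyxy(boxes: torch.Tensor) -> torch.Tensor:
    # Local stand-in for the box_ops helper: converts
    # (center_x, center_y, width, height) to (x_min, y_min, x_max, y_max).
    cx, cy, w, h = boxes.unbind(-1)
    return torch.stack(
        [cx - 0.5 * w, cy - 0.5 * h, cx + 0.5 * w, cy + 0.5 * h], dim=-1
    )


def scale_boxes(boxes: torch.Tensor, W: int, H: int) -> torch.Tensor:
    # Hypothetical helper: build the scale factor on the same device (and
    # dtype) as `boxes`, so the multiplication never mixes CPU and CUDA
    # tensors regardless of the process-wide default device.
    scale = torch.tensor([W, H, W, H], dtype=boxes.dtype, device=boxes.device)
    return box_cxcywh_to_xyxy(boxes) * scale


# Normalized box centered in the image, scaled to a 100x200 (W x H) image.
boxes = torch.tensor([[0.5, 0.5, 0.2, 0.4]])
pixel_boxes = scale_boxes(boxes, W=100, H=200)
```

Taking the device from `boxes` rather than hard-coding `"cpu"` or `"cuda"` keeps the function correct whichever device the detector returns its boxes on.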

I tested and it works fine. Now I can use multiple models in the same process without trouble. Thanks!

Checklist before requesting a review