ItayElam / SegmentAnything-TensorRT

Issue with Direct Boxes Inputs and Error Encountered #1

Closed Liu-Jinxin closed 1 year ago

Liu-Jinxin commented 1 year ago

First and foremost, I'd like to thank the author for providing the acceleration code. Does the provided code support direct box inputs? As far as I can tell, it should be able to handle box inputs directly, but in several attempts I ran into an error. Were any modifications made during the acceleration process that might have affected this functionality?

Below is the error log I received:

/root/miniconda3/envs/ros/lib/python3.9/site-packages/torch/functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /opt/conda/conda-bld/pytorch_1682343964576/work/aten/src/ATen/native/TensorShape.cpp:3483.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
final text_encoder_type: bert-base-uncased
_IncompatibleKeys(missing_keys=[], unexpected_keys=['label_enc.weight', 'bert.embeddings.position_ids'])
/root/miniconda3/envs/ros/lib/python3.9/site-packages/transformers/modeling_utils.py:909: FutureWarning: The `device` argument is deprecated and will be removed in v5 of Transformers.
  warnings.warn(
/root/miniconda3/envs/ros/lib/python3.9/site-packages/torch/utils/checkpoint.py:31: UserWarning: None of the inputs have requires_grad=True. Gradients will be None
  warnings.warn("None of the inputs have requires_grad=True. Gradients will be None")
[09/18/2023-14:44:57] [TRT] [W] CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage and speed up TensorRT initialization. See "Lazy Loading" section of CUDA documentation https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#lazy-loading
Traceback (most recent call last):
  File "/home/appusr/semantic_pointcloud_ws/src/grounded_sam/script/sam_tensorrt_test.py", line 259, in <module>
    masks, _, _ = predictor.predict(
  File "/root/miniconda3/envs/ros/lib/python3.9/site-packages/segment_anything/predictor.py", line 154, in predict
    box = self.transform.apply_boxes(box, self.original_size)
  File "/root/miniconda3/envs/ros/lib/python3.9/site-packages/segment_anything/utils/transforms.py", line 52, in apply_boxes
    boxes = self.apply_coords(boxes.reshape(-1, 2, 2), original_size)
  File "/root/miniconda3/envs/ros/lib/python3.9/site-packages/segment_anything/utils/transforms.py", line 42, in apply_coords
    coords = deepcopy(coords).astype(float)
AttributeError: 'Tensor' object has no attribute 'astype'

ItayElam commented 1 year ago

Thank you so much for your kind words and for bringing this to my attention. I truly appreciate your feedback.

To clarify, did you use the "box" argument when calling the predict function? I will look into the issue as it should definitely support that.

Again, thanks for reaching out, and I'll do my best to resolve this as soon as possible.

Liu-Jinxin commented 1 year ago

Thanks for your reply. I use the following code to run inference.

    masks, _, _ = predictor.predict(
        point_coords = None,
        point_labels = None,
        box = transformed_boxes.numpy(),
        multimask_output = False,
    )

ItayElam commented 1 year ago

Thank you for the snippet. I'll take a look at it and update you as soon as I can.

ItayElam commented 1 year ago

I tried for a while to reproduce your issue without success, until I ran into some unexpected behavior that I can't explain. It appeared once I added bounding box support to the infer function, which produced the error you mentioned.

    if input_data.shape[-1] == 4:
        input_box = input_data
        if isinstance(input_box, torch.Tensor):
            input_box = input_box.numpy()
        input_label = None
        input_point = None
    else:
        input_box = None
        input_point = input_data
    masks, scores, logits = self.predictor.predict(point_coords=input_point,
                                                   point_labels=input_label,
                                                   box=input_box,
                                                   multimask_output=False)

While investigating, I found that the isinstance check against torch.Tensor in the snippet above fails. I can't explain this at the moment, but once I moved the conversion ahead of the shape check, like this:

    if isinstance(input_data, torch.Tensor):
        input_data = input_data.numpy()
    if input_data.shape[-1] == 4:
        input_box = input_data
        input_label = None
        input_point = None
    else:
        input_box = None
        input_point = input_data
    masks, scores, logits = self.predictor.predict(point_coords=input_point,
                                                   point_labels=input_label,
                                                   box=input_box,
                                                   multimask_output=False)

Everything worked. I suggest saving transformed_boxes.numpy() to a variable and ensuring it's indeed a numpy array before passing it to predict.
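
A minimal sketch of that check, assuming only NumPy is installed. The helper name to_numpy_boxes is hypothetical and is not part of segment_anything or this repository. It converts anything tensor-like through duck typing rather than isinstance, then verifies that the result really is a NumPy array whose last dimension is 4:

```python
import numpy as np


def to_numpy_boxes(boxes):
    """Return `boxes` as a float NumPy array whose last dimension is 4.

    Hypothetical helper for illustration only.
    """
    # Duck-typed tensor check: torch.Tensor provides detach(), cpu() and numpy().
    # This sidesteps the isinstance check that behaved unexpectedly above.
    if hasattr(boxes, "detach"):
        boxes = boxes.detach().cpu().numpy()

    # np.asarray leaves an existing float64 array untouched and converts lists.
    boxes = np.asarray(boxes, dtype=float)

    # Same box test the infer function uses: four values per box.
    if boxes.ndim == 0 or boxes.shape[-1] != 4:
        raise ValueError(f"expected 4 values per box, got shape {boxes.shape}")
    return boxes


# Save the converted boxes to a variable and confirm the type before predict.
box_array = to_numpy_boxes([[10, 20, 110, 220]])
assert isinstance(box_array, np.ndarray)
```

With something like this in place, the call would pass box=box_array instead of converting inline, so a leftover tensor is caught before it reaches apply_coords.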

Please let me know whether this solves the issue or if you need any more help.

I've also uploaded the updated code to support bounding boxes. Thank you!

Liu-Jinxin commented 1 year ago

Thank you so much for the detailed explanation and assistance! I've implemented the changes you suggested, and it seems to have resolved the issue.

ItayElam commented 1 year ago

I'm pleased to hear that! If you have any further questions or face any other issues in the future, don't hesitate to reach out. Thank you!