automatic_label_simple_demo.py RuntimeError: cannot reshape tensor of 0 elements into shape [0, -1, 256, 256] because the unspecified dimension size -1 can be any value and is ambiguous #423
Hi, I got the following error:
```
Traceback (most recent call last):
  File "automatic_label_ramdemo.py", line 303, in <module>
    masks, _, _ = predictor.predict_torch(
  File "/home/anaconda3/envs/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/segment-anything-main/segment_anything/predictor.py", line 229, in predict_torch
    low_res_masks, iou_predictions = self.model.mask_decoder(
  File "/home/anaconda3/envs/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/segment-anything-main/segment_anything/modeling/mask_decoder.py", line 94, in forward
    masks, iou_pred = self.predict_masks(
  File "/home/segment-anything-main/segment_anything/modeling/mask_decoder.py", line 144, in predict_masks
    masks = (hyper_in @ upscaled_embedding.view(b, c, h * w)).view(b, -1, h, w)
RuntimeError: cannot reshape tensor of 0 elements into shape [0, -1, 256, 256] because the unspecified dimension size -1 can be any value and is ambiguous
```
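The reshape failure itself is easy to reproduce in isolation: with zero boxes, the batch dimension `b` is 0, so the tensor has 0 elements and PyTorch cannot infer the `-1` dimension. A minimal sketch (the shapes below are stand-ins for the real decoder tensors):

```python
import torch

# With 0 boxes, the decoder's intermediate tensor has batch size 0,
# so view(0, -1, 256, 256) cannot infer the -1 dimension and raises.
b, h, w = 0, 256, 256
masks = torch.zeros(b, 4, h * w)  # stand-in for hyper_in @ upscaled_embedding
try:
    masks.view(b, -1, h, w)
except RuntimeError as e:
    print(e)  # cannot reshape tensor of 0 elements ...
```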
After running automatic_label_simple_demo.py, `boxes_filt` and `logits_filt` inside `get_grounding_output` are empty tensors (`tensor([], size=(0, 255))`). The call site is:

```
boxes_filt, scores, pred_phrases = get_grounding_output(
    model, image, tags, box_threshold, text_threshold, device=device
)
print(f"Before NMS: {boxes_filt.shape[0]} boxes")  # --> 0 boxes
nms_idx = torchvision.ops.nms(boxes_filt, scores, iou_threshold).numpy().tolist()
boxes_filt = boxes_filt[nms_idx]
pred_phrases = [pred_phrases[idx] for idx in nms_idx]
print(f"After NMS: {boxes_filt.shape[0]} boxes")  # --> 0 boxes
```
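Since the crash only happens because an empty box tensor reaches SAM's mask decoder, one possible workaround (an assumption on my part, not a fix for the root cause of zero detections) is to guard the call and skip the image when nothing survives the thresholds:

```python
import torch

def has_detections(boxes_filt: torch.Tensor) -> bool:
    """True if at least one box survived the threshold/NMS filtering."""
    return boxes_filt.shape[0] > 0

# Skip predictor.predict_torch(...) entirely when the result is empty:
if not has_detections(torch.zeros(0, 4)):
    print("no boxes detected; skipping SAM for this image")
```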
----------- more information -------------------------
There are many `-inf` values in the output of the grounding model:

```
with torch.no_grad():
    outputs = model(image[None], captions=[caption])
```

where `caption` is 'building, person, illuminate, man, neon light, night, night view, red, retail, sign, signage, store, storefront, writing.'
```
>>> outputs["pred_logits"].cpu()
tensor([[[-4.1339, -3.2799, -4.7899,  ...,    -inf,    -inf,    -inf],
         [-4.1563, -3.3208, -4.8095,  ...,    -inf,    -inf,    -inf],
         [-4.1277, -3.3192, -4.7960,  ...,    -inf,    -inf,    -inf],
         ...,
         [-4.2636, -3.5951, -4.8640,  ...,    -inf,    -inf,    -inf],
         [-4.2648, -3.5952, -4.8737,  ...,    -inf,    -inf,    -inf],
         [-4.3287, -3.7804, -4.9183,  ...,    -inf,    -inf,    -inf]]])

>>> outputs["pred_logits"].cpu().sigmoid()[0]
tensor([[0.0158, 0.0363, 0.0082,  ..., 0.0000, 0.0000, 0.0000],
        [0.0154, 0.0349, 0.0081,  ..., 0.0000, 0.0000, 0.0000],
        [0.0159, 0.0349, 0.0082,  ..., 0.0000, 0.0000, 0.0000],
        ...,
        [0.0139, 0.0267, 0.0077,  ..., 0.0000, 0.0000, 0.0000],
        [0.0139, 0.0267, 0.0076,  ..., 0.0000, 0.0000, 0.0000],
        [0.0130, 0.0223, 0.0073,  ..., 0.0000, 0.0000, 0.0000]])

>>> outputs["pred_boxes"].cpu()
tensor([[[0.4588, 0.6766, 0.0010, 0.0010],
         [0.5302, 0.6559, 0.0010, 0.0010],
         [0.5568, 0.6461, 0.0010, 0.0010],
         ...,
         [0.8817, 0.3877, 0.0010, 0.0010],
         [0.1955, 0.2195, 0.0010, 0.0010],
         [0.2651, 0.5368, 0.0010, 0.0010]]])
```
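For what it's worth, the `-inf` columns are not themselves the problem: sigmoid(-inf) is exactly 0, and those columns typically correspond to padded text-token positions. The real symptom is that even the finite logits are far below the box threshold, so no query passes the filter. A small illustration (the 0.25 threshold is an assumed demo default):

```python
import torch

# Finite logits around -3 to -4 give sigmoid scores of only ~0.01-0.04,
# and the -inf padding columns give exactly 0 -- nothing clears 0.25.
logits = torch.tensor([[-4.1339, -3.2799, float("-inf")]])
scores = logits.sigmoid()
box_threshold = 0.25  # assumed demo default
keep = scores.max(dim=1).values > box_threshold
print(scores)
print(keep)  # tensor([False]) -> 0 boxes survive the filter
```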
--------------- These are the models that I used: --------------------------

```
parser.add_argument(
    "--ram_checkpoint", type=str, default="./models/ram_swin_large_14m.pth",
    help="path to checkpoint file",
)
parser.add_argument(
    "--grounded_checkpoint", type=str, default="./models/groundingdino_swint_ogc.pth",
    help="path to checkpoint file",
)
parser.add_argument(
    "--sam_checkpoint", type=str, default="./models/sam_vit_h_4b8939.pth",
    help="path to checkpoint file",
)
parser.add_argument(
    "--sam_hq_checkpoint", type=str, default=None,
    help="path to sam-hq checkpoint file",
)
parser.add_argument(
    "--use_sam_hq", default=False, action="store_true",
    help="using sam-hq for prediction",
)
```