stevezkw1998 closed this issue 7 months ago
```
PyTorch version: 2.0.1
CUDA is available: True
Loading model..
<All keys matched successfully>
Loading model.. Done
Start Inferencing..
type(input_boxes):
<class 'numpy.ndarray'>
input_boxes:
[[ 308  298  243  601]
 [  13  327  216  632]
 [ 557  144  187 1552]
 [ 174  286  142  459]
 [ 670  405   73  829]
 [ 785  194   88  160]
 [   0  327   52  437]
 [ 754  218   43  131]
 [ 347  261   40   63]
 [  33  251   71   96]
 [ 105  259   85  119]
 [   0  194   48  141]
 [ 181  202   77   92]]
Traceback (most recent call last):
  File "/root/inference.py", line 106, in <module>
    output_results = {
  File "/root/inference.py", line 107, in <dictcomp>
    path: model.inference(path, predictions[path])
  File "/root/inference.py", line 38, in inference
    masks, scores, logits = self.predictor.predict(
  File "/root/sam-hq/segment_anything/predictor.py", line 157, in predict
    masks, iou_predictions, low_res_masks = self.predict_torch(
  File "/opt/conda/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/root/sam-hq/segment_anything/predictor.py", line 227, in predict_torch
    sparse_embeddings, dense_embeddings = self.model.prompt_encoder(
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/sam-hq/segment_anything/modeling/prompt_encoder.py", line 159, in forward
    sparse_embeddings = torch.cat([sparse_embeddings, box_embeddings], dim=1)
RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 1 but got size 13 for tensor number 1 in the list.
```
My code is shown here:
```python
import cv2
import numpy as np

image = cv2.imread(image_path)
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
h, w, _ = image.shape  # cv2 images are (height, width, channels)

bboxes = prediction["bboxes"]  # normalized x1, y1, x2, y2
bboxes = np.asarray([b[:4] for b in bboxes], dtype=np.float32)
bboxes[:, (0, 2)] *= w  # scale x coordinates to pixels
bboxes[:, (1, 3)] *= h  # scale y coordinates to pixels
bboxes[:, 0:2] = np.ceil(bboxes[:, 0:2])
bboxes[:, 2:4] = np.floor(bboxes[:, 2:4])
bboxes = bboxes.astype(int)

input_boxes = bboxes  # shape (N, 4)
print("type(input_boxes):")
print(type(input_boxes))
print("input_boxes:")
print(input_boxes)
# input_boxes = np.array([[4, 13, 1007, 1023]])

input_point, input_label = None, None
self.predictor.set_image(image)
masks, scores, logits = self.predictor.predict(
    point_coords=input_point,
    point_labels=input_label,
    box=input_boxes,
    multimask_output=False,
    hq_token_only=False,
)
```
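For context, the mismatch happens because `predict` treats `box` as a single prompt: a `(13, 4)` array produces box embeddings with batch dimension 13 while the (empty) point embeddings keep batch dimension 1, so the `torch.cat` inside the prompt encoder fails. A minimal sketch of a single-box call that `predict` does accept, under the same setup as above:

```python
# `box` must be one XYXY box, shape (4,) or (1, 4) -- not a (13, 4) batch.
masks, scores, logits = self.predictor.predict(
    point_coords=None,
    point_labels=None,
    box=input_boxes[0:1],  # first box only, shape (1, 4)
    multimask_output=False,
    hq_token_only=False,
)
```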
I should have used this code instead: https://github.com/SysCV/sam-hq/blob/main/demo/demo_hqsam.py#L129
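For reference, the batched-box path in that demo goes through `predict_torch` with boxes transformed by `apply_boxes_torch`, rather than through `predict`. A rough sketch under the same setup as the snippet above (`self.predictor`, `image`, and the integer `bboxes` array are assumed to exist already):

```python
import torch

# Batch of N boxes: transform them to the model's input frame,
# then call predict_torch instead of predict.
input_boxes = torch.tensor(bboxes, dtype=torch.float, device=self.predictor.device)
transformed_boxes = self.predictor.transform.apply_boxes_torch(
    input_boxes, image.shape[:2]
)
masks, scores, logits = self.predictor.predict_torch(
    point_coords=None,
    point_labels=None,
    boxes=transformed_boxes,
    multimask_output=False,
    hq_token_only=False,
)
# masks has shape (N, 1, H, W): one mask per input box.
```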