Wrong masks for 512 512 image

TOOFACK commented 5 months ago

Hello, I am using the example from https://github.com/facebookresearch/segment-anything/blob/main/notebooks/predictor_example.ipynb

The only change is that I resize the image to (512, 512) first. Here is a code snippet to reproduce:

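import cv2
import matplotlib.pyplot as plt
import numpy as np

from predictor import SamPredictor
# sam_model_registry is provided by this repository (its import was not shown in the original post).
# show_mask and show_points are the helper functions defined in predictor_example.ipynb.

# Build SAM with a custom 512-pixel input size.
sam = sam_model_registry['vit_h'](checkpoint='/weights/sam/sam_vit_h_4b8939(1).pth', custom_img_size=512).cuda()
predictor = SamPredictor(sam)

# Load the image as RGB and resize it to 512x512 (this does not preserve the aspect ratio).
image = cv2.imread('/storage/share/paul/sam_onnx_export/truck.jpg')
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
image = cv2.resize(image, (512, 512), interpolation=cv2.INTER_LINEAR)

predictor.set_image(image)

# A single foreground point (label 1), given as (x, y) in the resized image.
input_point = np.array([[150, 150]])
input_label = np.array([1])

masks, scores, logits = predictor.predict(
    point_coords=input_point,
    point_labels=input_label,
    multimask_output=True,
)

# Plot each candidate mask with its score on top of the resized image.
for i, (mask, score) in enumerate(zip(masks, scores)):
    plt.figure(figsize=(10, 10))
    plt.imshow(image)
    show_mask(mask, plt.gca())
    show_points(input_point, input_label, plt.gca())
    plt.title(f"Mask {i+1}, Score: {score:.3f}", fontsize=18)
    plt.axis('off')
    plt.show()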

When I run this code, I get images like this:

[Image: Screenshot from 2024-01-30 12-17-11]

You can see that the point is located inside the window, but for some reason the segmentation goes wrong. What am I doing wrong?
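One thing worth ruling out when reproducing this: a point picked on the original image has to be rescaled before it is used on the resized one. The sketch below shows that mapping with plain NumPy. The helper name `scale_point_coords` and the example image size are hypothetical and are not part of this repository or of the official SAM code.

```python
import numpy as np


def scale_point_coords(points_xy, original_hw, resized_hw):
    """Map (x, y) pixel coordinates from the original image to the resized image.

    Hypothetical helper for illustration only; not part of SAM or this repository.
    """
    points_xy = np.asarray(points_xy, dtype=np.float64)
    old_h, old_w = original_hw
    new_h, new_w = resized_hw
    # x scales with the width ratio, y scales with the height ratio.
    scale = np.array([new_w / old_w, new_h / old_h])
    return points_xy * scale


# Assumed example: an image of height 1200 and width 1800 resized to 512x512.
original_point = np.array([[500, 375]])
resized_point = scale_point_coords(original_point, (1200, 1800), (512, 512))
print(resized_point)
```

If the point is already chosen directly on the 512x512 image, as with [150, 150] in the snippet above, no rescaling is needed and the cause lies elsewhere.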

ByungKwanLee commented 5 months ago

The code in the official Meta SAM repository and my code are somewhat different. It will work if you run my example code.

TOOFACK commented 5 months ago

Hm, I see. So the only way to work with a different img_size is to use sam.individual_forward, correct?

ByungKwanLee commented 5 months ago

Yes, correct! It would be better for you to modify my code to fit your purpose.

ByungKwanLee / Full-Segment-Anything

Wrong masks for 512 512 image #4