dbolya / yolact

A simple, fully convolutional model for real-time instance segmentation.
MIT License

how to test on a single image and get masks? #39

Closed sporterman closed 5 years ago

dbolya commented 5 years ago

From the readme:

python eval.py --trained_model=weights/yolact_base_54_800000.pth --score_threshold=0.3 --top_k=100 --image=my_image.png

Or do you want to export the masks as a numpy array or something?

sporterman commented 5 years ago

@dbolya Thanks. I want to get the masks of the people in a picture, as a numpy array.

dbolya commented 5 years ago

On this line: https://github.com/dbolya/yolact/blob/a70b68dd70aac5a1f41789771a66fb33adba2809/eval.py#L150

Add

np.save('masks.npy', masks.cpu().numpy())

The masks array will have shape num_detections x im_h x im_w. If you want to export the class indices too, do the same with classes but leave out .cpu().numpy() (classes is already a numpy array at that point, a vector of size num_detections). Then run the command above and it will call this function. If you don't want the image to be displayed afterwards, you can call exit() right after the line you added.

If you want a cleaner solution, lmk. This is kind of hacky.
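
For the original question (only the person masks), a minimal sketch of the filtering step could look like the following. It assumes the class indices were saved alongside the masks as described above, and that index 0 corresponds to "person" in the class list the model was trained with; that index is an assumption and should be checked against the dataset config. The function name select_class_masks is hypothetical and not part of yolact.

```python
import numpy as np

# Assumption: "person" is the first entry in the dataset's class list.
# Verify this against the config used for training before relying on it.
PERSON_CLASS_INDEX = 0


def select_class_masks(masks, classes, class_index):
    """Return only the masks whose detection belongs to class_index.

    masks:   array of shape (num_detections, im_h, im_w)
    classes: array of shape (num_detections,)
    """
    masks = np.asarray(masks)
    classes = np.asarray(classes)
    if masks.shape[0] != classes.shape[0]:
        raise ValueError("masks and classes must have the same number of detections")
    # Boolean indexing along the first axis keeps the matching detections.
    return masks[classes == class_index]
```

With the saved files, usage would be along the lines of select_class_masks(np.load('masks.npy'), np.load('classes.npy'), PERSON_CLASS_INDEX), where classes.npy is a hypothetical file name for the exported class vector. The result has shape (0, im_h, im_w) when no person was detected.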

sporterman commented 5 years ago

Much appreciated, thanks!

majinshaoyuindustry commented 4 years ago

Please provide a cleaner solution if possible. I am trying to stream the mask (especially of humans) into programs such as TouchDesigner and manipulate it there (e.g. with an invisible shader in GLSL). I added the np.save line suggested earlier in this thread and used a simple pyplot function to view the result as an image, but it failed:

import numpy as np
from matplotlib import pyplot as plt
img_array = np.load("masks.npy")
plt.imshow(img_array, cmap="gray")
plt.show()

TypeError: Invalid shape (1, 480, 640) for image data

Any suggestions please?

Thank you in advance!

dbolya commented 4 years ago

@majinshaoyuindustry matplotlib expects either a 2d image or a 3d image with the channels as the last dimension. Your array has the detection index first (1 x 480 x 640), so just use plt.imshow(img_array[0], cmap="gray") instead.

And yeah, I agree these hacks are not very pretty. We're working on a cleaner API (#323)
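
When there is more than one detection, indexing with [0] shows only the first mask. A rough sketch of one way to collapse the whole stack into a single 2d image that imshow accepts is below. The helper name masks_to_image is hypothetical, and the 0.5 threshold assumes the saved masks hold values in the 0 to 1 range.

```python
import numpy as np


def masks_to_image(masks, threshold=0.5):
    """Merge a (num_detections, im_h, im_w) stack into one 2d uint8 image.

    Pixels covered by at least one mask become 255, all others 0.
    """
    masks = np.asarray(masks)
    if masks.ndim == 2:
        # Already a single 2d mask: treat it as a stack of one.
        masks = masks[np.newaxis]
    if masks.ndim != 3:
        raise ValueError("expected an array of shape (num_detections, im_h, im_w)")
    if masks.shape[0] == 0:
        # No detections: return an empty (all black) image.
        return np.zeros(masks.shape[1:], dtype=np.uint8)
    # Union of all masks: a pixel is set if any detection covers it.
    combined = masks.max(axis=0)
    return (combined > threshold).astype(np.uint8) * 255
```

The returned array is 2d, so plt.imshow(masks_to_image(img_array), cmap="gray") should display it without the shape error.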

majinshaoyuindustry commented 4 years ago

Thanks, that worked out pretty well. One more question: if I want to export this mask in real time from a webcam, is there an easy way to create a buffer? For example, one that keeps only 30 frames in the folder, with later frames overwriting the earlier ones.

dbolya commented 4 years ago

@majinshaoyuindustry The webcam part is already implemented (see the readme for how to run off a webcam), but you'll have to implement the frame buffer part yourself.

Here's where I save the frames: https://github.com/dbolya/yolact/blob/f54b0a5b17a7c547e92c4d7026be6542f43862e7/eval.py#L739

So you can just replace that line with something special for your buffer.
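
As an illustration of what such a buffer might look like, here is a small sketch of a fixed-size ring buffer on disk: frame number n is written to slot n modulo the capacity, so the folder never holds more than 30 files and newer frames overwrite older ones. The class MaskRingBuffer, its file naming scheme, and the choice of .npy files are all assumptions made for this example; none of it comes from yolact.

```python
import os

import numpy as np


class MaskRingBuffer:
    """Keep at most `capacity` mask files in `directory`, overwriting the oldest."""

    def __init__(self, directory, capacity=30):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.directory = directory
        self.capacity = capacity
        self.count = 0  # total number of frames written so far
        os.makedirs(directory, exist_ok=True)

    def write(self, masks):
        """Save one frame's masks and return the path that was written."""
        # The slot index wraps around, so frame `capacity` reuses slot 0.
        slot = self.count % self.capacity
        path = os.path.join(self.directory, f"masks_{slot:03d}.npy")
        np.save(path, np.asarray(masks))
        self.count += 1
        return path
```

A buffer created once before the webcam loop could then have write(masks.cpu().numpy()) called per frame in place of the line that saves the frame. Note that a separate program reading the folder might see a file while it is still being written; this sketch does not guard against that.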