Hello,
If you can reproduce it on one of the functions already located in our repository, we will consider it a bug.
Unfortunately, I was not able to. If possible, I would like to change the label to a documentation issue.
This seems to be related to #6332. Any chance you know in which piece of code the mask is parsed and applied to the input image?
I'm finding that an input block mask (mask[:500, :500] = 1) returns a masked image that appears to suffer from some kind of anti-aliasing:
Be sure you are sending the correct borders: [0, 0, 500, 500] would be incorrect in this case; [0, 0, 499, 499] is correct.
The borders are correct and are determined by torchvision.ops.masks_to_boxes.
Docker log output shows:
Borders: [0, 0, 499, 499]
Mask: tensor([[1, 1, 1, ..., 0, 0, 0],
        [1, 1, 1, ..., 0, 0, 0],
        [1, 1, 1, ..., 0, 0, 0],
        ...,
        [0, 0, 0, ..., 0, 0, 0],
        [0, 0, 0, ..., 0, 0, 0],
        [0, 0, 0, ..., 0, 0, 0]])
Sum(mask): tensor(250000)
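A standalone sketch of the inclusive-corner behavior (the 1000×1000 mask size is assumed for illustration):

```python
import torch
from torchvision.ops import masks_to_boxes

mask = torch.zeros((1, 1000, 1000), dtype=torch.uint8)
mask[0, :500, :500] = 1  # top-left 500x500 block of ones

# masks_to_boxes returns (x1, y1, x2, y2), where x2/y2 are the indices
# of the last set pixels, so a block ending at index 499 yields 499.
print(masks_to_boxes(mask))  # tensor([[  0.,   0., 499., 499.]])
```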
Edit: Corresponding code:
import torch
from torchvision.ops import masks_to_boxes

# Excerpt from the serverless handler; mask_out, output, and results
# are defined earlier in the function.
mask_out[:] = 0
mask_out[:500, :500] = 1
boxes = masks_to_boxes(mask_out.unsqueeze(0)).tolist()
box = [int(b) for b in boxes[0]]
print(box)
print(mask_out)
print(torch.sum(mask_out))

# Full nested 2-D list of the mask with the box corners appended
cvat_mask = mask_out.long().tolist() + box
prob = torch.nn.functional.softmax(output, dim=1).cpu().numpy()
results.append({
    "type": "mask",
    "confidence": str(prob),
    "label": "Net",
    "mask": cvat_mask,
})
return results
The solution was found by utilizing the to_cvat_mask function found here:
https://github.com/cvat-ai/cvat/blob/develop/serverless/openvino/base/shared.py
and implemented here:
https://github.com/cvat-ai/cvat/blob/develop/serverless/openvino/omz/intel/semantic-segmentation-adas-0001/nuclio/model_handler.py
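For context, the linked to_cvat_mask helper crops the mask to the bounding box, flattens it, and appends the box corners; roughly (a paraphrased sketch, where mask is a 2-D NumPy array):

```python
def to_cvat_mask(box, mask):
    # Crop the 2-D mask to the (inclusive) bounding box, flatten it to
    # a 1-D list, and append the corners [xtl, ytl, xbr, ybr].
    xtl, ytl, xbr, ybr = box
    flattened = mask[ytl:ybr + 1, xtl:xbr + 1].flat[:].tolist()
    flattened.extend([xtl, ytl, xbr, ybr])
    return flattened

# Applied to the snippet above (assumption: mask_out is a torch tensor):
cvat_mask = to_cvat_mask(box, mask_out.long().numpy())
```

The key differences from mask_out.long().tolist() + box are that the mask is cropped to the box and flattened to a single 1-D list.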
Actions before raising this issue
Steps to Reproduce
1. Create a serverless detector function which produces a binary mask.
2. Append the mask via results.append({ "type": "mask", "confidence": str(prob), "label": "Net", "mask": cvat_mask, }), where cvat_mask = binary_mask + [bounding box corners].
Expected Behavior
Expect CVAT to return an annotated image with the input binary mask applied, like the attached image. (These are the same inference results that go into CVAT).
Possible Solution
Add documentation describing the mask format CVAT requires so that a binary mask is handled properly.
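For example, such documentation could show the working payload shape (a sketch reusing the to_cvat_mask helper above; the label and variable names are illustrative):

```python
results.append({
    "type": "mask",
    "confidence": str(prob),
    "label": "Net",
    # 1-D 0/1 mask cropped to the box, followed by [xtl, ytl, xbr, ybr]
    "mask": to_cvat_mask(box, binary_mask),
})
```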
Context
I've created a serverless function with my own pre-trained UNet model, which returns masks for a given image using PyTorch. Running inference both on my own machine and via Nuclio yields the same model outputs and mask details.
However, when the mask is sent in to CVAT, it returns a messy image:
This appears to be an issue in the conversion from a binary mask to RLE (https://github.com/cvat-ai/cvat/blob/develop/cvat/apps/lambda_manager/views.py, lines 738-743).
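For reference, the conversion in question run-length encodes the flat 0/1 mask; a minimal illustrative sketch of that kind of encoding (not CVAT's exact code) is:

```python
def mask_to_rle(flat_mask):
    # Encode a flat 0/1 mask as alternating run lengths, starting with
    # the count of leading zeros (0 if the mask starts with ones).
    rle = []
    prev, count = 0, 0
    for value in flat_mask:
        if value != prev:
            rle.append(count)
            prev, count = value, 0
        count += 1
    rle.append(count)
    return rle

print(mask_to_rle([0, 0, 1, 1, 1, 0]))  # [2, 3, 1]
```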
Environment