Typiqally opened this issue 1 year ago (status: Open)
I'm not sure whether this is the actual solution, since I'm still getting mixed results, but if I apply the following patch, the offset on the masks seems to disappear:
diff --git a/mmdeploy/backend/coreml/ops.py b/mmdeploy/backend/coreml/ops.py
index 0af1aa42..da54307d 100644
--- a/mmdeploy/backend/coreml/ops.py
+++ b/mmdeploy/backend/coreml/ops.py
@@ -77,7 +77,7 @@ def roi_align(context, node):
         normalized_coordinates=False,
         spatial_scale=extrapolation_value,
         box_coordinate_mode='CORNERS_WIDTH_FIRST',
-        sampling_mode='OFFSET_CORNERS',
+        sampling_mode='DEFAULT',
     )
     # CoreML output format: [N, 1, C, h_out, w_out]
However, the mask now lacks information at the edges:
Hi, could you provide the original image for testing?
To save some time, here is a comparison of the masks between the TorchScript and Core ML converted models, using the same model as before with the visualizer from this repository:
TorchScript
Core ML
You can see that in the Core ML version, the mask is slightly offset toward the top-left corner compared to the TorchScript version.
Note: it seems like the entire bounding box is offset, which causes the mask to also be offset.
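For reference, a quick way to quantify the shift (a throwaway helper of mine, not code from either repo; the .npy dumps are hypothetical) is to compare the centroids of the two binarized full-image masks:

import numpy as np

def centroid(mask):
    """Return the (x, y) centroid of a binary mask."""
    ys, xs = np.nonzero(mask)
    return xs.mean(), ys.mean()

# Hypothetical dumps of the two masks visualized above
ts_mask = np.load('torchscript_mask.npy')
cm_mask = np.load('coreml_mask.npy')
dx = centroid(cm_mask)[0] - centroid(ts_mask)[0]
dy = centroid(cm_mask)[1] - centroid(ts_mask)[1]
print(f'Core ML mask shifted by ({dx:+.1f}, {dy:+.1f}) px')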
Currently, there is no roi_align op for Core ML; we use the crop_resize op to extract ROI features instead of roi_align, which I don't think is mathematically equivalent.
The aligned parameter of roi_align is set to true by default, which adds a -0.5 offset to the start of the ROI. You could temporarily change it like below, and the result will be a little better. I will ask my colleagues if there is a better way.
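Before the patch itself, here is the arithmetic behind it as a toy NumPy check (my own illustration, not project code): shifting the box by 0.5 / spatial_scale before scaling lands on the same positions as applying -0.5 after scaling.

import numpy as np

# (c - 0.5 / s) * s == c * s - 0.5, so shifting the box coordinates
# before scaling reproduces the offset that aligned=True applies after.
s = 0.25                                   # spatial_scale of the feature map
box = np.array([10.0, 12.0, 50.0, 60.0])  # x1, y1, x2, y2 in image pixels
print(np.allclose((box - 0.5 / s) * s, box * s - 0.5))  # True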
diff --git a/mmdeploy/backend/coreml/ops.py b/mmdeploy/backend/coreml/ops.py
index 0af1aa42..36ad1796 100644
--- a/mmdeploy/backend/coreml/ops.py
+++ b/mmdeploy/backend/coreml/ops.py
@@ -51,14 +51,21 @@ def roi_align(context, node):
         const_box_info = False

     extrapolation_value = context[node.inputs[2]].val
+    aligned = inputs[6].val
     # CoreML index information along with boxes
     if const_box_info:
         boxes = context[node.inputs[1]].val
         # CoreML expects boxes/ROI in
         # [N, 1, 5, 1, 1] format
+        if aligned:
+            boxes[:, 1:] -= 0.5 / extrapolation_value
         boxes = boxes.reshape(boxes.shape[0], 1, boxes.shape[1], 1, 1)
     else:
         boxes = inputs[1]
+        if aligned:
+            ind, boxes = mb.split(x=boxes, split_sizes=[1, 4], axis=1)
+            boxes = mb.sub(x=boxes, y=0.5 / extrapolation_value)
+            boxes = mb.concat(values=[ind, boxes], axis=1)
         boxes = mb.reshape(
             x=boxes, shape=[boxes.shape[0], 1, boxes.shape[1], 1, 1])
     # Get Height and Width of crop
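As a sanity check of the shift (my own snippet using torchvision, not part of mmdeploy): running roi_align with aligned=False on boxes shifted by 0.5 / spatial_scale reproduces aligned=True on the original boxes.

import torch
from torchvision.ops import roi_align

feat = torch.randn(1, 3, 32, 32)
scale = 0.25
# Boxes are (batch_index, x1, y1, x2, y2) in image coordinates
boxes = torch.tensor([[0.0, 8.0, 8.0, 40.0, 40.0]])

ref = roi_align(feat, boxes, output_size=7, spatial_scale=scale, aligned=True)

shifted = boxes.clone()
shifted[:, 1:] -= 0.5 / scale  # the same shift the patch applies
out = roi_align(feat, shifted, output_size=7, spatial_scale=scale, aligned=False)

print(torch.allclose(ref, out, atol=1e-5))  # True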
Alright, this seems to work as a temporary fix; thank you very much! I'll keep this issue open in case you find an improved solution. If that's not possible due to Core ML incompatibilities, feel free to close it.
Describe the bug
I converted Mask R-CNN to Core ML. The conversion completes successfully and the model runs as expected. However, the converted model only returns masks of size 28×28 (as specified in the model config), instead of the post-processed masks, which are resized to the original image. Currently, I rescale each mask to the bounding box width and height and fill the gaps with bilinear interpolation (sketched below). If I visualize the masks, there is a slight offset (see the right and bottom sides).
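For context, this is roughly the rescaling I do now (my own helper; the names, threshold, and use of OpenCV are my choices, not mmdeploy code):

import cv2
import numpy as np

def paste_mask(mask28, box, img_h, img_w, thr=0.5):
    """Resize a 28x28 mask crop into its box and paste it onto a full-size canvas."""
    x1, y1, x2, y2 = (int(round(v)) for v in box)
    # Clamp the box to the image bounds
    x1, y1 = max(x1, 0), max(y1, 0)
    x2, y2 = min(x2, img_w), min(y2, img_h)
    w, h = max(x2 - x1, 1), max(y2 - y1, 1)
    # Bilinear interpolation fills in between the 28x28 grid values
    resized = cv2.resize(mask28.astype(np.float32), (w, h),
                         interpolation=cv2.INTER_LINEAR)
    canvas = np.zeros((img_h, img_w), dtype=np.uint8)
    canvas[y1:y1 + h, x1:x1 + w] = (resized > thr).astype(np.uint8)
    return canvas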
Someone told me that this might have something to do with the padding step in the pre-processing pipeline, and that the original image size must be a multiple of 32. I tried an image size of 800×800 (32·25), which still shows the issue; in fact, the image above is 800×800.
Reproduction
# Run the converted Core ML model on a pre-processed image
out_dict = model.predict({'img_1': img})
# Each output holds one entry per image in the batch
detections = out_dict['detections'][0]
labels = out_dict['labels'][0]
masks = out_dict['masks'][0]
print(masks)  # fixed-size 28x28 mask crops rather than full-image masks
Environment
Error traceback
No response