Volutionn opened 1 year ago
Thank you for filing this feature request!
Could you provide a minimum code snippet that contains deform_conv2d to reproduce the issue? Thanks!
Meanwhile, I would also recommend adding support for this op on your end using composite operators: https://coremltools.readme.io/docs/composite-operators
Thanks!
Thank you for your reply!
As requested, here's a minimum code snippet that contains deform_conv2d:
```python
import torch
from torchvision.ops import deform_conv2d
import coremltools as ct

class DeformConv2DModel(torch.nn.Module):
    def __init__(self):
        super(DeformConv2DModel, self).__init__()
        self.kh, self.kw = 3, 3
        self.weight = torch.nn.Parameter(torch.rand(5, 3, self.kh, self.kw))

    def forward(self, x, offset, mask):
        out = deform_conv2d(x, offset, self.weight, mask=mask)
        return out

# Define the model
model = DeformConv2DModel()

# Create random input tensors
input_tensor = torch.rand(4, 3, 10, 10)
offset = torch.rand(4, 2 * model.kh * model.kw, input_tensor.shape[2] - 2, input_tensor.shape[3] - 2)
mask = torch.rand(4, model.kh * model.kw, input_tensor.shape[2] - 2, input_tensor.shape[3] - 2)

# Trace the model
traced_model = torch.jit.trace(model, (input_tensor, offset, mask))

# Convert to Core ML
coreml_model = ct.convert(
    traced_model,
    inputs=[ct.TensorType(name="input", shape=input_tensor.shape),
            ct.TensorType(name="offset", shape=offset.shape),
            ct.TensorType(name="mask", shape=mask.shape)],
    source='pytorch',
)
```
Hello, I was wondering if there's any update regarding the support of the deform_conv2d operation? Thank you!
same demand for deform_conv2d here! any update?
+1
The PyTorch documentation for this op doesn't contain a lot of detail, and the PyTorch forward implementation for `deform_conv2d` seems quite complex. Can someone share the mathematical formulas for what this operation actually does?
Thank you @TobyRoseman for looking into this.
Based on my understanding of the topic, the PyTorch implementation references the following two papers: *Deformable ConvNets* and *Deformable ConvNets v2*.
I've summarized the formulas below.
Standard Convolution: $$y(p_0) = \sum_{p_n \in R} w(p_n) \cdot x(p_0 + p_n)$$ The base convolution operation without deformable adjustments, where $R$ is the regular sampling grid of the kernel.
Deformable Convolution: $$y(p_0) = \sum_{p_n \in R} w(p_n) \cdot x(p_0 + p_n + \Delta p_n)$$ Enhances standard convolution by adding learnable offsets $\Delta p_n$, adapting to geometric variations.
Bilinear Interpolation for Deformable Convolution: $$x(p) = \sum_{q} G(q,p) \cdot x(q)$$ Here, $G(q, p) = g(q_x, p_x) \cdot g(q_y, p_y)$ and $g(a,b) = \max(0,1-|a-b|)$ for sampling at non-integer locations.
Modulated Deformable Convolution: $$y(p) = \sum_{k=1}^{K} w_k \cdot x(p + p_k + \Delta p_k) \cdot \Delta m_k$$ Introduces modulation scalars $\Delta m_k$ with learnable offsets, refining the influence of each sampled location.
Modulated Deformable RoI Pooling: $$y(k) = \frac{1}{n_k} \sum_{j=1}^{n_k} x(p_{kj} + \Delta p_k) \cdot \Delta m_k$$ Applies modulation and learnable offsets to RoI pooling for enhanced precision.
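To make the bilinear interpolation step above concrete, here is a minimal pure-Python sketch (the function names are mine for illustration, not part of torchvision) that samples a 2D array at a fractional location using $g(a,b) = \max(0, 1-|a-b|)$:

```python
def g(a, b):
    # 1-D bilinear kernel: max(0, 1 - |a - b|)
    return max(0.0, 1.0 - abs(a - b))

def bilinear_sample(x, py, px):
    # x(p) = sum_q G(q, p) * x(q), with G(q, p) = g(q_y, p_y) * g(q_x, p_x).
    # In practice only the four integer neighbors of (py, px) have
    # non-zero weight; the loop over all q just mirrors the formula.
    h, w = len(x), len(x[0])
    total = 0.0
    for qy in range(h):
        for qx in range(w):
            wgt = g(qy, py) * g(qx, px)
            if wgt > 0.0:
                total += wgt * x[qy][qx]
    return total

grid = [[0.0, 1.0],
        [2.0, 3.0]]
print(bilinear_sample(grid, 0, 1))      # 1.0 (integer location: the value itself)
print(bilinear_sample(grid, 0.5, 0.5))  # 1.5 (center of the 2x2 patch: the mean)
```

This is what lets the sampled positions $p_0 + p_n + \Delta p_n$ take fractional values while keeping the operation differentiable with respect to the offsets.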
Thanks @Volutionn for the concise information. So the `deform_conv2d` PyTorch op uses just the "Deformable Convolution" formula, is that correct?
That's what I understand about the `deform_conv2d` operation in PyTorch: it supports both Deformable Convolution versions 1 and 2, including the modulated formulas. When the `mask` parameter is `None`, it performs Deformable Convolution as described in the first Deformable ConvNets paper, utilizing only the "Deformable Convolution" formula. If a `mask` is provided, it implements Deformable ConvNets v2, which incorporates the modulation formulas. Regarding bilinear interpolation, it is essential in both versions to manage fractional offsets.
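As a sanity check on that reading, here is a small pure-Python sketch of the modulated formula for a single output position (illustrative only; the helpers `bilinear` and `deform_point` are my names, not torchvision's). With all-zero offsets and an all-ones mask it reduces to a plain convolution, which matches the v1 behavior when `mask` is `None`:

```python
def g(a, b):
    return max(0.0, 1.0 - abs(a - b))

def bilinear(x, py, px):
    # Bilinear sampling; out-of-bounds locations contribute zero.
    h, w = len(x), len(x[0])
    total = 0.0
    for qy in range(h):
        for qx in range(w):
            wgt = g(qy, py) * g(qx, px)
            if wgt > 0.0:
                total += wgt * x[qy][qx]
    return total

def deform_point(x, weight, p0, offsets, mask):
    # y(p0) = sum_k w_k * x(p0 + p_k + dp_k) * dm_k for a 3x3 kernel.
    # offsets[k] = (dy, dx) is the learnable offset dp_k, mask[k] is the
    # modulation scalar dm_k; v1 corresponds to dm_k = 1 for all k.
    y0, x0 = p0
    total = 0.0
    k = 0
    for ky in range(3):
        for kx in range(3):
            dy, dx = offsets[k]
            total += weight[ky][kx] * bilinear(x, y0 + ky + dy, x0 + kx + dx) * mask[k]
            k += 1
    return total

img = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
w_avg = [[1.0 / 9.0] * 3 for _ in range(3)]   # 3x3 averaging kernel
zero_off = [(0.0, 0.0)] * 9
ones_mask = [1.0] * 9
print(deform_point(img, w_avg, (0, 0), zero_off, ones_mask))  # 5.0, the mean of the top-left 3x3 patch
```

Non-zero offsets shift each of the nine sampling points independently, and fractional offsets are handled by the bilinear sampling, which is what the formulas above describe.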
Hello everyone, I tried using `deform_conv2d` in coremltools v7.1 and got this error:
`RuntimeError: PyTorch convert function for op 'torchvision::deform_conv2d' not implemented.`
Are there any workarounds? I don't think it's a simple implementation that can be done using a custom layer in MIL Builder. However, in case anyone has some ideas I'm happy to work on it to try and implement. Just need a starting point.
The reason for this is that deform_conv2d is usually a drop-in replacement that improves loss by 20-30% on a CNN. It's pretty amazing, and it would be of great advantage to have it supported in deployed Core ML.
Just a bump. Is this issue dead? Please advise.
@bitanath I imagine it's just a complex operator to add. I tried to implement it using composite operators, but without any success. Hopefully, this hasn't been abandoned on @TobyRoseman's side; I've been hoping for it for almost a year. Agree, it would be amazing to have it. Let's be patient, it's normal that it takes time.
@Volutionn I implemented the `deform_conv2d` operation using CoreML custom layers: https://github.com/dneprDroid/DeformConv2d-Metal
It's GPU-accelerated and supports both iOS and macOS. You can try the demo app and the example converter script to generate a CoreML model with custom layers.
Wow, you made my day! That's amazing, thanks a lot for sharing it @dneprDroid 🙏🏻
Thanks a lot @dneprDroid! This is pretty neat! I was also trying to implement this in Metal from the published CUDA implementation, but it seemed too hard.
Would you also be releasing the code for the shaders, purely for learning purposes? Or did I miss them somewhere?
Regardless, thanks a lot for this library! It's awesome!! Much appreciated ❤️
Name of layer type: deform_conv2d
Is this a PyTorch or a TensorFlow layer type: PyTorch
Your version of coremltools: 7.0b1
Your version of PyTorch/TensorFlow: PyTorch 2.0.1
Impact of supporting this layer type. Why is adding support for this layer type important? Is it necessary to support a popular model or use case? Deformable Convolution, as implemented in the torchvision.ops.deform_conv2d operator in PyTorch, is a key technique that allows Convolutional Neural Networks to adapt to complex spatial transformations in input data. It enhances the model's performance in tasks that require understanding spatial hierarchies and relationships, such as object detection, image segmentation, and image restoration. The lack of support for this operation presents a challenge for the conversion of my model.
I was wondering if there are any plans to implement support for the deform_conv2d operation in a future release of CoreML? If support for deform_conv2d is not planned, could you provide any advice or workarounds for dealing with this issue? Any guidance would be greatly appreciated.
Thank you for your time and for the excellent work you do on the CoreML project!