apple / coremltools

Core ML tools contain supporting tools for Core ML model conversion, editing, and validation.
https://coremltools.readme.io
BSD 3-Clause "New" or "Revised" License

Support for deform_conv2d operation from PyTorch #1889

Open Volutionn opened 1 year ago

Volutionn commented 1 year ago

I was wondering if there are any plans to implement support for the deform_conv2d operation in a future release of CoreML? If support for deform_conv2d is not planned, could you provide any advice or workarounds for dealing with this issue? Any guidance would be greatly appreciated.

Thank you for your time and for the excellent work you do on the CoreML project!

junpeiz commented 1 year ago

Thank you for filing this feature request!

Could you provide a minimum code snippet that contains deform_conv2d to reproduce the issue? Thanks!

Meanwhile, I would also recommend adding support for this op on your end using composite operators: https://coremltools.readme.io/docs/composite-operators

Thanks!

Volutionn commented 1 year ago

Thank you for your reply!

As requested, here's a minimum code snippet that contains deform_conv2d:

import torch
from torchvision.ops import deform_conv2d
import coremltools as ct

class DeformConv2DModel(torch.nn.Module):
    def __init__(self):
        super(DeformConv2DModel, self).__init__()
        self.kh, self.kw = 3, 3
        self.weight = torch.nn.Parameter(torch.rand(5, 3, self.kh, self.kw))

    def forward(self, x, offset, mask):
        out = deform_conv2d(x, offset, self.weight, mask=mask)
        return out

# Define the model
model = DeformConv2DModel()

# Create a random input tensor
input_tensor = torch.rand(4, 3, 10, 10)
offset = torch.rand(4, 2 * model.kh * model.kw, input_tensor.shape[2] - 2, input_tensor.shape[3] - 2)
mask = torch.rand(4, model.kh * model.kw, input_tensor.shape[2] - 2, input_tensor.shape[3] - 2)

# Trace the model
traced_model = torch.jit.trace(model, (input_tensor, offset, mask))

# Convert to Core ML
coreml_model = ct.convert(
    traced_model,
    inputs=[ct.TensorType(name="input", shape=input_tensor.shape),
            ct.TensorType(name="offset", shape=offset.shape),
            ct.TensorType(name="mask", shape=mask.shape)],
    source='pytorch',
)
Volutionn commented 1 year ago

Hello, I was wondering if there's any update regarding the support of the deform_conv2d operation? Thank you!

Feynman1999 commented 10 months ago

same demand for deform_conv2d here! any update?

bitanath commented 9 months ago

+1

TobyRoseman commented 6 months ago

The PyTorch documentation for this op doesn't contain a lot of detail. The PyTorch forward implementation for deform_conv2d seems quite complex. Can someone share mathematical formulas for what this operation actually does?

Volutionn commented 6 months ago

Thank you @TobyRoseman for looking into this.

Based on my understanding of the topic, the PyTorch implementation references the following two papers: "Deformable Convolutional Networks" (Deformable ConvNets) and "Deformable ConvNets v2: More Deformable, Better Results".

I've summarized the formulas below.

Deformable ConvNets (deformable convolution): for each output location p0, with R the regular kernel sampling grid and Δpn the learned offsets,

    y(p0) = Σ_{pn ∈ R} w(pn) · x(p0 + pn + Δpn)

Deformable ConvNets v2 (modulated deformable convolution): each of the K sampling locations additionally gets a learned modulation scalar Δmk ∈ [0, 1],

    y(p) = Σ_{k=1..K} wk · x(p + pk + Δpk) · Δmk

In both versions x is evaluated with bilinear interpolation, since the sampling positions are fractional.

TobyRoseman commented 6 months ago

Thanks @Volutionn for the concise information. So the deform_conv2d PyTorch op uses just the "Deformable Convolution" formula, is that correct?

Volutionn commented 6 months ago

That's what I understand about the deform_conv2d operation in PyTorch: it supports both Deformable Convolution versions 1 and 2, including the modulated formulas. When the mask parameter is None, it performs Deformable Convolution as described in the first Deformable ConvNets paper, utilizing only the "Deformable Convolution" formula. If a mask is provided, it implements Deformable ConvNets v2, which incorporates the modulation formulas. Regarding bilinear interpolation, it is essential in both versions to manage fractional offsets.
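To make the v1/v2 semantics above concrete, here is a naive NumPy reference I wrote directly from the two formulas (it is not the torchvision implementation; it assumes stride 1, no padding, no bias, and a single weight/offset group, and all names are my own). With zero offsets and a mask of ones it reduces to an ordinary cross-correlation:

```python
import numpy as np


def bilinear_sample(img, y, x):
    """Bilinearly sample a 2D array at fractional (y, x); zero outside."""
    h, w = img.shape
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    val = 0.0
    for dy in (0, 1):
        for dx in (0, 1):
            yy, xx = y0 + dy, x0 + dx
            if 0 <= yy < h and 0 <= xx < w:
                val += (1 - abs(y - yy)) * (1 - abs(x - xx)) * img[yy, xx]
    return val


def deform_conv2d_ref(x, offset, weight, mask=None):
    """Naive reference for deformable convolution (v1 if mask is None, else v2).

    x:      (N, C_in, H, W)
    offset: (N, 2*kh*kw, H_out, W_out), (dy, dx) pairs per kernel tap
    weight: (C_out, C_in, kh, kw)
    mask:   (N, kh*kw, H_out, W_out) modulation scalars, or None
    Stride 1, no padding, no bias, single group.
    """
    n, c_in, h, w = x.shape
    c_out, _, kh, kw = weight.shape
    h_out, w_out = h - kh + 1, w - kw + 1
    out = np.zeros((n, c_out, h_out, w_out))
    for b in range(n):
        for i in range(h_out):
            for j in range(w_out):
                for k in range(kh * kw):
                    ki, kj = divmod(k, kw)
                    dy = offset[b, 2 * k, i, j]
                    dx = offset[b, 2 * k + 1, i, j]
                    m = 1.0 if mask is None else mask[b, k, i, j]
                    for c in range(c_in):
                        # Sample at the regular tap position plus the offset.
                        s = bilinear_sample(x[b, c], i + ki + dy, j + kj + dx)
                        out[b, :, i, j] += weight[:, c, ki, kj] * s * m
    return out
```

This is far too slow to deploy, but it pins down exactly what a composite or custom-layer implementation has to reproduce: per-tap fractional sampling via bilinear interpolation, optionally scaled by the v2 modulation mask.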

bitanath commented 3 months ago

Hello everyone, I tried using deform_conv2d in coremltools v7.1, and got this error:

RuntimeError: PyTorch convert function for op 'torchvision::deform_conv2d' not implemented.

Are there any workarounds? I don't think it's a simple implementation that can be done using a custom layer in MIL Builder. However, if anyone has ideas, I'm happy to work on trying to implement it; I just need a starting point.

The reason I ask is that deform_conv2d is usually a drop-in replacement that improves loss by 20-30% on a CNN. It's pretty amazing, and it would be a great advantage to have it supported in deployed Core ML.

bitanath commented 3 months ago

Just a bump. Is this issue dead? Please advise.

Volutionn commented 3 months ago

@bitanath I imagine it's just a complex operator to add. I tried to implement it using composite operators, but without any success. Hopefully this hasn't been abandoned on @TobyRoseman's side; I've been hoping for it for almost a year. Agreed, it would be amazing to have it. Let's be patient; it's normal that it takes time.

dneprDroid commented 3 months ago

@Volutionn I implemented the deform_conv2d operation using CoreML custom layers:

https://github.com/dneprDroid/DeformConv2d-Metal

It's GPU-accelerated and supports both iOS and macOS. You can try the demo app and the example converter script to generate a CoreML model with custom layers.

Volutionn commented 3 months ago

Wow, you made my day! That's amazing, thanks a lot for sharing it @dneprDroid 🙏🏻

bitanath commented 3 months ago

Thanks a lot @dneprDroid ! This is pretty neat! I was also trying to implement this in metal from the published CUDA implementation, but it seemed too hard.

Would you also be releasing the code for the shaders, purely for learning purposes? Or did I miss them somewhere?

Regardless, thanks a lot for this library! It's awesome!! Much appreciated ❤️