apple / coremltools

Core ML tools contain supporting tools for Core ML model conversion, editing, and validation.
https://coremltools.readme.io
BSD 3-Clause "New" or "Revised" License

Allow variable (computed) weights in convolution #853

Open praeclarum opened 4 years ago

praeclarum commented 4 years ago

Description

I would like to be able to use Conv2d and Conv2dTranspose with variable weights. Currently, I get this error:

Input 'weight' of op 'Gs_1/G_synthesis/8x8/Conv0_up/conv2d_transpose' (conv_transpose) must be const at compile time.

when trying to convert StyleGAN 2.

Use cases

This is needed in modern GANs where classes and other embeddings are used to change the statistics of the weights of convolution. In StyleGAN 2, this is used to implement "Weight Demodulation".

The revised architecture enables us to replace instance normalization with a “demodulation” operation, which we apply to the weights associated with each convolution layer.

Analyzing and Improving the Image Quality of StyleGAN

They removed instance normalization in favor of this technique. (Instance normalization was causing quality issues.)
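To make the use case concrete, here is a minimal sketch of weight demodulation as described in the StyleGAN 2 paper: the kernel is scaled per sample by a style vector and then normalized, so the convolution weight is computed at run time rather than stored as a constant. The function name modulated_conv2d, the argument names, and the grouped-convolution batching are illustrative assumptions, not code taken from any particular StyleGAN 2 repository.

```python
import torch
from torch.nn import functional as F


def modulated_conv2d(x, weight, style, eps=1e-8):
    """Hypothetical helper. x: [N, C_in, H, W]; weight: [C_out, C_in, k, k] (odd k); style: [N, C_in]."""
    n, c_in, h, w = x.shape
    c_out, _, k, _ = weight.shape
    # Modulate: scale each input channel of the kernel by the per-sample style.
    w_mod = weight.unsqueeze(0) * style.view(n, 1, c_in, 1, 1)  # [N, C_out, C_in, k, k]
    # Demodulate: rescale every output filter to (approximately) unit L2 norm.
    w_mod = w_mod * torch.rsqrt(w_mod.square().sum(dim=(2, 3, 4), keepdim=True) + eps)
    # Fold the batch into the channel axis so each sample is convolved with its own kernel.
    x = x.reshape(1, n * c_in, h, w)
    w_mod = w_mod.reshape(n * c_out, c_in, k, k)
    y = F.conv2d(x, w_mod, padding=k // 2, groups=n)
    return y.reshape(n, c_out, h, w)


if __name__ == "__main__":
    torch.manual_seed(0)
    out = modulated_conv2d(torch.randn(2, 3, 8, 8), torch.randn(4, 3, 3, 3), torch.rand(2, 3) + 0.5)
    print(out.shape)
```

Because w_mod depends on the style input, the weight passed to the convolution cannot be folded into a compile-time constant, which is exactly what the converter error above complains about.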

Describe alternatives you've considered

sailor002 commented 3 years ago

Have you found a solution?

HorusAlkebulan commented 3 years ago

I'm also struggling with this. Converting StyleGAN2 and its variants from PyTorch to Core ML has hit roadblock after roadblock, and this is one of them.

vogoriachko commented 2 years ago

@praeclarum have you solved it?

kuprel commented 2 years ago

I'm also getting this error with the latest coremltools version 5.1 when trying to convert the generator network from stylegan2-ada-pytorch

ValueError: ('Op "x.23" (op_type: conv_transpose) Input weight must be const at compile time', 'weight', 'w.35')

kuprel commented 2 years ago

I think I've narrowed down the problem. The error "Input weight must be const" only occurs for conv_transpose layers, not for conv layers. According to the MIL ops documentation, conv_transpose requires the weight argument to be constant, while conv does not [1]. I think changing the implementation of conv_transpose to accept a non-constant weight argument, as conv does, would solve the problem here.

[1] https://apple.github.io/coremltools/source/coremltools.converters.mil.mil.ops.defs.html#module-coremltools.converters.mil.mil.ops.defs.conv

john7002 commented 1 year ago

Hello,

Any news on this feature request? @kuprel, did you manage to find a way to "rewrite" conv_transpose so that it can be used in Core ML? Thanks.

kuprel commented 1 year ago

This worked for me. It still uses conv_transpose, but with constant weights; the variable weight only goes into a regular conv:

import torch
from torch.nn import functional

def conv_transpose_stride2(x: torch.Tensor, w: torch.Tensor) -> torch.Tensor:
    # Depthwise 1x1 transposed conv with constant all-ones weights:
    # it inserts zeros between pixels (stride 2). Hard-coded for 128 input channels.
    dilate = torch.nn.ConvTranspose2d(in_channels=128, out_channels=128, kernel_size=1, stride=2, groups=128, bias=False)
    dilate.weight.data = torch.ones([128, 1, 1, 1])
    pad = torch.nn.ZeroPad2d([1, 1, 1, 1])
    # Regular conv with the flipped, in/out-swapped kernel; only this op sees the variable weight.
    return functional.conv2d(dilate(pad(x)), w.transpose(0, 1).flip(2, 3))

if __name__ == "__main__":
    torch.manual_seed(0)
    x = torch.randn([1, 128, 256, 256])
    w = torch.randn([128, 64, 3, 3])
    y = functional.conv_transpose2d(x, w, stride=2)  # reference result
    y_ = conv_transpose_stride2(x, w)
    with torch.no_grad():
        # Mean squared error against the reference, then the mean square of each output.
        print((y - y_).square().mean().numpy(), y_.square().mean().numpy(), y.square().mean().numpy())
john7002 commented 1 year ago

Awesome! thanks

RahulBhalley commented 1 year ago

Hi @kuprel,

I tried your implementation, but tracing it produces a TracerWarning (Trace had nondeterministic nodes):

# After your code.
import coremltools as ct
traced_model = torch.jit.trace(conv_transpose_stride2, (x, w))

This gives the following error:

/opt/homebrew/lib/python3.9/site-packages/torch/jit/_trace.py:828: TracerWarning: Trace had nondeterministic nodes. Did you forget call .eval() on your model? Nodes:
    %60 : Float(128, 1, 1, 1, strides=[1, 1, 1, 1], requires_grad=1, device=cpu) = aten::uniform_(%tensor, %57, %58, %59) # /opt/homebrew/lib/python3.9/site-packages/torch/nn/init.py:412:0
This may cause errors in trace checking. To disable trace checking, pass check_trace=False to torch.jit.trace()
  _check_trace(
/opt/homebrew/lib/python3.9/site-packages/torch/jit/_trace.py:828: TracerWarning: Output nr 1. of the traced function does not match the corresponding output of the Python function. Detailed error:
Tensor-likes are not close!

Mismatched elements: 16842766 / 16842816 (100.0%)
Greatest absolute difference: 134.11310195922852 at index (0, 23, 324, 138) (up to 1e-05 allowed)
Greatest relative difference: 15173386.579398053 at index (0, 14, 291, 143) (up to 1e-05 allowed)
  _check_trace(

Did you get this error? Did you solve it?

Best, Rahul Bhalley
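
A possible reading of the warning above, offered as a sketch rather than a confirmed fix: the aten::uniform_ node in the log is the random weight initialization that runs when torch.nn.ConvTranspose2d is constructed inside the traced function, and the later assignment to dilate.weight.data does not appear to be captured by the trace. The variant below builds the all-ones kernel directly and calls the functional API, so no random initialization takes place. It assumes a 3x3 kernel, stride 2, no padding, and groups=1; whether coremltools then converts the traced function has not been verified here.

```python
import torch
from torch.nn import functional as F


def conv_transpose_stride2(x: torch.Tensor, w: torch.Tensor) -> torch.Tensor:
    """Stride-2 transposed convolution (3x3 kernel, no padding) with a variable weight w."""
    channels = x.shape[1]
    # Deterministic constant kernel; no nn.Module is constructed, so nothing is randomly initialized.
    ones = torch.ones(channels, 1, 1, 1, dtype=x.dtype, device=x.device)
    x = F.pad(x, [1, 1, 1, 1])
    # Depthwise 1x1 transposed conv with stride 2 inserts a zero between neighbouring pixels.
    x = F.conv_transpose2d(x, ones, stride=2, groups=channels)
    # Ordinary convolution with the spatially flipped, in/out-swapped kernel.
    return F.conv2d(x, w.transpose(0, 1).flip(2, 3))


if __name__ == "__main__":
    torch.manual_seed(0)
    x = torch.randn(1, 16, 12, 12)
    w = torch.randn(16, 8, 3, 3)
    reference = F.conv_transpose2d(x, w, stride=2)
    print(torch.allclose(conv_transpose_stride2(x, w), reference, atol=1e-4))
```

Unlike the original workaround, the channel count is read from the input instead of being fixed at 128.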

TobyRoseman commented 1 year ago

Can someone share a minimal example (i.e. a toy network which fails conversion because of variable weight convolution)?

RahulBhalley commented 1 year ago

Following is a code snippet to reproduce the error @TobyRoseman.

Now come on Apple (@TobyRoseman), please help me with this issue https://github.com/apple/coremltools/issues/1723#issuecomment-1381527231, please! I want StyleGAN2 on my iPhone!! We developers can't wait any longer. This is extremely important: DL research is moving too fast and Core ML is lagging behind (another case is the FFT ops).

import torch
from torch import nn
from torch.nn import functional as F

class Model(nn.Module):
  def __init__(self) -> None:
    super().__init__()
  def forward(self, x, w):
    return F.conv_transpose2d(x, w, padding=0, stride=2, groups=1)

x = torch.randn(1, 128, 128, 128)
w = torch.randn([128, 64, 3, 3])

model = Model().eval()
traced_model = torch.jit.trace(model, (x, w))

import coremltools as ct

inputs = [ct.TensorType('x', x.shape), ct.TensorType('w', w.shape)]
mlmodel = ct.convert(traced_model, inputs=inputs)

Then BOOM! You get an error:

ValueError: ('Op "22" (op_type: conv_transpose) Input weight must be const at compile time', 'weight', 'w')
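
For comparison, here is a hedged sketch of the same toy Model rewritten so that its forward pass contains no conv_transpose op at all: zeros are interleaved with stack and reshape, and the variable weight is applied with an ordinary conv2d. The helper name zero_stuff is hypothetical, and the sketch assumes stride 2, padding 0, groups=1, and a square kernel. It reproduces the output of F.conv_transpose2d in PyTorch; it has not been checked against coremltools conversion.

```python
import torch
from torch import nn
from torch.nn import functional as F


def zero_stuff(x: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: insert one zero between neighbouring pixels (stride-2 dilation)."""
    n, c, h, w = x.shape
    # Interleave zeros along the width, then along the height.
    x = torch.stack([x, torch.zeros_like(x)], dim=-1).reshape(n, c, h, 2 * w)
    x = torch.stack([x, torch.zeros_like(x)], dim=-2).reshape(n, c, 2 * h, 2 * w)
    # Drop the trailing zero row and column so the size is 2 * h - 1 by 2 * w - 1.
    return x[:, :, : 2 * h - 1, : 2 * w - 1]


class Model(nn.Module):
    def forward(self, x, w):
        k = w.shape[-1]  # square kernel assumed
        # Pad by k - 1 on every side, then convolve with the flipped, in/out-swapped kernel.
        x = F.pad(zero_stuff(x), [k - 1] * 4)
        return F.conv2d(x, w.transpose(0, 1).flip(2, 3))


if __name__ == "__main__":
    torch.manual_seed(0)
    x = torch.randn(1, 8, 5, 6)
    w = torch.randn(8, 4, 3, 3)
    reference = F.conv_transpose2d(x, w, padding=0, stride=2, groups=1)
    print(torch.allclose(Model()(x, w), reference, atol=1e-4))
```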
ConradoMateu commented 1 year ago

Is there any update regarding this? I am facing the same problem and cannot get rid of this error:

Tuple detected at graph output. This will be flattened in the converted model.
Converting PyTorch Frontend ==> MIL Ops:  84%|██████████████████████████████▏     | 518/619 [00:00<00:00, 3731.02 ops/s]
Error during Core ML conversion: ('Op "706" (op_type: conv_transpose) Input weight must be const at compile time', 'weight', 'wi_center')

Could you suggest something, please, @TobyRoseman?

Here is my implementation:

import torch
import coremltools as ct
import torch.nn.functional as F

pretrained = "pretrained/states_pt_places2.pth"
generator_state_dict = torch.load(pretrained, map_location=torch.device('cpu'))['G']

if 'stage1.conv1.conv.weight' in generator_state_dict.keys():
    from model.networks import Generator
else:
    from model.networks_tf import Generator  

# Set up the network
generator = Generator(cnum_in=5, cnum=48, return_flow=False)
generator.load_state_dict(generator_state_dict, strict=True)

img = torch.rand([1, 5, 512, 512]).cpu()  # example input: 5 channels, 512x512
mask = torch.rand([1, 1, 512, 512]).cpu()

generator.cpu().eval()

# Use JIT to compile the PyTorch model to TorchScript
example_inputs = (torch.rand(1, 5, 512, 512), torch.rand(1, 1, 512, 512))

traced_model = torch.jit.trace(generator, example_inputs)
# Create the Core ML input and output types
input_type = ct.ImageType(name="input", shape=img.shape, color_layout="RGB")
mask_type = ct.TensorType(name="mask", shape=mask.shape)
output_type = ct.ImageType(name="output", color_layout="RGB")

# Convert the TorchScript model to Core ML
try:
    coreml_model = ct.convert(
        traced_model,
        inputs=[input_type, mask_type],
        outputs=[output_type],
        debug=True
    )

    # Save the Core ML model to a file
    coreml_model.save("output.mlmodel")

    print('Successfully exported Core ML model')

except Exception as e:
    print(f"Error during Core ML conversion: {e}")

I am trying to migrate this model: https://github.com/nipponjo/deepfillv2-pytorch

cc: @RahulBhalley @kuprel

Thanks in advance

TobyRoseman commented 1 year ago

The issue here is that the conv_transpose MIL op requires its weight parameter to be a constant tensor. This is not something that can be fixed in the coremltools repository; it would require a change to the Core ML Framework.

Please submit this Core ML Framework issue using the Feedback Assistant. Once you have done that, please share the ID value you get. The ID value should start with "FB" followed by seven digits.

RahulBhalley commented 1 year ago

I don't understand. Why can't CoreML Tools team ask CoreML team to fix this issue? No diss, but I think the current team probably doesn't take this work seriously (this issue is 3 years old). It's really affecting us! Please understand that marketing and sales are really hard enough for us indie developers. At least make this development a breeze for us.

RahulBhalley commented 10 months ago

Any updates, or is this not going to be fixed??? @TobyRoseman

TobyRoseman commented 10 months ago

@RahulBhalley - did you submit this issue via the Feedback Assistant as I suggested? If so, do you have a Feedback ID?

ykk648 commented 8 months ago

Any updates?