I've run into the same issue. Any idea how to get this fixed?
Made some progress on this. For the output, you need an activation layer for normalisation, as mentioned here: https://github.com/john-rocky/CoreML-Models/issues/6. However, I am still struggling with the input (it may also need normalisation).
To anyone in the future: I have found the answer.
In one file:
import coremltools as ct
import torch
from basicsr.archs.rrdbnet_arch import RRDBNet
import coremltools.proto.FeatureTypes_pb2 as ft
example_input = torch.rand(1, 3, 256, 256)
example_output = torch.rand(1, 3, 1024, 1024)
model_path = "/Users/vaida/Downloads/Safari download/RealESRGAN_x4plus.pth"
model = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64, num_block=23, num_grow_ch=32, scale=4)
loadnet = torch.load(model_path, map_location=torch.device('cpu'))
# prefer to use params_ema
if 'params_ema' in loadnet:
    keyname = 'params_ema'
else:
    keyname = 'params'
model.load_state_dict(loadnet[keyname], strict=True)
traced_model = torch.jit.trace(model, example_input)
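# The scale below is 1/255, which normalises input pixel values from [0, 255] to [0, 1].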
image_input = ct.ImageType(name="input", shape=example_input.shape, scale=0.003921568859368563)
image_output = ct.ImageType(name="output", shape=example_output.shape)
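# Note: image_output is defined here but not passed to ct.convert; the output is
# turned into an image by editing the spec in the second script below.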
mlmodel = ct.convert(
    traced_model,
    source="pytorch",
    inputs=[image_input]
)
mlmodel.save("original.mlpackage")
This file converts the PyTorch model to a Core ML model.
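One note from me: the internal output tensor name (the "var_4053" used below) can differ between conversions, so it is worth printing the converted model's output names first. A minimal sketch, assuming the model was saved as above:
import coremltools as ct

# Load the converted model and print its output feature names; the internal
# name (e.g. "var_4053") is what the squeeze layer in the next script must
# take as its input.
mlmodel = ct.models.MLModel("original.mlpackage")
spec = mlmodel.get_spec()
for output in spec.description.output:
    print(output.name)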
Then, in another file, to turn the output into an image:
import coremltools as ct
import torch
import coremltools.proto.FeatureTypes_pb2 as ft
mlmodel = ct.models.MLModel("original.mlpackage")
spec = mlmodel.get_spec()
builder = ct.models.neural_network.NeuralNetworkBuilder(spec=spec)
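# Squeeze the (1, 3, 1024, 1024) output down to (3, 1024, 1024), then add a linear
# activation (alpha=255, beta=0) to scale pixel values from [0, 1] back to [0, 255].
# "var_4053" is the internal name of the converted model's output tensor and may
# differ between conversions.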
builder.add_squeeze(name="squeeze", input_name="var_4053", output_name="squeeze_out", axes=None, squeeze_all=True)
builder.add_activation(name="activation", non_linearity="LINEAR", input_name="squeeze_out", output_name="image", params=[255, 0])
builder.spec.description.output.pop()
builder.spec.description.output.add()
output = builder.spec.description.output[0]
output.name = "image"
output.type.imageType.colorSpace = ft.ImageFeatureType.ColorSpace.Value('RGB')
output.type.imageType.width = 1024
output.type.imageType.height = 1024
mlmodel = ct.models.MLModel(spec)
mlmodel.save("newmodel.mlpackage")
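For reference, here is a minimal prediction sketch using the final model (the file names and test image path are placeholders of mine); coremltools accepts a PIL image for an ImageType input and returns one for an image output:
import coremltools as ct
from PIL import Image

# Load the model with the image output attached.
mlmodel = ct.models.MLModel("newmodel.mlpackage")

# The model expects a 256x256 RGB image named "input" (placeholder test image).
img = Image.open("test.png").convert("RGB").resize((256, 256))

# The output named "image" comes back as a 1024x1024 PIL image.
result = mlmodel.predict({"input": img})
result["image"].save("upscaled.png")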
Hey Vaida! Have you managed to convert other SR models besides this one? It runs pretty slowly on physical devices.
No, I haven't. I suppose these models are slow because they are complex. If you open them in Netron, for example, you can see they are much more complex than Waifu2x. Hence I think I will continue using the C++ implementation.
@Vaida12345 Does the C++ implementation of this run faster than MLModel and Core ML?
Core ML should be the fastest anyway, as it is highly optimized by Apple and uses the CPU, the GPU, and the Apple Neural Engine, while the C++ implementation uses only the GPU. However, testing the models on a physical machine, I found them to be nearly equally fast. Considering you need to deal with the alpha channel and post-processing yourself, I would recommend the C++ implementation.
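To illustrate the alpha-channel point: with the Core ML route you would run only the RGB channels through the model, upscale the alpha channel separately, and recombine them yourself. A rough sketch with placeholder file names, assuming the input is already 256x256 (a real pipeline would also need tiling):
import coremltools as ct
from PIL import Image

mlmodel = ct.models.MLModel("newmodel.mlpackage")
img = Image.open("input_rgba.png").convert("RGBA")

# The model only handles RGB, so split off the alpha channel first.
rgb = img.convert("RGB")
alpha = img.getchannel("A")

# Super-resolve the RGB part, then upscale alpha with plain bicubic resampling.
upscaled_rgb = mlmodel.predict({"input": rgb})["image"]
upscaled_alpha = alpha.resize(upscaled_rgb.size, Image.BICUBIC)

# Recombine into the final RGBA result.
upscaled_rgb.putalpha(upscaled_alpha)
upscaled_rgb.save("upscaled_rgba.png")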
Hi, thank you for the great work! Could you please provide the code for converting the RealESRGAN model to a Core ML model?
I tried to build it myself, but the output is apparently wrong.