apple / coremltools

Core ML tools contain supporting tools for Core ML model conversion, editing, and validation.
https://coremltools.readme.io
BSD 3-Clause "New" or "Revised" License

About Image Input and Output. #2213

Open jhl13 opened 4 months ago

jhl13 commented 4 months ago

I have a few questions and concerns. I have a denoising model where I preprocess the input by dividing it by 255 and postprocess the output by multiplying it by 255. However, when I use image input and output, I encounter the following issues:

  1. When I use input = ct.ImageType(name='input', shape=(1, 3, 1080, 1920), color_layout=ct.colorlayout.RGB, scale=1/255.) as the input conversion for the model, it inserts a mul node that computes in fp32, which is very slow. Is there a way to force this scale node to compute in fp16? Also, because the subsequent convolutions default to fp16, a cast operator has to be added to convert the mul node's fp32 output to fp16.
  2. With output = ct.ImageType(name='output', color_layout=ct.colorlayout.RGB), the scale must be set to 1.0, which is very inconvenient because the multiplication by 255 then requires additional post-processing.

Is there a way to solve these issues?


TobyRoseman commented 4 months ago

1 - Try passing compute_precision=coremltools.precision.FLOAT16 to coremltools.convert.

2 - I don't understand the issue here. Do you want your model to accept images of different sizes?
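
For reference, here is a minimal sketch of how the two points above might look in code. It assumes the denoiser is a PyTorch model that has been traced with torch.jit.trace, which the thread does not state. The helper names with_output_scale and convert_denoiser_fp16 are hypothetical. Whether this actually removes the fp32 mul and the extra cast is not confirmed in this thread, and some coremltools versions may also require a minimum_deployment_target for image outputs.

```python
def with_output_scale(model, scale=255.0):
    """Wrap `model` so the multiply by `scale` happens inside the graph.

    Hypothetical helper: it assumes `model` is a torch.nn.Module whose
    output is in the [0, 1] range, as described in the question.
    """
    # Imported lazily so this helper can be defined without PyTorch installed.
    import torch

    class ScaledOutput(torch.nn.Module):
        def __init__(self, inner, factor):
            super().__init__()
            self.inner = inner
            self.factor = factor

        def forward(self, x):
            # The postprocessing step from the question, moved into the model
            # so the image output can keep its scale of 1.0.
            return self.inner(x) * self.factor

    return ScaledOutput(model, scale).eval()


def convert_denoiser_fp16(traced_model):
    """Convert a traced model with image input/output and FLOAT16 compute.

    Hypothetical helper: `traced_model` is assumed to come from
    torch.jit.trace on the (optionally wrapped) denoiser.
    """
    # Imported lazily so this helper can be defined without coremltools installed.
    import coremltools as ct

    image_input = ct.ImageType(
        name="input",
        shape=(1, 3, 1080, 1920),
        color_layout=ct.colorlayout.RGB,
        scale=1 / 255.0,  # the preprocessing step from the question
    )
    image_output = ct.ImageType(name="output", color_layout=ct.colorlayout.RGB)

    return ct.convert(
        traced_model,
        inputs=[image_input],
        outputs=[image_output],
        convert_to="mlprogram",  # compute_precision applies to ML programs
        compute_precision=ct.precision.FLOAT16,
    )
```

A possible usage, under the same assumptions: wrap the model with with_output_scale(model), trace it with torch.jit.trace on a (1, 3, 1080, 1920) example tensor, and pass the traced result to convert_denoiser_fp16.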