**Open** · likesum opened this issue 1 year ago
With Conv3D already available in coremltools, support for 3D upsampling layers would be a logical addition. It is much needed for medical image analysis, video analysis, and other volumetric applications. I tried to implement this myself in coremltools, but I believe Core ML itself does not support 3D upsampling. I got stuck here:
```
/Users/laves/projects/coremltools/coremltools/models/model.py:154:
RuntimeWarning: You will not be able to run predict() on this Core ML model.
Underlying exception message was: Error compiling model: "Failed to parse the
model specification. Error: Unable to parse ML Program: in operation op_5_cast:
For operation of type 'upsample_nearest_neighbor' number of inputs must be
within the range (inclusive): 3 : 3. Provided 4".
```
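For context, a minimal sketch of a model that lowers to `upsample_nearest3d` when traced (the module and shapes here are illustrative, not from the original report; passing the traced module to `coremltools.convert(...)` is the step that produces the error quoted above):

```python
import torch

class Up3D(torch.nn.Module):
    def forward(self, x):
        # A 5D (N, C, D, H, W) input makes interpolate dispatch to
        # the aten::upsample_nearest3d op during tracing.
        return torch.nn.functional.interpolate(x, scale_factor=2, mode="nearest")

example = torch.rand(1, 1, 4, 4, 4)
traced = torch.jit.trace(Up3D().eval(), example)
assert traced(example).shape == (1, 1, 8, 8, 8)
# Converting `traced` with coremltools is what fails, since Core ML
# has no 3D upsampling op to map this to.
```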
At least `upsample_nearest3d` could be hacked together for integer scale factors using `mb.conv_transpose` with an S×S×S kernel filled with ones and strides (S, S, S), where S is the scale factor:
```python
import numpy as np
from coremltools.converters.mil import Builder as mb
from coremltools.converters.mil.frontend.torch.ops import _get_inputs
from coremltools.converters.mil.frontend.torch.torch_op_registry import register_torch_op

@register_torch_op
def upsample_nearest3d(context, node):
    inputs = _get_inputs(context, node, expected=3)
    x = inputs[0]
    s = inputs[2]
    c = x.shape[1]  # channel count
    s_d, s_h, s_w = map(int, s.val)  # integer scale factors per spatial dim
    # A depthwise transposed conv with an all-ones kernel and stride S
    # copies each voxel into an SxSxS block, i.e. nearest-neighbor upsampling.
    x = mb.conv_transpose(
        x=x,
        weight=np.ones((c, 1, s_d, s_h, s_w), dtype=np.float32),
        strides=[s_d, s_h, s_w],
        groups=c,
        name=node.name,
    )
    context.add(x)
```
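The trick works because, per channel, a stride-S transposed convolution with an all-ones S×S×S kernel replicates each input voxel into an S×S×S block, which is exactly nearest-neighbor upsampling. That equivalence can be sanity-checked in plain NumPy, where the same operation is a Kronecker product with a block of ones (a sketch for a single-channel volume, not part of the original post):

```python
import numpy as np

def nearest_upsample3d(x, s):
    # Nearest-neighbor upsampling of a (D, H, W) volume by integer scale s:
    # np.kron with an s x s x s block of ones replicates each voxel into
    # an s x s x s block, matching the all-ones conv_transpose above.
    return np.kron(x, np.ones((s, s, s), dtype=x.dtype))

x = np.arange(8, dtype=np.float32).reshape(2, 2, 2)
y = nearest_upsample3d(x, 2)
assert y.shape == (4, 4, 4)
# each voxel is replicated into a 2x2x2 block
assert np.all(y[0:2, 0:2, 0:2] == x[0, 0, 0])
assert np.all(y[2:4, 2:4, 2:4] == x[1, 1, 1])
```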
@mlaves - to request changes to the Core ML Framework, please use the Feedback Assistant.
The same applies to the `trilinear` mode in `torch.nn.functional.interpolate` and `torch.nn.Upsample`.