apple / coremltools

Core ML tools contain supporting tools for Core ML model conversion, editing, and validation.
https://coremltools.readme.io
BSD 3-Clause "New" or "Revised" License

coremltools 6.2 has trouble converting the layer mvit_block_0_transformer_0_attention #1796

Open cmsgiowa opened 1 year ago

cmsgiowa commented 1 year ago

❓Question

Layer 55: mvit_block_0_transformer_0_LN1
    Input shape: (None, 4, 1024, 144)
    Output shape: (None, 4, 1024, 144)
    Input spec: None
Layer 56: mvit_block_0_transformer_0_attention
    Input shape: (None, 4, 1024, 144)
    Output shape: (None, 4, 1024, 144)
    Input spec: None

lib/python3.7/site-packages/coremltools/converters/mil/mil/operation.py", line 318, in type_value_inference
    f'Core ML only supports tensors with rank <= 5. Layer "{self.name}", '
ValueError: Core ML only supports tensors with rank <= 5. Layer "reshape_16", with type "reshape", outputs a rank 6 tensor.

Any ideas?

Thanks a lot.

junpeiz commented 1 year ago

Hi @cmsgiowa, it's because a layer in your model produces a rank-6 tensor, which is not supported by the Core ML framework. Could you modify your model to avoid that high-rank tensor? For example, you could add a reshape that merges several dimensions and then reshape back later.

Layers 55 and 56 should not be the issue, because their tensors only have rank 4.
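A minimal sketch of that kind of workaround (illustrative only, not tested against your model; it uses the (None, 4, 1024, 144) shape from your log, and `x`, `num_heads`, `projection_dim`, and `dropout` are assumed to be whatever your block already defines): fold the two middle dimensions into a single sequence axis before the attention call, then unfold afterwards, so the attention internals never need a tensor above rank 5.

```python
from tensorflow.keras import layers

# Sketch: x has shape (None, 4, 1024, 144) as in the log above.
# Merge the 4 x 1024 dimensions into one sequence axis of length 4096.
x_flat = layers.Reshape((4 * 1024, 144))(x)           # (None, 4096, 144)

attn = layers.MultiHeadAttention(num_heads=num_heads,
                                 key_dim=projection_dim,
                                 dropout=dropout)(x_flat, x_flat)

# Restore the original (4, 1024, 144) layout once attention is done.
attn = layers.Reshape((4, 1024, 144))(attn)           # (None, 4, 1024, 144)
```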

cmsgiowa commented 1 year ago

Hi @junpeiz,

Thanks for the information. I still don't understand the error message about "outputs a rank 6 tensor". Let me put it this way:

layer_name = 'mvit_block_0_transformer_0_LN1'  # Layer 55
model_new = Model(inputs=model.input, outputs=model.get_layer(layer_name).output)
mlmodel = ct.convert(model_new, inputs=[ct.ImageType()])

layer_name = 'mvit_block_0_transformer_0_attention'  # Layer 56
model_new = Model(inputs=model.input, outputs=model.get_layer(layer_name).output)
mlmodel = ct.convert(model_new, inputs=[ct.ImageType()])

In short, I don't know where the "rank 6" error came from.

Best

junpeiz commented 1 year ago

With the information provided, I cannot tell where the "rank 6" comes from. Could you provide a minimal code snippet that can reproduce this issue? Thanks!

cmsgiowa commented 1 year ago

Thanks again @junpeiz! Here is the definition of the transformer block containing the two layers mentioned above, in case it helps:

x1 = LayerNormalization(epsilon=1e-6, name=prefix+'_LN1')(x)
attention_output = MultiHeadAttention(num_heads=num_heads,
                                      key_dim=projection_dim,
                                      dropout=dropout,
                                      name=prefix+'_attention')(x1, x1)
x2 = Add(name=prefix+'_add1')([attention_output, x])
x3 = LayerNormalization(epsilon=1e-6, name=prefix+'_LN2')(x2)
x3 = feedforward(x3, hidden_units=[x.shape[-1] * 2, x.shape[-1]],
                 dropout_rate=dropout,
                 name=prefix+'_ff')
x = Add(name=prefix+'_add2')([x3, x2])

TobyRoseman commented 1 year ago

@cmsgiowa - this is not a minimal example. I cannot even run this code; many things are undefined. I suggest you update your model to print the output shape after each layer. Then you should be able to see where a rank 6 tensor is being produced.
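
A minimal sketch of that kind of check (assuming `model` is the Keras model being passed to `ct.convert`, and that each layer has a single output; any layer whose rank exceeds 5 is the one to restructure):

```python
# Sketch: walk the Keras model and report each layer's output shape,
# flagging anything that would exceed Core ML's rank-5 limit.
for layer in model.layers:
    shape = layer.output.shape
    rank = len(shape)
    flag = "  <-- rank > 5, unsupported by Core ML" if rank > 5 else ""
    print(f"{layer.name}: {tuple(shape)} (rank {rank}){flag}")
```

Note that the offending `reshape_16` may also be an op the converter generates internally (for example inside MultiHeadAttention), so if no Keras layer reports a rank above 5, the high-rank tensor is likely created inside one of the composite layers rather than between them.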