Keras CapsuleNetwork exported model fails to run.

teaglin commented 1 year ago

🐞Describing the bug

I am trying to convert a Keras model whose custom layers implement a Capsule Network. The model exports but fails to load and run in Xcode.

To Reproduce

from keras.models import Model
from keras import Input
import numpy as np
import tensorflow as tf

import coremltools as ct

class Squash(tf.keras.layers.Layer):
    def __init__(self, eps=10e-21, **kwargs):
        super().__init__(**kwargs)
        self.eps = eps

    def call(self, s):
        n = tf.norm(s,axis=-1,keepdims=True)
        return (1 - 1/(tf.math.exp(n)+self.eps))*(s/(n+self.eps))

    def get_config(self):
        base_config = super().get_config()
        return {**base_config}

    def compute_output_shape(self, input_shape):
        return input_shape

class FCCaps(tf.keras.layers.Layer):
    """
    Fully-connected caps layer. It exploits the routing mechanism, explained in 'Efficient-CapsNet: Capsule Network with Self-Attention Routing',
    to create a parent layer of capsules. 

    ...

    Attributes
    ----------
    N: int
        number of primary capsules
    D: int
        primary capsules dimension (number of properties)
    kernel_initializer: str
        matrix W initialization strategy

    Methods
    -------
    call(inputs)
        compute the primary capsule layer
    """
    def __init__(self, N, D, kernel_initializer='he_normal', **kwargs):
        super(FCCaps, self).__init__(**kwargs)
        self.N = N
        self.D = D
        self.kernel_initializer = tf.keras.initializers.get(kernel_initializer)

    def build(self, input_shape):
        input_N = input_shape[-2]
        input_D = input_shape[-1]

        self.W = self.add_weight(shape=[self.N, input_N, input_D, self.D],initializer=self.kernel_initializer,name='W')
        self.b = self.add_weight(shape=[self.N, input_N,1], initializer=tf.zeros_initializer(), name='b')
        self.built = True

    def call(self, inputs, training=None):

        u = tf.einsum('...ji,kjiz->...kjz',inputs,self.W)    # u shape=(None,N,H*W*input_N,D)

        c = tf.einsum('...ij,...kj->...i', u, u)[...,None]        # b shape=(None,N,H*W*input_N,1) -> (None,j,i,1)
        c = c/tf.sqrt(tf.cast(self.D, tf.float32))
        c = tf.nn.softmax(c, axis=1)                             # c shape=(None,N,H*W*input_N,1) -> (None,j,i,1)
        c = c + self.b
        s = tf.reduce_sum(tf.multiply(u, c),axis=-2)             # s shape=(None,N,D)
        v = Squash()(s)       # v shape=(None,N,D)

        return v

    def compute_output_shape(self, input_shape):
        # self.C and self.L were never defined; the output is (batch, N capsules, D dims)
        return (None, self.N, self.D)

    def get_config(self):
        config = {
            'N': self.N,
            'D': self.D
        }
        base_config = super(FCCaps, self).get_config()
        return dict(list(base_config.items()) + list(config.items()))

class PrimaryCaps(tf.keras.layers.Layer):
    """
    Create a primary capsule layer with the methodology described in 'Efficient-CapsNet: Capsule Network with Self-Attention Routing'. 
    Properties of each capsule s_n are extracted using a 2D depthwise convolution.

    ...

    Attributes
    ----------
    F: int
        depthwise conv number of features
    K: int
        depthwise conv kernel dimension
    N: int
        number of primary capsules
    D: int
        primary capsules dimension (number of properties)
    s: int
        depthwise conv strides
    Methods
    -------
    call(inputs)
        compute the primary capsule layer
    """
    def __init__(self, F, K, N, D, s=1, **kwargs):
        super(PrimaryCaps, self).__init__(**kwargs)
        self.F = F
        self.K = K
        self.N = N
        self.D = D
        self.s = s

    def build(self, input_shape):    
        self.DW_Conv2D = tf.keras.layers.Conv2D(self.F, self.K, self.s,
                                             activation='linear', groups=self.F, padding='valid')

        self.built = True

    def call(self, inputs):      
        x = self.DW_Conv2D(inputs)      
        x = tf.keras.layers.Reshape((self.N, self.D))(x)
        x = Squash()(x)
        return x

    def get_config(self):
        config = {
            'F': self.F,
            'K': self.K,
            'N': self.N,
            'D': self.D,
            's': self.s
        }
        base_config = super(PrimaryCaps, self).get_config()
        return dict(list(base_config.items()) + list(config.items()))

inputs = Input((28,28,1))
x = tf.keras.layers.Conv2D(32,5,activation="relu", padding='valid', kernel_initializer='he_normal')(inputs)
x = tf.keras.layers.BatchNormalization()(x)
x = tf.keras.layers.Conv2D(64,3, activation='relu', padding='valid', kernel_initializer='he_normal')(x)
x = tf.keras.layers.BatchNormalization()(x)
x = tf.keras.layers.Conv2D(64,3, activation='relu', padding='valid', kernel_initializer='he_normal')(x)   
x = tf.keras.layers.BatchNormalization()(x)
x = tf.keras.layers.Conv2D(128,3,2, activation='relu', padding='valid', kernel_initializer='he_normal')(x)
x = tf.keras.layers.BatchNormalization()(x)
x = PrimaryCaps(128, 9, 16, 8)(x)
x = FCCaps(10,16)(x)
m = Model(inputs=inputs, outputs=[x], name='CapsNet_Example')
q = m.predict(np.zeros((1,28,28,1), dtype=np.float32))

print(m.summary())
print(q)

coreml_model = ct.convert(
    m,
    compute_precision=ct.precision.FLOAT32,
    minimum_deployment_target=ct.target.iOS16,
    source='tensorflow'
)
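As a sanity check on the layer arguments above, here is a minimal sketch of the shape arithmetic, assuming the standard output-size formula for unpadded ('valid') convolutions. The helper name `valid_conv_out` is hypothetical and is not part of Keras or coremltools. It illustrates why `PrimaryCaps(128, 9, 16, 8)` can reshape its convolution output into 16 capsules of 8 dimensions.

```python
def valid_conv_out(size: int, kernel: int, stride: int = 1) -> int:
    """Output length of a 'valid' (unpadded) convolution along one axis."""
    return (size - kernel) // stride + 1

# (kernel, stride) for the four Conv2D layers, then the 9x9 conv inside PrimaryCaps
layers = [(5, 1), (3, 1), (3, 1), (3, 2), (9, 1)]

size = 28  # input images are 28x28
sizes = []
for kernel, stride in layers:
    size = valid_conv_out(size, kernel, stride)
    sizes.append(size)

channels = 128                 # F in PrimaryCaps
capsules, capsule_dim = 16, 8  # N and D in PrimaryCaps

print(sizes)  # [24, 22, 20, 9, 1]
# The final 1x1x128 feature map holds exactly N*D = 128 values.
print(sizes[-1] ** 2 * channels == capsules * capsule_dim)  # True
```

The first four sizes match the spatial dimensions in the model summary under Output.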

Output

Model: "CapsNet_Example"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_1 (InputLayer)        [(None, 28, 28, 1)]       0         

 conv2d (Conv2D)             (None, 24, 24, 32)        832       

 batch_normalization (BatchN  (None, 24, 24, 32)       128       
 ormalization)                                                   

 conv2d_1 (Conv2D)           (None, 22, 22, 64)        18496     

 batch_normalization_1 (Batc  (None, 22, 22, 64)       256       
 hNormalization)                                                 

 conv2d_2 (Conv2D)           (None, 20, 20, 64)        36928     

 batch_normalization_2 (Batc  (None, 20, 20, 64)       256       
 hNormalization)                                                 

 conv2d_3 (Conv2D)           (None, 9, 9, 128)         73856     

 batch_normalization_3 (Batc  (None, 9, 9, 128)        512       
 hNormalization)                                                 

 primary_caps (PrimaryCaps)  (None, 16, 8)             10496     

 fc_caps (FCCaps)            (None, 10, 16)            20640     

=================================================================
Total params: 162,400
Trainable params: 161,824
Non-trainable params: 576
_________________________________________________________________
None
[[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
  [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
  [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
  [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
  [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
  [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
  [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
  [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
  [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
  [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]]
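The all-zero prediction is probably expected rather than a symptom of the bug: the input is all zeros, and the Squash formula maps a zero vector to a zero vector. The following is a minimal pure-Python sketch of the same formula used in `Squash.call`, written for a single vector instead of a tensor. The function name `squash` is only illustrative.

```python
import math

def squash(s, eps=10e-21):
    """Scalar re-implementation of Squash.call for one capsule vector."""
    n = math.sqrt(sum(x * x for x in s))     # Euclidean norm of the vector
    scale = 1.0 - 1.0 / (math.exp(n) + eps)  # grows from 0 toward 1 with n
    return [scale * x / (n + eps) for x in s]

zero = squash([0.0] * 4)    # zero vector stays zero
unit = squash([3.0, 4.0])   # direction kept, length squashed below 1
unit_norm = math.sqrt(sum(x * x for x in unit))

print(zero)
print(unit_norm < 1.0)
```

For a nonzero input the direction is preserved and the length becomes roughly `1 - exp(-n)`, which is always below 1.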

Running TensorFlow Graph Passes:   0%|                                                                                        | 0/6 [00:00<?, ? passes/s]
2023-01-29 16:34:03.570944: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:354] MLIR V1 optimization pass is not enabled
Running TensorFlow Graph Passes: 100%|████████████████████████████████████████████████████████████████████████████████| 6/6 [00:00<00:00, 82.80 passes/s]
Converting TF Frontend ==> MIL Ops:  61%|██████████████████████████████████████████████▎                             | 56/92 [00:00<00:00, 4170.18 ops/s]
Traceback (most recent call last):
coremltools/converters/mil/frontend/tensorflow/convert_utils.py", line 190, in convert_graph
    raise NotImplementedError(msg)
NotImplementedError: Conversion for TF op 'PartitionedCall' not implemented.

name: "CapsNet_Example/primary_caps/conv2d/PartitionedCall"
op: "PartitionedCall"
input: "CapsNet_Example/batch_normalization_3/FusedBatchNormV3"
input: "CapsNet_Example/primary_caps/conv2d/ReadVariableOp"
attr {
  key: "Tin"
  value {
    list {
      type: DT_FLOAT
      type: DT_FLOAT
    }
  }
}
attr {
  key: "Tout"
  value {
    list {
      type: DT_FLOAT
    }
  }
}
attr {
  key: "_XlaMustCompile"
  value {
    b: true
  }
}
attr {
  key: "_collective_manager_ids"
  value {
    list {
    }
  }
}
attr {
  key: "_read_only_resource_inputs"
  value {
    list {
    }
  }
}
attr {
  key: "config"
  value {
    s: ""
  }
}
attr {
  key: "config_proto"
  value {
    s: "\n\007\n\003CPU\020\001\n\007\n\003GPU\020\0002\002J\0008\001\202\001\000"
  }
}
attr {
  key: "executor_type"
  value {
    s: ""
  }
}
attr {
  key: "f"
  value {
    func {
      name: "__inference__jit_compiled_convolution_op_278"
    }
  }
}
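The node dump above is long, but only three fields identify the failing operation: the node name, the op type, and the function it calls. Below is a small sketch, using only the standard library, that pulls those fields out of such a dump. The helper `summarize_node` is hypothetical and is not a coremltools or TensorFlow API; the sample text is a shortened copy of the dump above.

```python
import re

def summarize_node(dump: str) -> dict:
    """Extract the top-level name/op and the called function from a node dump."""
    fields = {}
    for key in ("name", "op"):
        # Anchor at line start so the indented func name is not matched here.
        m = re.search(rf'^{key}: "([^"]*)"', dump, flags=re.MULTILINE)
        if m:
            fields[key] = m.group(1)
    m = re.search(r'func\s*\{\s*name: "([^"]*)"', dump)
    if m:
        fields["func"] = m.group(1)
    return fields

sample = '''name: "CapsNet_Example/primary_caps/conv2d/PartitionedCall"
op: "PartitionedCall"
attr {
  key: "f"
  value {
    func {
      name: "__inference__jit_compiled_convolution_op_278"
    }
  }
}
'''

info = summarize_node(sample)
print(info["op"])
print(info["name"].split("/")[1])
print(info["func"])
```

Read this way, the dump shows that the unsupported `PartitionedCall` comes from the `conv2d` layer inside `primary_caps` and wraps a JIT-compiled convolution function.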

System environment (please complete the following information):

coremltools version: 6.1
OS: tested on macOS 13.1, exported on Ubuntu Linux 22.04
TensorFlow version: 2.10

TobyRoseman commented 1 year ago

@teaglin - we need more information here:

1 - When you say the exported models "fails to run", what do you mean? When does it fail? What error message do you get?

2 - In addition to the code for generating the Keras model, please also include the code you use to convert the Keras model to Core ML.

teaglin commented 1 year ago

@TobyRoseman

I updated the original post. I apologize: the wrong Capsule Network implementation was posted. I have corrected it and included the information for both of your questions.

TobyRoseman commented 1 year ago

@teaglin - thanks for sharing the output information. It looks like we are missing support for a TensorFlow op you need. I'm still not able to reproduce this issue. Your code does not have any of the necessary import statements. Also on this line:

m = Model(inputs=inputs, outputs=[x], name='CapsNet_Example')

Model is not defined.

teaglin commented 1 year ago

@TobyRoseman Here is the missing import:

from keras.models import Model

TobyRoseman commented 1 year ago

@teaglin - sounds good, I just wanted to make sure it was not a locally defined model.

I can now reproduce the issue. I have updated the original code in the issue description.
