mlech26l / ncps

PyTorch and TensorFlow implementation of NCP, LTC, and CfC wired neural models
https://www.nature.com/articles/s42256-020-00237-3
Apache License 2.0

Not an issue, but a suggestion. #41

Closed R-Liebert closed 1 year ago

R-Liebert commented 1 year ago

Hi Mathias! I'm a great fan of what you and Ramin are doing, and I'm currently working on my master's thesis in robotics based on the CfC. Playing around with the ncps package, I noticed a pretty extreme improvement in training time when using an impala-cnn in the atari-tf example (BC). The convolution block I used was pretty simple:

import tensorflow as tf
from tensorflow.keras.layers import Conv2D, BatchNormalization, ReLU, Flatten, Dense


class impalaConvLayer(tf.keras.layers.Layer):
    # Conv -> BatchNorm -> ReLU unit with IMPALA-style variance-scaling initialization
    def __init__(self, filters, kernel_size, strides, padding='valid', use_bias=False):
        super(impalaConvLayer, self).__init__()
        self.conv = Conv2D(
            filters=filters,
            kernel_size=kernel_size,
            strides=strides,
            padding=padding,
            use_bias=use_bias,
            kernel_initializer=tf.keras.initializers.VarianceScaling(
                scale=2.0, mode='fan_out', distribution='truncated_normal'
            ),
        )
        self.bn = BatchNormalization(momentum=0.99, epsilon=0.001)
        self.relu = ReLU()

    @tf.function
    def call(self, inputs):
        x = self.conv(inputs)
        x = self.bn(x)
        x = self.relu(x)
        return x


class ImpalaConvBlock(tf.keras.models.Sequential):
    # Three conv layers followed by a flatten and a 256-unit dense feature layer
    def __init__(self):
        super(ImpalaConvBlock, self).__init__(layers=[
            impalaConvLayer(filters=16, kernel_size=8, strides=4),
            impalaConvLayer(filters=32, kernel_size=4, strides=2),
            impalaConvLayer(filters=32, kernel_size=3, strides=1),
            Flatten(),
            Dense(units=256, activation='relu'),
        ])

As training time on weak computers often discourages students, I think making the example run faster could be wise. What do you think about using an impala-cnn before the CfC? Is there something I've overlooked that makes this a bad idea?
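For context, here is a rough sketch of how the block can sit in front of the CfC; the input shape, the 64 CfC units, and the action head are just illustrative placeholders, not the exact values from the example:

import tensorflow as tf
from ncps.tf import CfC

# Rough sketch: apply the ImpalaConvBlock to every frame via TimeDistributed,
# then let the CfC model the temporal dynamics over the per-frame features.
# Input shape, CfC size, and the softmax head below are illustrative assumptions.
def build_impala_cfc(num_actions):
    return tf.keras.models.Sequential([
        tf.keras.layers.InputLayer(input_shape=(None, 84, 84, 4)),   # (time, height, width, channels)
        tf.keras.layers.TimeDistributed(ImpalaConvBlock()),          # per-frame feature extractor
        CfC(64, return_sequences=True),                              # continuous-time recurrent core
        tf.keras.layers.TimeDistributed(
            tf.keras.layers.Dense(num_actions, activation='softmax')  # behavioral-cloning action head
        ),
    ])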

Anyways, keep up the good work! What you've accomplished is really inspiring!

Robin

mlech26l commented 1 year ago

Thanks @R-Liebert for pointing this out. I have updated the docs to highlight this and changed the TF example accordingly.