mlech26l / ncps

PyTorch and TensorFlow implementation of NCP, LTC, and CfC wired neural models
https://www.nature.com/articles/s42256-020-00237-3
Apache License 2.0

Not an issue, but a suggestion. #41

Closed R-Liebert closed 1 year ago

R-Liebert commented 1 year ago

Hi Mathias! I'm a great fan of what you and Ramin are doing, and I'm currently working on my master's thesis in robotics based on the CfC. Playing around with the ncps package, I noticed a pretty extreme improvement in training time when using an impala-cnn in the atari-tf example (BC). The convolution block I used was pretty simple:

import tensorflow as tf
from tensorflow.keras.layers import Conv2D, BatchNormalization, ReLU, Flatten, Dense


class impalaConvLayer(tf.keras.layers.Layer):
    # Conv -> BatchNorm -> ReLU unit with IMPALA-style variance-scaling initialization
    def __init__(self, filters, kernel_size, strides, padding='valid', use_bias=False):
        super(impalaConvLayer, self).__init__()
        self.conv = Conv2D(
            filters=filters,
            kernel_size=kernel_size,
            strides=strides,
            padding=padding,
            use_bias=use_bias,
            kernel_initializer=tf.keras.initializers.VarianceScaling(
                scale=2.0, mode='fan_out', distribution='truncated_normal'
            ),
        )
        self.bn = BatchNormalization(momentum=0.99, epsilon=0.001)
        self.relu = ReLU()

    @tf.function
    def call(self, inputs):
        x = self.conv(inputs)
        x = self.bn(x)
        x = self.relu(x)
        return x


class ImpalaConvBlock(tf.keras.models.Sequential):
    # Three conv layers followed by a flatten and a 256-unit dense feature layer
    def __init__(self):
        super(ImpalaConvBlock, self).__init__(layers=[
            impalaConvLayer(filters=16, kernel_size=8, strides=4),
            impalaConvLayer(filters=32, kernel_size=4, strides=2),
            impalaConvLayer(filters=32, kernel_size=3, strides=1),
            Flatten(),
            Dense(units=256, activation='relu'),
        ])

As training time on weak computers often discourages students, I think making the example run faster could be wise. What do you think about using an impala-cnn before the CfC? Is there something I've overlooked that makes this a bad idea?
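For context, here is a rough sketch of how the block can sit in front of the CfC; the input shape, the 64 CfC units, and the action head are just illustrative placeholders, not the exact values from the example:

import tensorflow as tf
from ncps.tf import CfC

# Rough sketch: apply the ImpalaConvBlock to every frame via TimeDistributed,
# then let the CfC model the temporal dynamics over the per-frame features.
# Input shape, CfC size, and the softmax head below are illustrative assumptions.
def build_impala_cfc(num_actions):
    return tf.keras.models.Sequential([
        tf.keras.layers.InputLayer(input_shape=(None, 84, 84, 4)),   # (time, height, width, channels)
        tf.keras.layers.TimeDistributed(ImpalaConvBlock()),          # per-frame feature extractor
        CfC(64, return_sequences=True),                              # continuous-time recurrent core
        tf.keras.layers.TimeDistributed(
            tf.keras.layers.Dense(num_actions, activation='softmax')  # behavioral-cloning action head
        ),
    ])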

Anyways, keep up the good work! What you've accomplished is really inspiring!

Robin

mlech26l commented 1 year ago

Thanks @R-Liebert for pointing this out. I have updated the docs to highlight this and changed the TF example accordingly.