Convolution dimensions were mismatched because there was no shared layer.
This push adds a single shared cnn_block that maps the embedding dimension to the CNN dimension. (A linear projection would also work, but we stick with convolution to match the other baselines.)
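A minimal sketch of what such a shared block could look like (the class name SharedCNNBlock and the argument names emb_dim/cnn_dim are illustrative, not the repo's actual identifiers):

```python
import torch
import torch.nn as nn

class SharedCNNBlock(nn.Module):
    """One shared conv layer projecting embedding dim -> CNN dim."""

    def __init__(self, emb_dim: int, cnn_dim: int, kernel_size: int = 3):
        super().__init__()
        # padding = kernel_size // 2 preserves sequence length
        # for odd kernel sizes.
        self.conv = nn.Conv1d(emb_dim, cnn_dim, kernel_size,
                              padding=kernel_size // 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, emb_dim); Conv1d expects channels first.
        x = self.conv(x.transpose(1, 2))
        return torch.relu(x).transpose(1, 2)  # (batch, seq_len, cnn_dim)
```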
The block is added to the optimizers via self.other_components({"shared": {"lr": self.learning_rate}, ...}) (following the IMN implementation, which uses the same pattern).
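For context, a hedged sketch of what that registration likely amounts to, expressed with plain PyTorch param groups (other_components is this project's own hook; the param-group equivalent below is an assumption, not the actual implementation):

```python
import torch

shared = SharedCNNBlock(emb_dim=300, cnn_dim=256)
optimizer = torch.optim.Adam([
    # The "shared" entry gets its own learning rate, mirroring
    # {"shared": {"lr": self.learning_rate}} above.
    {"params": shared.parameters(), "lr": 1e-3},
    # Further component groups elided, matching the "..." in the call.
])
```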
We also make the kernel size configurable, giving RACL more tunable hyperparameters.
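A quick shape check of the configurable kernel size, reusing the illustrative SharedCNNBlock from above (all dimensions are arbitrary):

```python
x = torch.randn(8, 40, 300)  # (batch, seq_len, emb_dim)
for k in (3, 5, 7):  # odd sizes keep the sequence length unchanged
    block = SharedCNNBlock(emb_dim=300, cnn_dim=256, kernel_size=k)
    assert block(x).shape == (8, 40, 256)
```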