Convolution dimensions were mismatched because there was no shared layer.
This push adds a single shared cnn_block that maps the embedding dimension to the CNN dimension. (A linear projection would also work, but we stick with convolution to match the other baselines.)
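A minimal sketch of what such a shared block could look like (the class name SharedCNNBlock and the argument names emb_dim/cnn_dim are illustrative, not the repo's actual identifiers):

```python
import torch
import torch.nn as nn

class SharedCNNBlock(nn.Module):
    """One shared conv layer projecting embedding dim -> CNN dim."""

    def __init__(self, emb_dim: int, cnn_dim: int, kernel_size: int = 3):
        super().__init__()
        # padding = kernel_size // 2 preserves sequence length
        # for odd kernel sizes.
        self.conv = nn.Conv1d(emb_dim, cnn_dim, kernel_size,
                              padding=kernel_size // 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, emb_dim); Conv1d expects channels first.
        x = self.conv(x.transpose(1, 2))
        return torch.relu(x).transpose(1, 2)  # (batch, seq_len, cnn_dim)
```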
The block is added to the optimizers via self.other_components({"shared": {"lr": self.learning_rate}, ...}) (following the IMN implementation, which uses the same pattern).
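For context, a hedged sketch of what that registration likely amounts to, expressed with plain PyTorch param groups (other_components is this project's own hook; the param-group equivalent below is an assumption, not the actual implementation):

```python
import torch

shared = SharedCNNBlock(emb_dim=300, cnn_dim=256)
optimizer = torch.optim.Adam([
    # The "shared" entry gets its own learning rate, mirroring
    # {"shared": {"lr": self.learning_rate}} above.
    {"params": shared.parameters(), "lr": 1e-3},
    # Further component groups elided, matching the "..." in the call.
])
```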
We also make the kernel size configurable, giving RACL more tunable hyperparameters.
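A quick shape check of the configurable kernel size, reusing the illustrative SharedCNNBlock from above (all dimensions are arbitrary):

```python
x = torch.randn(8, 40, 300)  # (batch, seq_len, emb_dim)
for k in (3, 5, 7):  # odd sizes keep the sequence length unchanged
    block = SharedCNNBlock(emb_dim=300, cnn_dim=256, kernel_size=k)
    assert block(x).shape == (8, 40, 256)
```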