Reworking of MACE for cleanliness, robustness to hyperparameter configurations, and numerical stability in decoders.
Main Changes:
(1) General code cleanup and improved comments; some blocks were moved to the utils.
(2) Handle configurations with only one convolutional layer, where both first_layer and last_layer are true.
(3) Do not append multiple layers in the decoder block when the decoding is purely linear. Linear layers stacked on top of each other without activations are still equivalent to a single linear map, and the extra layers can cause more erratic training and numerical instability.
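
To illustrate the reasoning behind (3), here is a minimal sketch in plain Python. It is not the MACE decoder code. The helpers `linear` and `matmul` and the weights `W1`, `W2` are hypothetical names chosen for this example, and biases are omitted for brevity. It shows that two stacked linear layers with no activation in between give the same output as one layer whose weight is the product of the two.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def linear(weight, x):
    """Apply a bias-free linear layer y = W x to a vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weight]


# Hypothetical weights: W1 maps 2 -> 3 features, W2 maps 3 -> 1 feature.
W1 = [[1.0, 2.0], [0.0, 1.0], [3.0, -1.0]]
W2 = [[1.0, 0.0, 2.0]]
x = [0.5, -2.0]

# Two linear layers applied back to back, no activation in between.
stacked = linear(W2, linear(W1, x))

# A single linear layer with the combined weight W2 @ W1.
W_eff = matmul(W2, W1)
collapsed = linear(W_eff, x)

print(stacked, collapsed)
```

Because the stacked version represents the same function as the single layer, the additional layers only add redundant parameters to optimize, which is consistent with keeping one layer for purely linear decoding.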