Normalization: use LayerNorm (there is also RMSNorm) before each layer. What about instance/group norms?
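A quick PyTorch comparison of the candidates (a sketch; note `nn.RMSNorm` only exists in recent PyTorch, 2.4+, and the instance/group norms expect channel-first inputs, which is why they are less common in front of plain linear layers):

```python
import torch
import torch.nn as nn

# a batch of flat feature vectors: (batch, features)
x = torch.randn(8, 64)

# LayerNorm normalizes over the trailing feature dimension
ln = nn.LayerNorm(64)

# RMSNorm (PyTorch >= 2.4): like LayerNorm, but rescales by the
# root-mean-square only, with no mean subtraction
rms = nn.RMSNorm(64)

print(ln(x).shape, rms(x).shape)  # both torch.Size([8, 64])

# Instance/GroupNorm normalize over channel groups of (N, C, ...)
# inputs, so they fit conv-style feature maps better than flat
# MLP features; e.g. for a (batch, channels, length) sequence:
seq = torch.randn(8, 64, 100)
gn = nn.GroupNorm(num_groups=8, num_channels=64)
inorm = nn.InstanceNorm1d(64)
print(gn(seq).shape, inorm(seq).shape)  # both torch.Size([8, 64, 100])
```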
Maybe: before computing the loss, pass both the output and the target through a normalization (maybe another LayerNorm) to keep gradient magnitudes normalized? Only apply this in train/val/test, not in predict.
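A minimal sketch of how that could look, assuming an affine-free LayerNorm so the standardization applied to output and target has no learned parameters (the `NormalizedLoss` name and the MSE choice are just illustrative, not an established recipe):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NormalizedLoss(nn.Module):
    """Standardize prediction and target before the loss (train/val/test
    only); predictions served at inference skip this module entirely."""

    def __init__(self, num_features: int):
        super().__init__()
        # elementwise_affine=False -> pure standardization, no learned
        # scale/shift, so the target statistics are not trainable
        self.norm = nn.LayerNorm(num_features, elementwise_affine=False)

    def forward(self, output: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
        return F.mse_loss(self.norm(output), self.norm(target))

# usage inside a training/validation/test step:
criterion = NormalizedLoss(num_features=10)
loss = criterion(torch.randn(4, 10), torch.randn(4, 10))
```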
Activation functions: is Mish better than SiLU? Also, should the ordering be input -> [LayerNorm -> activation -> linear layer] -> ... -> last linear layer (i.e., the output) (-> optionally, for train/val/test: LayerNorm -> loss)?
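A minimal sketch of that ordering, reading the bracketed block literally; Mish is used here, and swapping in `nn.SiLU()` is the one-line change needed to compare the two:

```python
import torch
import torch.nn as nn

def block(in_dim: int, out_dim: int) -> nn.Sequential:
    # the [LayerNorm -> activation -> linear] ordering from the note
    return nn.Sequential(nn.LayerNorm(in_dim), nn.Mish(), nn.Linear(in_dim, out_dim))

model = nn.Sequential(
    block(32, 64),      # input -> first block
    block(64, 64),
    nn.Linear(64, 10),  # last linear layer, i.e. the raw output
)

print(model(torch.randn(8, 32)).shape)  # torch.Size([8, 10])
```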
Transformers might be a good alternative to LSTMs for time series. Mamba as well (not available in PyTorch yet: https://github.com/pytorch/pytorch/issues/120189). Other ideas: an LSTM with encoders/decoders (seq2seq); an LSTM with an attention head (sketched below).
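A rough sketch of that last idea, assuming a single-layer LSTM whose outputs are fed through a self-attention head before a linear forecasting layer (the class name and sizes are illustrative):

```python
import torch
import torch.nn as nn

class LSTMWithAttention(nn.Module):
    """Hypothetical sketch: LSTM encoder + self-attention over its
    outputs, predicting from the last attended time step."""

    def __init__(self, in_dim: int, hidden: int, out_dim: int, heads: int = 4):
        super().__init__()
        self.lstm = nn.LSTM(in_dim, hidden, batch_first=True)
        self.attn = nn.MultiheadAttention(hidden, heads, batch_first=True)
        self.head = nn.Linear(hidden, out_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        seq, _ = self.lstm(x)                   # (batch, time, hidden)
        attended, _ = self.attn(seq, seq, seq)  # self-attention over time
        return self.head(attended[:, -1])       # forecast from last step

model = LSTMWithAttention(in_dim=8, hidden=64, out_dim=1)
print(model(torch.randn(4, 50, 8)).shape)  # torch.Size([4, 1])
```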
Check out https://github.com/timeseriesAI/tsai for time-series datasets and models built on fastai; there are also pytorch-forecasting (built on PyTorch Lightning) and sktime.