ybracke opened 1 year ago
A first experiment with a randomly initialized encoder and decoder is `pudgy-jear`
(hidden commit: `e7a7ab7`).
Check out this experiment with `dvc exp apply pudgy-jear`
to inspect its associated model.
This model was trained with a randomly initialized version of `dbmdz/bert-base-historic-multilingual-cased`
as both the encoder and the decoder, on 100_000 training examples from dtak-1600-1699 (CAB normalized) for 3 epochs. At first glance, the predictions of this model look worse than those of a model initialized with a pre-trained historic encoder. However, the loss is still decreasing in epoch 3, so with enough training this model might still perform as well as one with pre-training. This should be investigated further.
How well does the model work if we replace the pre-trained encoder (and decoder) with a randomly initialized one (Rnd2Rnd)?