ybracke / transnormer

A lexical normalizer for historical spelling variants using a transformer architecture.
GNU General Public License v3.0

Experiment with randomly initialized encoder #55

Open ybracke opened 1 year ago

ybracke commented 1 year ago

How well does the model work if we replace the pre-trained encoder (and decoder) with a randomly initialized one (Rnd2Rnd)?

ybracke commented 1 year ago

A first experiment with a randomly initialized encoder and decoder is `pudgy-jear` (hidden commit: e7a7ab7). Check out this experiment with `dvc exp apply pudgy-jear` to inspect its associated model.

This model was trained with a randomly initialized version of dbmdz/bert-base-historic-multilingual-cased as both encoder and decoder, on 100_000 training examples from dtak-1600-1699 (CAB normalized), for 3 epochs. At first glance, the predictions of this model look worse than those of a model initialized with the pre-trained historic encoder. However, the loss is still decreasing in epoch 3, so with enough training this model might yet perform as well as one initialized from the pre-trained weights. This should be investigated further.
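
For reference, a minimal sketch of how such a Rnd2Rnd setup could be built with the Hugging Face `transformers` `EncoderDecoderModel` API: only the architecture config of dbmdz/bert-base-historic-multilingual-cased is loaded, so both encoder and decoder weights are randomly initialized. This is an illustrative assumption, not necessarily the repository's actual setup code.

```python
from transformers import (
    AutoConfig,
    AutoTokenizer,
    EncoderDecoderConfig,
    EncoderDecoderModel,
)

checkpoint = "dbmdz/bert-base-historic-multilingual-cased"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)

# Load only the configuration (architecture), not the pre-trained weights.
encoder_config = AutoConfig.from_pretrained(checkpoint)
decoder_config = AutoConfig.from_pretrained(
    checkpoint, is_decoder=True, add_cross_attention=True
)

# Building the model from configs alone yields randomly initialized
# encoder and decoder (Rnd2Rnd), in contrast to
# EncoderDecoderModel.from_encoder_decoder_pretrained(...).
config = EncoderDecoderConfig.from_encoder_decoder_configs(
    encoder_config, decoder_config
)
model = EncoderDecoderModel(config=config)

# Token IDs required for seq2seq training and generation.
model.config.decoder_start_token_id = tokenizer.cls_token_id
model.config.pad_token_id = tokenizer.pad_token_id
```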