PiotrNawrot / nanoT5

Fast & Simple repository for pre-training and fine-tuning T5-style models
Apache License 2.0

About Pre-training objectives #38

Closed · Respaired closed this 5 months ago

Respaired commented 5 months ago

Hi, thanks for providing this implementation. I really appreciate it. I'm a bit new to training encoder-decoder models, so I was wondering if you could answer one question.

If my understanding is correct, the regular T5 pre-training objective is very similar to MLM: you mask some tokens and have the model learn to predict them. So I want to know whether, instead of masking tokens, I could corrupt my whole dataset (20% of the tokens in each row) by replacing them with other tokens (no fancy generator-discriminator setup, just corrupting the data during the pre-processing step) and treat it as a grammar/typo correction task, where the labels are the original, clean text. Could this be a viable objective?

input:"the katt jamped over the fense" label: "the cat jumped over the fence"

May I ask what you think of this approach?

PiotrNawrot commented 5 months ago

Hey,

It all sounds good to me. I can't tell whether this would work better or worse than the regular MLM-style objective used in T5, but you could try it!

Good luck!