Hi, thanks for giving us this implementation. I really appreciate it.
I'm a bit new to training encoder-decoder models, so I was wondering if you could answer one question.
If my understanding is correct, T5's regular pre-training objective is very similar to MLM, in that you mask some tokens and have the model learn to predict them. So I want to know: if, instead of masking tokens, I corrupt my whole dataset (20% of the tokens in each row) by replacing them with other tokens (no fancy generator-discriminator setup, just corrupting the data during the pre-processing step) and treat it as a grammar / typo-correction task where the labels are the original, clean text, could that be a viable objective?
input:"the katt jamped over the fense"
label: "the cat jumped over the fence"
May I ask what you think about this?