jbrry / Irish-BERT

Repository to store helper scripts for creating an Irish BERT model.

Effect of filtering (near) duplicates #73

Open jowagner opened 3 years ago

jowagner commented 3 years ago

What would happen if we added OpusFilter, or some other (near-)duplicate removal tool, to the pipeline?

Literature: