Open eu9ene opened 1 month ago
See paper: Automatic Data Curation for Self-Supervised Learning: A Clustering-Based Approach.
This could be helpful, for example, for monolingual data, which we have a lot of (all en-xx language pairs).
Related to #231
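
A rough sketch of how the rebalancing step of such a clustering-based approach might look. It assumes the sentences have already been embedded and assigned to clusters (for example by k-means). This is a simplified illustration, not the paper's exact procedure and not code from this repo. The name `balanced_sample` and the `per_cluster` parameter are hypothetical.

```python
import random
from collections import defaultdict


def balanced_sample(cluster_ids, per_cluster, seed=0):
    """Return sorted indices of kept items, at most `per_cluster` per cluster.

    cluster_ids[i] is the (assumed precomputed) cluster of sentence i.
    """
    rng = random.Random(seed)

    # Group sentence indices by their cluster id.
    by_cluster = defaultdict(list)
    for idx, cid in enumerate(cluster_ids):
        by_cluster[cid].append(idx)

    kept = []
    for cid in sorted(by_cluster):
        members = by_cluster[cid]
        # Downsample only clusters that exceed the cap; small ones are kept whole.
        if len(members) > per_cluster:
            members = rng.sample(members, per_cluster)
        kept.extend(members)
    return sorted(kept)


# Toy example: cluster 0 is heavily over-represented.
cluster_ids = [0, 0, 0, 0, 0, 0, 1, 1, 2]
kept = balanced_sample(cluster_ids, per_cluster=2)
```

Capping each cluster keeps rare kinds of sentences intact while thinning out the dominant ones, which is the general intuition behind curating a large monolingual corpus this way.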