HarikalarKutusu / cv-tbox-split-maker

Checks diversity in Mozilla Common Voice default or alternative splits for multiple versions and languages
Mozilla Public License 2.0

[FR] Add one more splitting algorithm for testing #1

Closed HarikalarKutusu closed 9 months ago

HarikalarKutusu commented 1 year ago

nv: seNtences-first with unique Voices

Most probably there will be cases where this algorithm fails to use the whole dataset, because it tries to enforce both sentence and voice diversity.
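To illustrate why such a split might leave recordings unused, here is a minimal sketch of one possible reading of the idea. It is not code from this repository: the function name `split_sentences_first`, the `(sentence, voice)` record shape, and the round-robin assignment of sentences to splits are all assumptions made for illustration. Each sentence is assigned to one split, each voice is bound to the first split it appears in, and any recording whose voice already belongs to a different split is dropped.

```python
from collections import defaultdict


def split_sentences_first(records, splits=("train", "dev", "test")):
    """Hypothetical sketch: assign sentences to splits first, then keep
    only recordings whose voice does not already belong to another split.

    records: iterable of (sentence, voice) pairs (assumed shape).
    Returns (result, dropped), where result maps split name to records.
    """
    # Group voices by sentence, preserving first-seen sentence order.
    by_sentence = defaultdict(list)
    for sentence, voice in records:
        by_sentence[sentence].append(voice)

    voice_split = {}                       # voice -> split it is bound to
    result = {name: [] for name in splits}
    dropped = []                           # recordings that fit nowhere

    for i, (sentence, voices) in enumerate(by_sentence.items()):
        # Round-robin target is an assumption, not the project's rule.
        target = splits[i % len(splits)]
        for voice in voices:
            # Bind the voice to this split if it is not bound yet.
            owner = voice_split.setdefault(voice, target)
            if owner == target:
                result[target].append((sentence, voice))
            else:
                # Keeping it would put one voice in two splits.
                dropped.append((sentence, voice))
    return result, dropped
```

With only two voices that each read several sentences, for example `[("s1", "a"), ("s2", "a"), ("s3", "b"), ("s1", "b")]`, both voices are bound to the first split, so the recordings of the remaining sentences end up in `dropped`. That is the kind of data loss described above.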

HarikalarKutusu commented 9 months ago

We added two other algorithms (vw & vx), but will not add the one above.