bitextor / bicleaner

Bicleaner is a parallel corpus classifier/cleaner that aims to detect noisy sentence pairs in a parallel corpus.
GNU General Public License v3.0

Litetrain #12

Closed: mbanon closed this 5 years ago

mbanon commented 5 years ago

Integrating litetrain (lite training files) into master (generic tokenizer by lpla)