ghpaetzold / questplusplus

Pipelined quality estimation.

Can Chinese language use tokeniser.perl for tokenising? #50

Open Shireen35 opened 3 years ago

Shireen35 commented 3 years ago

What changes should be made in the config file for Chinese data, and how do we generate a truecase model for Chinese data? Do we use the same method that we use for other languages, or some other way? Please help!

lspecia commented 3 years ago

You can generate truecase files using the WMT scripts, but I'm not sure it makes sense for Chinese...