xjl219 / cld2

Automatically exported from code.google.com/p/cld2

Add possibility to set MinReliableKeepPercent #36

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Try to detect the language of the attached input file.
2. See that the output is "unknown".

What is the expected output? What do you see instead?
I would expect either 'Persian' or 'Arabic'.

What version of the product are you using? On what operating system?
rev195 on CentOS 7

Please provide any additional information below.

CLD2 returns "unknown" because the reliability of the detected language is
lower than kMinReliableKeepPercent (in compact_lang_det_impl.cc):
static const int kMinReliableKeepPercent = 41;  // Remove lang if reli < this

Would it be acceptable to add a parameter to the DetectLanguageXXX(...)
functions so that callers can set this threshold?
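
To make the request concrete, here is a minimal sketch of what a caller-supplied threshold could look like. It is not CLD2 code: Candidate, PickLanguage and the reliability numbers used with it are hypothetical, and only the default of 41 comes from the line quoted above.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for one detection candidate; not a CLD2 type.
struct Candidate {
  std::string language;
  int reliability_percent;  // 0..100
};

// Same default as the constant quoted from compact_lang_det_impl.cc.
static const int kDefaultMinReliableKeepPercent = 41;

// Hypothetical helper: returns the most reliable candidate whose reliability
// is at least min_reliable_keep_percent, or "unknown" if none qualifies.
std::string PickLanguage(
    const std::vector<Candidate>& candidates,
    int min_reliable_keep_percent = kDefaultMinReliableKeepPercent) {
  const Candidate* best = nullptr;
  for (const Candidate& c : candidates) {
    // Drop candidates below the threshold ("Remove lang if reli < this").
    if (c.reliability_percent < min_reliable_keep_percent) continue;
    if (best == nullptr ||
        c.reliability_percent > best->reliability_percent) {
      best = &c;
    }
  }
  return best != nullptr ? best->language : std::string("unknown");
}
```

With the default threshold, a candidate at, say, 35% is dropped and the result is "unknown". Passing a lower threshold such as 30 would keep it. Existing callers that omit the argument would see no change in behavior.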

Regards

Original issue reported on code.google.com by William....@gmail.com on 11 Jun 2015 at 5:07

Attachments:

GoogleCodeExporter commented 9 years ago
That's a good suggestion. I'd really like to see us consider an alternative 
scheme where we use the builder pattern to construct a settings/config object, 
so that we can keep the API as stable as possible while accommodating 
reasonable requests for behavioral changes like this.
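
As a rough illustration of that idea (a sketch only; none of these names exist in CLD2 today), a builder-constructed options object might look like this. DetectionOptions, DetectionOptionsBuilder and KeepLanguage are all hypothetical, and the default of 41 mirrors the current kMinReliableKeepPercent.

```cpp
#include <algorithm>

// Hypothetical immutable settings object; not part of the CLD2 API.
class DetectionOptions {
 public:
  int min_reliable_keep_percent() const { return min_reliable_keep_percent_; }

 private:
  friend class DetectionOptionsBuilder;
  explicit DetectionOptions(int min_reliable_keep_percent)
      : min_reliable_keep_percent_(min_reliable_keep_percent) {}

  int min_reliable_keep_percent_;
};

// Hypothetical builder: new settings become new setters, so existing
// detection entry points would not need extra parameters.
class DetectionOptionsBuilder {
 public:
  DetectionOptionsBuilder& SetMinReliableKeepPercent(int percent) {
    // Clamp to a valid percentage.
    min_reliable_keep_percent_ = std::max(0, std::min(100, percent));
    return *this;
  }

  DetectionOptions Build() const {
    return DetectionOptions(min_reliable_keep_percent_);
  }

 private:
  int min_reliable_keep_percent_ = 41;  // current hard-coded default
};

// Hypothetical check a detector could apply to each candidate language.
bool KeepLanguage(int reliability_percent, const DetectionOptions& options) {
  return reliability_percent >= options.min_reliable_keep_percent();
}
```

A caller would write something like DetectionOptionsBuilder().SetMinReliableKeepPercent(30).Build() and pass the result to the detector. A default-built object would reproduce today's behavior.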

Jason/Dick, what do you think?

Original comment by andrewha...@google.com on 11 Jun 2015 at 8:17