StephenNi / language-detection

Automatically exported from code.google.com/p/language-detection

Same strings concatenated in a different order give different language probabilities #79

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
What steps will reproduce the problem?
1. Downloaded the library today and used DetectorFactory to load the default profiles.
2. Ran the following code:

  LangDetectSample sample = new LangDetectSample();
  sample.init("xxx/profiles");

  // eng1 is English text; eng4, despite its name, is Spanish text.
  String eng1 = "This is a sample Text we want to test on the ers";
  String eng4 = "Este es un texto de ejemplo que queremos poner a prueba";

  // Same two strings, concatenated in opposite orders (no separator between them).
  System.out.println(sample.detectLangs(eng1 + eng4).toString());
  System.out.println(sample.detectLangs(eng4 + eng1).toString());

The only difference between the two System.out.println lines is the order in which the two strings are concatenated.

What is the expected output? What do you see instead?

I expected to see at least similar output from both System.out.println statements,
but the following output was displayed:

[es:0.5714273430650357, en:0.4285698135174021]
[en:0.5714273982695499, es:0.2857137375174477, fr:0.14285820285579312]
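
To put a number on "similar", the two printed results above can be parsed and compared language by language. The following is only a minimal sketch: the class LangResultDiff and its methods are hypothetical helpers written for this report, not part of the language-detection library, and the parser assumes the "[lang:prob, lang:prob]" format shown in the output above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of the language-detection library.
class LangResultDiff {

    // Parses a printed result such as "[es:0.57, en:0.43]" into language -> probability.
    static Map<String, Double> parse(String printed) {
        Map<String, Double> result = new LinkedHashMap<>();
        String body = printed.trim();
        body = body.substring(1, body.length() - 1); // drop the surrounding brackets
        if (body.isEmpty()) {
            return result;
        }
        for (String entry : body.split(",\\s*")) {
            String[] parts = entry.split(":");
            result.put(parts[0].trim(), Double.parseDouble(parts[1].trim()));
        }
        return result;
    }

    // Largest absolute per-language difference; a language missing on one side counts as 0.
    static double maxDifference(Map<String, Double> a, Map<String, Double> b) {
        Set<String> langs = new TreeSet<>(a.keySet());
        langs.addAll(b.keySet());
        double max = 0.0;
        for (String lang : langs) {
            double diff = Math.abs(a.getOrDefault(lang, 0.0) - b.getOrDefault(lang, 0.0));
            max = Math.max(max, diff);
        }
        return max;
    }

    public static void main(String[] args) {
        Map<String, Double> first =
            parse("[es:0.5714273430650357, en:0.4285698135174021]");
        Map<String, Double> second =
            parse("[en:0.5714273982695499, es:0.2857137375174477, fr:0.14285820285579312]");
        System.out.println("max difference: " + maxDifference(first, second));
    }
}
```

For the two results above, the largest gap is for "es", roughly 0.29, so the two runs differ by far more than rounding noise.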

Why does "fr" appear simply from reversing the order of concatenation?
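
One detail that may help whoever looks into this: every probability printed above is very close to a multiple of 1/7 (about 4/7, 3/7, 2/7 and 1/7). The sketch below only checks that arithmetic. The class SeventhsCheck is a hypothetical helper, not library code. One possible reading is that the detector averages over a small number of randomized trials, but that is an assumption, and nothing in this report confirms it.

```java
// Hypothetical helper, not part of the language-detection library.
class SeventhsCheck {

    // Numerator k of the fraction k/trials that lies closest to p.
    static long nearestNumerator(double p, int trials) {
        return Math.round(p * trials);
    }

    // Absolute distance from p to the nearest multiple of 1/trials.
    static double distanceToNearestFraction(double p, int trials) {
        double scaled = p * trials;
        return Math.abs(scaled - Math.rint(scaled)) / trials;
    }

    public static void main(String[] args) {
        // Probabilities copied from the two outputs in this report.
        double[] reported = {
            0.5714273430650357, 0.4285698135174021,
            0.5714273982695499, 0.2857137375174477, 0.14285820285579312
        };
        for (double p : reported) {
            System.out.println(p + " is closest to " + nearestNumerator(p, 7)
                + "/7, off by " + distanceToNearestFraction(p, 7));
        }
    }
}
```

Each value is within about 2e-6 of its nearest seventh, which looks like counts out of seven rather than a smooth probability estimate.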

What version of the product are you using? On what operating system?
Mac OS

Please provide any additional information below.

Original issue reported on code.google.com by srirama....@gmail.com on 11 Aug 2015 at 9:45