Hello,
I'm currently discovering and testing your API and it's great!
However, I'm a bit surprised by the results I get. I noticed that for a text
in two languages, if I reorder the sentences I get different scores, even
though every word and sentence is the same and only their order differs.
Here are the texts I tested and the results I got:

Wie geht es Ihnen? Es geht mir gut, Danke! Bonjour, le soleil brille et les oiseaux chantent.
fr:0.5714289303903483
de:0.42856987956922343

Wie geht es Ihnen? Bonjour, le soleil brille et les oiseaux chantent. Es geht mir gut, Danke!
fr:0.8571417085595199
de:0.14285781020198818

Bonjour, le soleil brille et les oiseaux chantent. Wie geht es Ihnen? Es geht mir gut, Danke!
fr:0.5714280769513275
de:0.4285703999762243

Could you explain what causes these inconsistencies? If the detection is based
only on the frequency of 1-, 2- and 3-letter patterns, then I don't understand
this behaviour.
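
To illustrate why I expected near-identical scores, here is a minimal Python sketch. It is my own illustration and not the library's code: the helper name `char_ngrams` is hypothetical, and it assumes that detection looks only at raw counts of 1- to 3-character patterns, with no text normalization or any other processing the library may apply. It counts the patterns in each of the three orderings and reports how many differ from the first ordering.

```python
from collections import Counter


def char_ngrams(text, max_n=3):
    """Count every character n-gram of length 1..max_n in text.

    Hypothetical helper for illustration only; it does not reproduce
    whatever preprocessing the library performs.
    """
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(text) - n + 1):
            counts[text[i:i + n]] += 1
    return counts


german_1 = "Wie geht es Ihnen?"
german_2 = "Es geht mir gut, Danke!"
french = "Bonjour, le soleil brille et les oiseaux chantent."

# The three orderings from the report, joined with single spaces.
orderings = {
    "de de fr": " ".join([german_1, german_2, french]),
    "de fr de": " ".join([german_1, french, german_2]),
    "fr de de": " ".join([french, german_1, german_2]),
}

reference = char_ngrams(orderings["de de fr"])
for name, text in orderings.items():
    counts = char_ngrams(text)
    # Size of the multiset symmetric difference: patterns that occur
    # more often in one ordering than in the other.
    differing = sum(((counts - reference) + (reference - counts)).values())
    total = sum(counts.values())
    print(f"{name}: {total} patterns, {differing} differ from 'de de fr'")
```

Under these assumptions, only the few patterns that span a sentence boundary can change between orderings, so I would not expect the French score to move from about 0.57 to about 0.86.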
My version is langdetect-09-13-2011.zip and my operating system is Windows 7
64-bit.
Thanks a lot for your explanation.
Original issue reported on code.google.com by iferra...@gmail.com on 30 Oct 2013 at 7:40