Open hattewarsm opened 5 years ago
@hattewarsm are you referring to something like:
List<LanguageProfile> languageProfiles = new LanguageProfileReader().readAllBuiltIn();
LanguageDetector detector = LanguageDetectorBuilder.create(NgramExtractors.standard())
        .withProfiles(languageProfiles)
        .build();
Optional<LdLocale> detected = detector.detect("コンコルド001試作機は1969年3月2日にトゥールーズで初飛行した");
and detected has the value Optional.absent()?
I tested a few more examples:
hello -> absent
hello world, how are you doing? -> absent
hello world, how are you doing? This string is obviously English! -> Optional.of(en)
This detector requires the most confident detected language to have a confidence of >= 0.9999, which does seem rather high. Anything below that threshold returns Optional.absent().
You may be better off using detector.getProbabilities and taking the most confident language (.get(0), since the results are sorted).
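To make the difference between the two approaches concrete, here is a minimal, library-free sketch. Candidate and TopLanguage are hypothetical stand-ins that I made up for illustration; they are not the library's own types, and the sketch uses java.util.Optional rather than the Optional that detect returns. It assumes the list is already sorted by descending probability, as described above.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical stand-in for one entry of a sorted probability list
// (NOT the library's own class).
final class Candidate {
    final String language;
    final double probability;

    Candidate(String language, double probability) {
        this.language = language;
        this.probability = probability;
    }
}

// Hypothetical helper contrasting the strict and the lenient strategy.
final class TopLanguage {
    // Strict: accept the best guess only if it reaches minConfidence,
    // which models the "absent below 0.9999" behaviour described above.
    static Optional<String> strict(List<Candidate> ranked, double minConfidence) {
        if (ranked.isEmpty() || ranked.get(0).probability < minConfidence) {
            return Optional.empty();
        }
        return Optional.of(ranked.get(0).language);
    }

    // Lenient: take the first entry regardless of its confidence.
    // Assumes the list is sorted by descending probability.
    static Optional<String> best(List<Candidate> ranked) {
        return ranked.isEmpty()
                ? Optional.empty()
                : Optional.of(ranked.get(0).language);
    }
}
```

With a list such as (en, 0.71), (nl, 0.29), strict(ranked, 0.9999) comes back empty while best(ranked) yields en. In real code the list would come from detector.getProbabilities, and you may still want your own (lower) cut-off so that very short inputs like "hello" are not over-interpreted.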
If this isn't the case, I think you'd have to give more information for the ticket not to be rejected.
You might want to give some more info :smiley: