optimaize / language-detector

Language Detection Library for Java
Apache License 2.0

Re-add "seed" parameter #14

Closed: fabiankessler closed this issue 9 years ago

fabiankessler commented 10 years ago

I recently removed the seed parameter because I did not understand its use. Re-add it, named "consistent-results" or similar.

This needs to go into the LanguageDetector, and the CommandLineInterface is affected too. See git history.

danielnaber commented 9 years ago

A constant seed helps (at least) in the case where you feed an unsupported language into the detector. Without that seed, the result is different on every run, which is quite confusing.

danielnaber commented 9 years ago

Is there a plan to have a new release with this issue fixed anytime soon? We rely on language-detector and we're running into problems if this isn't fixed (our next release is planned for end of March).

fabiankessler commented 9 years ago

We're looking at this right now and have come up with a suggestion. We believe that everyone wants consistent results, and that not everyone should need to learn what a seed is.

Current situation: a non-seeded randomizer is used, resulting in inconsistent results.

Suggested solution:

Anyone who still wants the current behavior would have to pass in a randomized seed.
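To illustrate the difference between a seeded and a non-seeded randomizer, here is a minimal sketch using only `java.util.Random`. It is not the library's code: the class `SeedSketch` and the methods `sample` and `sampleWithSeed` are hypothetical names standing in for whatever randomized step the detector performs.

```java
import java.util.Arrays;
import java.util.Random;

public class SeedSketch {
    // Hypothetical stand-in for a detector's randomized sampling step.
    static int[] sample(Random random, int count) {
        int[] picks = new int[count];
        for (int i = 0; i < count; i++) {
            picks[i] = random.nextInt(1000);
        }
        return picks;
    }

    // A fixed seed makes the sequence of picks reproducible.
    static int[] sampleWithSeed(long seed, int count) {
        return sample(new Random(seed), count);
    }

    public static void main(String[] args) {
        int[] first = sampleWithSeed(42L, 5);
        int[] second = sampleWithSeed(42L, 5);
        // Same seed, same sequence: prints true.
        System.out.println(Arrays.equals(first, second));

        // A non-seeded Random picks its own seed, so this output
        // will usually differ from one run to the next.
        int[] unseeded = sample(new Random(), 5);
        System.out.println(Arrays.toString(unseeded));
    }
}
```

With a constant seed, two runs over the same input go through the same random choices, which is what gives consistent results. Passing in a seed that itself varies (for example one derived from the clock) brings back the run-to-run variation.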

fabiankessler commented 9 years ago

It was implemented as described on 2015-01-30. No feedback was given, so I'm closing this task. This change will be part of the 0.5 release.