I have a data set of (very) short texts in one of the main three Swiss languages (German, French and Italian). I already removed all profiles apart from those three from langdetect/profiles. Still I get this:
from langdetect import detect_langs
detect_langs("Motorrad") # should return 'de'
[it:0.9999975589209744]
I am trying to fix this by adding a prior map that resembles the Swiss population: prior_map = {'de':0.65, 'fr':0.25, 'it':0.1}, which would skew the behavior in the right direction.
I saw that the detector class has a set_prior_map method, but I can neither instantiate it nor set the prior for all detectors. Any ideas?
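To make the intended skew concrete, here is a minimal sketch of reweighting per-language scores by a prior and renormalising (plain Bayes' rule). It does not use langdetect's internals: `apply_prior` is a hypothetical helper, and the `scores` values are made up for illustration, not real detector output.

```python
def apply_prior(scores, prior):
    """Multiply each language score by its prior weight and renormalise to sum to 1."""
    # Languages missing from `scores` are treated as having zero score.
    weighted = {lang: scores.get(lang, 0.0) * p for lang, p in prior.items()}
    total = sum(weighted.values())
    if total <= 0.0:
        raise ValueError("no language has non-zero mass in both scores and prior")
    return {lang: w / total for lang, w in weighted.items()}


prior_map = {'de': 0.65, 'fr': 0.25, 'it': 0.1}

# Hypothetical scores for an ambiguous short text (illustrative values only).
scores = {'de': 0.40, 'fr': 0.05, 'it': 0.55}

posterior = apply_prior(scores, prior_map)
best = max(posterior, key=posterior.get)  # language with the highest reweighted score
```

With these made-up scores the prior moves the top language from 'it' to 'de'. Under this simple reweighting a prior only flips close calls: 'de' wins over 'it' only if its score is more than 0.1/0.65 (about 15%) of the 'it' score, so a result as lopsided as the one for "Motorrad" would stay 'it'.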