aschaeffer / language-detection

Automatically exported from code.google.com/p/language-detection

Adding support for loading profiles from the langdetect jar-file #51

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
I finished the work on the jar variant and pushed it to:
https://kurz.alkacon@code.google.com/r/kurzalkacon-fromzip/

Changes I've made are:
1. Added a method DetectorFactory#loadProfiles()
2. Added an ant build.xml to the project's root

We would appreciate it if you could offer an official distribution with support 
for loading profiles from the langdetect jar-file. Please let us know if and 
when this would be possible.

Original issue reported on code.google.com by kurz.alk...@gmail.com on 12 Feb 2013 at 4:19

GoogleCodeExporter commented 9 years ago
Is this already implemented?

Original comment by sa...@muhive.com on 28 Nov 2013 at 6:46

GoogleCodeExporter commented 9 years ago
It would be great to see that feature, but AFAIK this patch has not been 
accepted so far.

To avoid duplicating code or placing a package named 
com.cybozu.labs.langdetect (needed to override package-private methods) 
within our core product, we currently use the binaries from the Maven 
repository, which contain all the language profiles, together with the 
following code for initialization:

    DetectorFactory.clear();
    DetectorFactory.setSeed(42L);
    DetectorFactory.loadProfile(loadProfiles(getAvailableLocales()));

    /**
     * Load the profiles from the classpath.<p>
     * 
     * @param locales the locales to initialize.<p>
     * 
     * @return a list of profiles
     * 
     * @throws Exception if something goes wrong
     */
    private List<String> loadProfiles(List<Locale> locales) throws Exception {

        List<String> profiles = new ArrayList<String>();
        for (Locale locale : locales) {
            String lang = locale.getLanguage();
            String profileFile = "profiles" + "/" + lang;
            InputStream is = getClass().getClassLoader().getResourceAsStream(profileFile);
            if (is == null) {
                // no profile bundled for this language, skip it
                continue;
            }
            try {
                String profile = IOUtils.toString(is, "UTF-8");
                if ((profile != null) && (profile.length() > 0)) {
                    profiles.add(profile);
                }
            } finally {
                // always release the stream, even if reading fails
                is.close();
            }
        }
        return profiles;
    }
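
For anyone who wants to try the same approach without commons-io, here is a minimal, self-contained sketch of the resource-reading part using only the JDK. The class name `ClasspathProfileReader` and the method `readProfiles` are hypothetical names chosen for illustration; they are not part of langdetect. The sketch assumes, as in the snippet above, that the profiles sit under `profiles/<language code>` on the classpath.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of langdetect.
public class ClasspathProfileReader {

    /**
     * Reads the resource "profiles/<lang>" for each locale through the given
     * class loader. Locales without a resource, or with an empty one, are skipped.
     */
    public static List<String> readProfiles(ClassLoader loader, List<Locale> locales) {
        List<String> profiles = new ArrayList<>();
        for (Locale locale : locales) {
            String resource = "profiles/" + locale.getLanguage();
            // try-with-resources tolerates a null resource and closes a non-null one
            try (InputStream in = loader.getResourceAsStream(resource)) {
                if (in == null) {
                    continue; // no profile bundled for this language
                }
                String json = readFully(in);
                if (!json.isEmpty()) {
                    profiles.add(json);
                }
            } catch (IOException e) {
                throw new UncheckedIOException("Cannot read " + resource, e);
            }
        }
        return profiles;
    }

    // Drains the stream and decodes it as UTF-8.
    private static String readFully(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

The returned list can then be passed to `DetectorFactory.loadProfile(...)` in the same way as in the initialization code above.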

Original comment by kurz.alk...@gmail.com on 28 Nov 2013 at 1:46