RangerWolf / language-detection

Automatically exported from code.google.com/p/language-detection

Long string without whitespaces will take forever to detect #6

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
If you split up the text before running detection, it works.

For example, something like:

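    public static String cleanText(String text) {
        // Break up words longer than 40 chars by inserting spaces.
        final int maxWordLength = 40;
        int length = text.length();
        StringBuilder sb = new StringBuilder(length + length / maxWordLength);
        int lastBreak = -1; // index of the last space or forced break
        for (int i = 0; i < length; i++) {
            char c = text.charAt(i);
            if (c == ' ') {
                lastBreak = i;
            }
            if (i - lastBreak > maxWordLength) {
                // 40 chars since the last break: start a new word at c
                lastBreak = i - 1;
                sb.append(' ');
            }
            sb.append(c);
        }
        return sb.toString();
    }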
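
The same idea can be sketched as a self-contained class that is easy to try out on its own. This is only an illustration of the workaround, not code from the library: the class name `LongTokenSplitter`, the method `splitLongTokens`, and the `maxRun` parameter are hypothetical, and unlike the snippet above it treats any whitespace character (tabs, newlines) as a word boundary, not just `' '`.

```java
// Hypothetical helper, not part of the language-detection library.
final class LongTokenSplitter {

    /**
     * Inserts a space into every run of more than maxRun non-whitespace
     * chars, so that no whitespace-free run in the result exceeds maxRun.
     */
    static String splitLongTokens(String text, int maxRun) {
        if (maxRun < 1) {
            throw new IllegalArgumentException("maxRun must be >= 1");
        }
        StringBuilder sb = new StringBuilder(text.length() + text.length() / maxRun);
        int run = 0; // length of the current whitespace-free run
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isWhitespace(c)) {
                run = 0; // natural word boundary
            } else {
                if (run == maxRun) {
                    sb.append(' '); // forced boundary before c
                    run = 0;
                }
                run++;
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Build a 100-char word followed by ordinary text.
        StringBuilder input = new StringBuilder();
        for (int i = 0; i < 100; i++) {
            input.append('a');
        }
        input.append("\tshort words stay untouched");
        System.out.println(splitLongTokens(input.toString(), 40));
    }
}
```

The cleaned string would then be passed to the detector in place of the raw text. Note that the sketch counts UTF-16 `char` values, so a forced break could fall between the two halves of a surrogate pair. If the input may contain such characters, the loop would need to iterate over code points instead.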

Original issue reported on code.google.com by tbr...@gmail.com on 4 Feb 2011 at 2:37

GoogleCodeExporter commented 9 years ago
I tried to reproduce your result but couldn't.
Could you share some test data that reproduces it?

Original comment by nakatani.shuyo on 7 Feb 2011 at 2:34

GoogleCodeExporter commented 9 years ago
See Issue 26, which reports the same problem and has been fixed.

Original comment by nakatani.shuyo on 18 Oct 2011 at 5:58