isahb / sls-spellchecker

A Serverless Spell Checker
MIT License

Internal server error with German language #6

Open goaround opened 4 years ago

goaround commented 4 years ago

First: Great idea to run LanguageTool serverless. I think it is a perfect and simple use case.

My problem: I tried to use German as described in the README. The build runs fine, but when I deploy the function, I just get an error as the response:

{"message": "Internal server error"}

You can see my changes here: https://github.com/goaround/sls-spellchecker/tree/german

I also tried the default AmericanEnglish. That works fine.

isahb commented 4 years ago

Hi @goaround, I think the problem is that the Lambda timed out on cold start. I increased the timeout to 1 minute, since the German jar is bigger and may take longer to cold-start.

goaround commented 4 years ago

The error comes back pretty fast.

I found an error in the logs:

java.lang.ExceptionInInitializerError: java.lang.ExceptionInInitializerError
java.lang.ExceptionInInitializerError
    at org.languagetool.language.German.getRelevantRules(German.java:135)
    at org.languagetool.language.NonSwissGerman.getRelevantRules(NonSwissGerman.java:38)
    at org.languagetool.JLanguageTool.getAllBuiltinRules(JLanguageTool.java:424)
    at org.languagetool.JLanguageTool.<init>(JLanguageTool.java:284)
    at org.languagetool.JLanguageTool.<init>(JLanguageTool.java:312)
    at org.languagetool.JLanguageTool.<init>(JLanguageTool.java:225)
    at com.isahb.slsspellchecker.Handler.<init>(Handler.java:37)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
Caused by: java.lang.IllegalArgumentException: 'de' is not a language code known to LanguageTool. Supported language codes are: en, en-AU, en-CA, en-GB, en-NZ, en-US, en-ZA. The list of languages is read from META-INF/org/languagetool/language-module.properties in the Java classpath. See http://wiki.languagetool.org/java-api for details.
    at org.languagetool.Languages.getLanguageForShortCode(Languages.java:218)
    at org.languagetool.Languages.getLanguageForShortCode(Languages.java:195)
    at org.languagetool.rules.de.SpellingData.<init>(SpellingData.java:39)
    at org.languagetool.rules.de.OldSpellingRule.<clinit>(OldSpellingRule.java:35)
    ... 11 more

isahb commented 4 years ago

Are you sure you're running mvn clean install before deployment, after adding language-de to pom.xml?

goaround commented 4 years ago

Yes, I run mvn clean install right before sls deploy --region eu-central-1

The sls-spellchecker-1.0.1.jar file is just 63.33 MB. Is this right? I thought it would be a few GB. Or is the language file downloaded on the fly?

isahb commented 4 years ago

It's 49 MB in my environment with language-de 5.0 on the classpath. That's all.

<dependency>
    <groupId>org.languagetool</groupId>
    <artifactId>language-de</artifactId>
    <version>5.0</version>
    <exclusions>
        <exclusion>
            <groupId>org.scala-lang</groupId>
            <artifactId>scala-compiler</artifactId>
        </exclusion>
        <exclusion>
            <groupId>org.scala-lang</groupId>
            <artifactId>scala-reflect</artifactId>
        </exclusion>
        <exclusion>
            <groupId>com.typesafe.akka</groupId>
            <artifactId>akka-actor</artifactId>
        </exclusion>
    </exclusions>
</dependency>
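
Since jar size alone does not show which language modules were actually packaged, a quick look inside the built artifact may help. The following is only a rough sketch: the default jar path assumes a standard Maven target/ directory, the class name JarPrefixCount is made up for illustration, and the two package prefixes are assumptions derived from the package names in the stack trace (org.languagetool.rules.de) and its English counterpart.

```java
import java.io.IOException;
import java.util.Enumeration;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

public class JarPrefixCount {
    // Counts the entries in a jar whose names start with the given prefix.
    static int countEntries(String jarPath, String prefix) throws IOException {
        int count = 0;
        try (JarFile jar = new JarFile(jarPath)) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                if (entries.nextElement().getName().startsWith(prefix)) {
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Assumed location of the built jar; pass another path as the first argument.
        String jarPath = args.length > 0 ? args[0] : "target/sls-spellchecker-1.0.1.jar";
        // Assumed package prefixes for the German and English rule classes.
        String[] prefixes = {"org/languagetool/rules/de/", "org/languagetool/rules/en/"};
        for (String prefix : prefixes) {
            System.out.println(prefix + " -> " + countEntries(jarPath, prefix) + " entries");
        }
    }
}
```

A count of zero for the German prefix would suggest the dependency never made it into the jar; non-zero counts for both prefixes would mean both language modules were packaged together.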

I don't get your error when deploying the German package. However, the API Gateway sometimes times out because its limit is 30 seconds. I need to think about reducing the package size, because cold starts with German sometimes take longer than 30 seconds!

goaround commented 4 years ago

OK, found the issue. I had simply added German like this:

<dependency>
    <groupId>org.languagetool</groupId>
    <artifactId>language-de</artifactId>
    <version>5.0</version>
</dependency>

If I replace language-en with language-de instead, it works fine. The issue seems to be having multiple language dependencies.
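
The exception message in the stack trace says the language list is read from META-INF/org/languagetool/language-module.properties on the classpath. One possible explanation, which is an assumption and not something confirmed in this thread, is that when several language modules are merged into a single jar, their copies of that file collide and only one survives. A minimal diagnostic sketch (the class name LanguageModuleCheck is made up for illustration) that lists every copy visible at runtime and prints its contents:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.List;

public class LanguageModuleCheck {
    // Resource path quoted in the IllegalArgumentException message.
    static final String MODULE_FILE = "META-INF/org/languagetool/language-module.properties";

    // Returns every copy of the given resource visible to this class's loader.
    static List<URL> findAll(String resourcePath) throws IOException {
        ClassLoader loader = LanguageModuleCheck.class.getClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        return Collections.list(loader.getResources(resourcePath));
    }

    public static void main(String[] args) throws IOException {
        List<URL> copies = findAll(MODULE_FILE);
        System.out.println("Copies found: " + copies.size());
        for (URL url : copies) {
            System.out.println("--- " + url);
            // Print the file so the registered language classes are visible.
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    System.out.println(line);
                }
            }
        }
    }
}
```

Run with the deployed jar on the classpath: if only one copy shows up and it mentions only English classes, that would be consistent with the "'de' is not a language code known to LanguageTool" error.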

A request via curl works fine now, but if I add the URL to my LanguageTool browser extension I get more errors :/ #1, code=undefined 🙈

isahb commented 4 years ago

The API is not compatible with LanguageTool's, I guess; it needs some work. You can follow this issue; I'll make it compatible in the next few days. (BTW, privacy is a great use case. I originally implemented this for a different purpose, so thanks for giving me the idea :) )

goaround commented 4 years ago

I suspected as much. Yes, LanguageTool is awesome. My whole team uses it via the browser extension, and I currently host the LanguageTool server via Docker: https://github.com/silvio/docker-languagetool

But the browser extension makes a lot of requests, and CPU usage regularly hits 100%. Going serverless would be a great solution.