languagetool-org / languagetool

Style and Grammar Checker for 25+ Languages
https://languagetool.org
GNU Lesser General Public License v2.1

[de] German synthesizer problems #6981

Open tiff opened 1 year ago

tiff commented 1 year ago

When using XML rules, it is often not possible to transform a (hyphenated) compound noun into a different inflected form.

E.g., this specific rule, which is meant to transform the noun Diabetes-Zentrum into the genitive case, does not work (try it in the rule editor):

<rule>
    <pattern>
        <token>des</token>
        <token postag="SUB.*" postag_regexp="yes">Diabetes-Zentrum</token>
    </pattern>
    <message>Foobar</message>
    <suggestion>des <match no="2" postag_regexp="yes" postag="SUB(.+)NOM(.+)" postag_replace="SUB$1GEN$2" /></suggestion>
    <example correction="des Diabetes-Zentrums">Der Kühler <marker>des Diabetes-Zentrum</marker></example>
</rule>
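
For readers unfamiliar with `postag_replace`, here is a minimal Python sketch of only the tag-rewriting step that the `<match>` element above asks for. It is not LanguageTool code: the helper name `replace_postag` is hypothetical, and the tag `SUB:NOM:SIN:NEU` is an assumed example of what a nominative reading of the noun might look like. Python writes group references as `\1` where the rule writes `$1`.

```python
import re
from typing import Optional


def replace_postag(tag: str, pattern: str, replacement: str) -> Optional[str]:
    """Rewrite a POS tag if the pattern matches the whole tag, else return None.

    Hypothetical helper for illustration; not part of LanguageTool.
    """
    match = re.fullmatch(pattern, tag)
    if match is None:
        return None
    # Substitute the captured groups into the replacement template.
    return match.expand(replacement)


# Assumed tag for a nominative singular neuter noun.
new_tag = replace_postag("SUB:NOM:SIN:NEU", r"SUB(.+)NOM(.+)", r"SUB\1GEN\2")
print(new_tag)
```

The rewritten tag is only half of the job: the synthesizer still has to find a word form for the lemma under that tag, and that is the step that fails for Diabetes-Zentrum.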

But our German agreement rule (DE_AGREEMENT) is somehow capable of transforming between these forms: [screenshot, 2022-08-02]

I think solving this will increase the apply rate for high-impact rules like PRAEP_DAT, PRAEP_GEN, etc.
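
One possible approach, sketched in Python under loud assumptions: if the full compound is not found, inflect only the part after the last hyphen and reattach the prefix. This is not how LanguageTool or DE_AGREEMENT is known to work. `TOY_FORMS`, `synthesize`, and the tag strings are all hypothetical stand-ins for a real synthesizer dictionary.

```python
from typing import Optional

# Toy stand-in for a synthesizer dictionary: (lemma, POS tag) -> inflected form.
# Entries and tag strings are illustrative, not taken from LanguageTool's data.
TOY_FORMS = {
    ("Zentrum", "SUB:NOM:SIN:NEU"): "Zentrum",
    ("Zentrum", "SUB:GEN:SIN:NEU"): "Zentrums",
}


def synthesize(lemma: str, tag: str) -> Optional[str]:
    """Look up an inflected form, falling back to the last hyphenated part."""
    form = TOY_FORMS.get((lemma, tag))
    if form is not None:
        return form
    # Fallback: split at the last hyphen and inflect only the head noun.
    prefix, sep, head = lemma.rpartition("-")
    if sep:
        head_form = TOY_FORMS.get((head, tag))
        if head_form is not None:
            return prefix + sep + head_form
    return None


print(synthesize("Diabetes-Zentrum", "SUB:GEN:SIN:NEU"))
```

The design choice here is to try the exact lemma first, so a compound that does have its own dictionary entry is never overridden by the fallback.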

@danielnaber, maybe you can share some insights. This might also be something for @jaumeortola to look into.

danielnaber commented 1 year ago

Mostly solved for nouns. I will later try to also cover adjectives like "gelb-grün".