languagetool-org / languagetool

Style and Grammar Checker for 25+ Languages
https://languagetool.org
GNU Lesser General Public License v2.1

[de] remove obsolete dative forms #5830

Open udomai opened 3 years ago

udomai commented 3 years ago

We have 2,996 obsolete forms tagged SUB:DAT:SIN:MAS in the dictionary. They're a problem especially when we want to synthesize the dative form, because we then have to suggest the antiquated form as well.

Ich bin einem geistreichen Witze nicht abgeneigt. ("I am not averse to a witty joke"; "Witze" is the obsolete dative singular of "Witz".)

starkeSUBSINMAS.txt

I'll test this in a branch to see how many rules will have to be changed.

udomai commented 3 years ago

Only two small things had to be fixed. I'm running this change on the test server to see its effect on the diff.

udomai commented 3 years ago

In Premium, there were only 24 problems. The premium test branch is here: https://github.com/languagetooler-gmbh/languagetool-premium/commit/74299403a7ff6e74eee6c61d3aac021f4aa5760b

Testing this, too.

udomai commented 3 years ago

The above PR is missing one thing: adding words to spelling.txt that will no longer be in the dictionary after this (e.g. "Manne"). Is there a smart way of checking for which of the new entries in removed.txt this is necessary?

I think I have a solution. Trying now.

udomai commented 3 years ago

I wrote a Python script that counts how many times each entry in the above list occurs in the German dictionary. I collected the forms that had only one entry (i.e. the one we removed) into a list that I can now put into spelling.txt.

old-dat-forms-for-spelling.txt
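For anyone who wants to repeat the counting step, here is a minimal sketch of the idea. It is not the script that was actually used. It assumes the dictionary is available as a plain-text export with one tab-separated `form<TAB>lemma<TAB>tag` reading per line, and the function name `forms_needing_spelling_entry` is made up for illustration.

```python
from collections import Counter


def forms_needing_spelling_entry(removed_forms, dictionary_lines):
    """Return the removed forms that have exactly one reading in the dictionary.

    Assumption: each dictionary line is "form<TAB>lemma<TAB>tag".
    A form with a single reading disappears from the dictionary entirely
    once that reading is removed, so it has to go into spelling.txt.
    """
    counts = Counter()
    for line in dictionary_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        form = line.split("\t", 1)[0]
        counts[form] += 1
    return [form for form in removed_forms if counts[form] == 1]


# Tiny hand-made example (not real dictionary data)
dictionary = [
    "Manne\tMann\tSUB:DAT:SIN:MAS",
    "Witze\tWitz\tSUB:DAT:SIN:MAS",
    "Witze\tWitz\tSUB:NOM:PLU:MAS",
    "Witze\tWitz\tSUB:AKK:PLU:MAS",
]
removed = ["Manne", "Witze"]
result = forms_needing_spelling_entry(removed, dictionary)
print(result)
```

In this toy data "Manne" has only the dative reading, so it would vanish from the dictionary and needs a spelling.txt entry, while "Witze" stays known through its plural readings.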

All of this helps improve suggestions in rules like DE_AGREEMENT, see https://github.com/languagetool-org/languagetool/issues/5470