Closed tiff closed 1 week ago
The pull request introduces a significant expansion of LanguageTool's vocabulary across multiple language modules. Key changes include the addition of numerous proper nouns to spelling_global.txt, updates to ignore lists and spelling dictionaries in both German and English, and the introduction of new rules in the disambiguation XML files for Spanish and French. These modifications aim to enhance the tool's ability to recognize and process a broader range of terms and linguistic constructs without altering existing structures.
File Path | Change Summary |
---|---|
languagetool-core/src/main/resources/org/languagetool/resource/spelling_global.txt | Added a large number of proper nouns related to various topics, including many Olympic entries. |
languagetool-language-modules/de/src/main/resources/org/languagetool/resource/de/hunspell/ignore.txt | Added two new entries: IZF and Martijn/S #name. |
languagetool-language-modules/de/src/main/resources/org/languagetool/resource/de/hunspell/spelling.txt | Added multiple new compound words and variations related to finance and other terms. |
languagetool-language-modules/de/src/main/resources/org/languagetool/resource/de/multitoken-suggest.txt | Added new multi-token suggestions: Fair Market Value/S, Bami Goreng/S, Sambal Ulek/S, Sambal Oelek/S. |
languagetool-language-modules/de/src/main/resources/org/languagetool/rules/de/remote-rule-filters.xml | Introduced a new rule for handling greetings followed by a comma. |
languagetool-language-modules/en/src/main/resources/org/languagetool/resource/en/hunspell/ignore.txt | Added numerous entries, including MBR, QBR, navbar, and Martijn. |
languagetool-language-modules/en/src/main/resources/org/languagetool/resource/en/hunspell/spelling.txt | Added new words and phrases, including outgroup and ingroup, and various scientific terms. |
languagetool-language-modules/en/src/main/resources/org/languagetool/resource/en/multiwords.txt | Added various multi-word terms with corresponding part-of-speech tags, including sambal ulek. |
languagetool-language-modules/en/src/main/resources/org/languagetool/rules/en/grammar.xml | Modified the TALK_AT_TO rule, adding antipatterns and changing its default state to "off". |
languagetool-language-modules/es/src/main/resources/org/languagetool/resource/es/disambiguation.xml | Added multiple new rules and modified existing ones to improve disambiguation in Spanish. |
languagetool-language-modules/fr/src/main/resources/org/languagetool/resource/fr/disambiguation.xml | Added new rules for handling nominal groups and ambiguous cases, and removed some unnecessary rules. |
languagetool-language-modules/fr/src/main/resources/org/languagetool/rules/fr/remote-rule-filters.xml | Introduced a new rule group for unnecessary punctuation handling. |
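
Several entries in the table carry a `/S` suffix or a trailing `#name` comment (for example `Martijn/S #name` and `Fair Market Value/S`). As a rough illustration of how such lines might be read, the Python sketch below expands one entry into the forms it would cover. It assumes that `/S` means "also accept the form with a trailing s" and that text after `#` is a comment; neither assumption is confirmed by this summary, and the function name `expand_entry` is hypothetical rather than part of LanguageTool.

```python
def expand_entry(line: str) -> list[str]:
    """Expand one word-list line into the surface forms it is assumed to cover."""
    # Drop a trailing comment such as "#name" and surrounding whitespace.
    entry = line.split("#", 1)[0].strip()
    if not entry:
        return []
    # Split off an optional flag after the last slash, e.g. "Bami Goreng/S".
    word, sep, flag = entry.rpartition("/")
    if not sep:
        # No flag present: the entry stands for itself only.
        return [entry]
    if flag == "S":
        # Assumption: the S flag additionally allows the form with a trailing "s".
        return [word, word + "s"]
    # Unknown flag: keep only the base form rather than guess its meaning.
    return [word]


if __name__ == "__main__":
    for raw in ["IZF", "Martijn/S #name", "Fair Market Value/S"]:
        print(raw, "->", expand_entry(raw))
```

Unknown flags fall back to the base form so that the sketch never invents word forms the list did not ask for.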
spelling_global.txt, similar to the main PR's focus on expanding proper nouns in the same file.
spelling_global.txt, enhancing the vocabulary for spell-checking.
ignore.txt, which may relate to the overall expansion of recognized terms in the language tool, aligning with the main PR's goal of enhancing the knowledge base.
Summary by CodeRabbit
Release Notes
New Features
Bug Fixes
These enhancements aim to improve the overall user experience by providing more accurate language processing and spell-checking capabilities.