languagetool-org / languagetool

Style and Grammar Checker for 25+ Languages
https://languagetool.org
GNU Lesser General Public License v2.1

[en] LanguageTool prefers proper over common adjectives, even when meaning differs #7768

Open dgdguk opened 1 year ago

dgdguk commented 1 year ago

LanguageTool seems to have an issue where it prefers proper adjectives over common adjectives, even when the common adjective has a substantially different meaning to the proper adjective. According to LanguageTool:

  1. "A platonic relationship." is incorrect due to miscapitalising "platonic".
  2. "A Platonic relationship." is correct, but the meaning is different from the first case - and almost certainly not the intended meaning.

The issue appears to be confusion between the proper adjective and common adjective forms of "platonic". In this case, the proper adjective "Platonic" means pertaining to the Greek philosopher Plato. The common adjective "platonic" means "non-sexual". For fairly obvious reasons, outside of discussions about Greek philosophy, the common adjective form of "platonic" is by far the most common usage.

I'm not sure if this is a one-off exception in English or if there are other word pairs that this applies to, where the proper and common forms of the word have substantial enough differences in meaning.

I am also not sure what the expected result would be here, as it might not be possible for LanguageTool to differentiate between the two cases. However, I do think that LanguageTool should probably treat both cases equally, rather than preferring the proper adjective version, which appears to be the case now.

MikeUnwalla commented 1 year ago

It is a platonic relationship.

With the test sentence, I get this message:

Message: Possible incorrect capitalization. Some adjectives that are derived from proper nouns can have both an initial capital letter and an initial lower case letter. If there is a close association with the proper noun, use Platonic. (deactivate)
Correction: Platonic
Context: It is a platonic relationship.
More info: https://languagetool.org/insights/post/spelling-capital-letters/

LT does not 'prefer' a capitalized adjective. LT is not 'confused'. It tells you that a capitalized adjective is a possible option. Your job is to decide whether the capitalized adjective is better than a lower-case adjective.

If the rule is not useful, you can deselect it.

@languagetool-org/developers, possibly change the message from "Possible incorrect capitalization" to "Possible alternative capitalization".

tiff commented 1 year ago

I will add an antipattern for cases where "platonic" refers to a relationship/friendship/love, and I will change the message as you suggested @MikeUnwalla

Sources: https://www.merriam-webster.com/dictionary/platonic https://en.wiktionary.org/wiki/platonic#Adjective

dgdguk commented 1 year ago

@MikeUnwalla The "full" message isn't the entire user experience. I first encountered this issue with the LibreOffice extension, where the context message is given as "Capitalization: words from proper nouns", which both lacks the ambiguity of the "possible" phrasing in the full message and mistakenly claims that platonic is a word from a proper noun.

Further, the context menu for the explanation links to https://languagetool.org/insights/post/spelling-capital-letters/, which states both "Proper nouns are always capitalized" and "Keep in mind that proper adjectives should be capitalized, too." Combined with the claim in the context message that platonic comes from a proper noun, the advice is incorrect.

I also did some more testing, and proper adjectives which do not have a common form get flagged as a spelling error (e.g. Kafkaesque). In other words, it appears that if LanguageTool can determine that there is only a proper noun/adjective form, it's a spelling error. Otherwise, as in this case, common nouns/adjectives are flagged as "possible errors", and proper nouns/adjectives are not flagged at all. Hence my comment about LanguageTool being confused and preferring proper forms: this rule only seems to trigger when LanguageTool cannot determine whether the proper or common form should be used (confused), and it only flags common forms as being possibly incorrect (prefers proper forms). While I don't think that the confusion aspect can be addressed easily, if this rule isn't to be seen as preferring proper forms, it should flag both proper and common forms as being "possibly incorrect" or having "possible alternative capitalizations".

@tiff The antipattern you suggested doesn't really fix the problem. If you look at the example usages in the Merriam-Webster link, most of them would still trigger this, e.g. "platonic adoration", "platonic bromance", "mostly platonic", "platonic ideal".
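To make the coverage gap concrete, here is a minimal Python sketch. It is not LanguageTool's rule engine, only a toy model of it. It assumes the antipattern suppresses the hint only when "platonic" is immediately followed by relationship, friendship, or love; the `ANTIPATTERN_NOUNS` set and the `flags_platonic` helper are hypothetical names made up for illustration.

```python
import re

# Hypothetical stand-in for the proposed antipattern: the hint is suppressed
# only when lower-case "platonic" directly precedes one of these nouns.
ANTIPATTERN_NOUNS = {"relationship", "friendship", "love"}


def flags_platonic(phrase: str) -> bool:
    """Return True if a lower-case 'platonic' in the phrase would still be flagged."""
    words = re.findall(r"[A-Za-z]+", phrase)
    for i, word in enumerate(words):
        if word != "platonic":
            continue  # only the lower-case form is flagged by the rule as described
        next_word = words[i + 1].lower() if i + 1 < len(words) else ""
        if next_word not in ANTIPATTERN_NOUNS:
            return True
    return False


# Example usages quoted above, plus the one case the antipattern covers.
examples = [
    "platonic relationship",
    "platonic adoration",
    "platonic bromance",
    "mostly platonic",
    "platonic ideal",
]
still_flagged = [p for p in examples if flags_platonic(p)]
print(still_flagged)
```

Under these assumptions, only "platonic relationship" is suppressed; the other four usages would still receive the capitalization hint.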

The problem is that if LanguageTool cannot determine if a common or proper form is required, then the rule should flag both common and proper forms as having possible alternative capitalizations. This would obviously result in more false positives, however. Given that this particular rule seems like it only triggers when LanguageTool cannot determine what is correct (as spell checking detects words which do not have both common and proper forms), the rule may not be appropriate for the default rule set.
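As a rough illustration of the difference, the following Python sketch contrasts the behaviour I observed with the symmetric behaviour I am proposing. It is a toy model, not LanguageTool's implementation: the word sets are taken from examples in this thread, the function names are hypothetical, and it assumes that for proper-only words it is the lower-case spelling that gets reported as a spelling error.

```python
from typing import Optional

# Hypothetical word lists for illustration, based on examples in this thread;
# they are not taken from LanguageTool's data files.
DUAL_FORM = {"platonic", "herculean", "draconian"}  # proper and common forms exist
PROPER_ONLY = {"kafkaesque"}                        # assumed to have no common form


def observed_behaviour(word: str) -> Optional[str]:
    """Model of the behaviour described above: only lower-case forms are flagged."""
    if not word:
        return None
    lower = word.lower()
    if word[0].islower():
        if lower in PROPER_ONLY:
            return "spelling error"
        if lower in DUAL_FORM:
            return "possible incorrect capitalization"
    return None


def symmetric_behaviour(word: str) -> Optional[str]:
    """Proposed alternative: dual-form words are flagged in either capitalization."""
    if not word:
        return None
    lower = word.lower()
    if lower in PROPER_ONLY and word[0].islower():
        return "spelling error"
    if lower in DUAL_FORM:
        return "possible alternative capitalization"
    return None


for w in ["platonic", "Platonic", "kafkaesque", "Kafkaesque"]:
    print(w, "|", observed_behaviour(w), "|", symmetric_behaviour(w))
```

In this model, "Platonic" passes silently under the observed behaviour but receives the same neutral hint as "platonic" under the symmetric one, which is where the extra false positives would come from.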

dgdguk commented 1 year ago

Some additional information: I've now found some additional words which fit the same properties as platonic: words which have both a proper and common form. My testing (which may not be exhaustive, because I do not know what patterns/antipatterns get applied) seems to suggest that this rule is applied very inconsistently. Here is a list of words, the proper noun they derive from, and their behaviour with LanguageTool:

Words which have "possible incorrect capitalisation" when using lower case: platonic (Plato), herculean (Hercules), draconian (Draco)

Words which do not have "possible incorrect capitalisation" for either lower or upper case: gargantuan (Gargantua), titanic (Titan), caesarian (Caesar), spartan (Sparta).

The degree to which the common and proper forms of these words differ varies: herculean/Herculean has almost the same meaning, and it could be argued that it should be a proper adjective, for example. For the rest of them, however, this isn't really the case. Suggesting the capitalisation of draconian is particularly egregious; neither the Merriam-Webster nor the Cambridge dictionary defines "Draconian" as referring to the lawmaker Draco. The OED does, but with a usage example from 1877.

Hence I think there is a discussion to be had as to exactly what this rule is trying to achieve, and how it goes about it.

Sources: https://www.merriam-webster.com/dictionary/draconian https://dictionary.cambridge.org/dictionary/english/draconian https://www.oed.com/view/Entry/57378