languagetool-org / languagetool

Style and Grammar Checker for 25+ Languages
https://languagetool.org
GNU Lesser General Public License v2.1

[pt-br] user feedback "só que" #3004

Closed tiff closed 4 years ago

tiff commented 4 years ago

The app flags "só que" as a spoken expression and asks us to review it. What is missing is an explanation, with the suggestions showing the possible replacements and the reason for them (colloquial use of: porém, mas, entretanto).

marcoagpinto commented 4 years ago

@tiff

I was trying to fix it.

My solution works well in the stand-alone tool, but TESTRULES PT throws many errors:

    <rule>
      <pattern>
          <token inflected='yes'>só</token>
          <token>que</token>
      </pattern>
      <message>Esta é uma expressão oral. Reveja.</message>
      <url>https://pt.wiktionary.org/wiki/só_que</url>
      <short>Coloquialismo</short>
      <suggestion>contudo</suggestion> <!-- Fixed by MARCOAGPINTO - 2020-06-02 -->
      <suggestion>embora</suggestion> <!-- Fixed by MARCOAGPINTO - 2020-06-02 -->
      <suggestion>entretanto</suggestion> <!-- Fixed by MARCOAGPINTO - 2020-06-02 -->
      <suggestion>mas</suggestion> <!-- Fixed by MARCOAGPINTO - 2020-06-02 -->
      <suggestion>porém</suggestion> <!-- Fixed by MARCOAGPINTO - 2020-06-02 -->
      <suggestion>todavia</suggestion> <!-- Fixed by MARCOAGPINTO - 2020-06-02 -->
      <example correction='contudo|embora|entretanto|mas|porém|todavia'>Queria comprar a casa, <marker>só que</marker> não é possível.</example> <!-- Fixed by MARCOAGPINTO - 2020-06-02 -->
    </rule>

marcoagpinto commented 4 years ago

Do you know what is wrong with my code?

Thanks!

tiff commented 4 years ago

What does the error message say? Maybe the order of the XML tags is wrong. I think the <url> has to come after the suggestions.

marcoagpinto commented 4 years ago

Running XML pattern tests...
Known languages: [Arabic, Arabic (Algeria), English, English (US), English (GB), English (Australian), English (Canadian), English (New Zealand), English (South African), Persian, French, German, German (Germany), German (Austria), German (Swiss), Simple German, Polish, Catalan, Catalan (Valencian), Italian, Breton, Dutch, Portuguese, Portuguese (Portugal), Portuguese (Brazil), Portuguese (Angola preAO), Portuguese (Moçambique preAO), Russian, Asturian, Belarusian, Chinese, Danish, Esperanto, Irish, Galician, Greek, Japanese, Khmer, Romanian, Slovak, Slovenian, Spanish, Swedish, Tamil, Tagalog, Ukrainian, Testlanguage]
Running XML validation for pt/grammar.xml...
cvc-complex-type.2.4.a: Invalid content was found starting with element 'suggestion'. One of '{example}' is expected. Problem found at line 30749, column 16.
Exception in thread "main" java.io.IOException: Cannot load or parse '/org/languagetool/rules/pt/grammar.xml'
        at org.languagetool.XMLValidator.validateWithXmlSchema(XMLValidator.java:109)
        at org.languagetool.rules.patterns.PatternRuleTest.validatePatternFile(PatternRuleTest.java:200)
        at org.languagetool.rules.patterns.PatternRuleTest.validatePatternFile(PatternRuleTest.java:176)
        at org.languagetool.rules.patterns.PatternRuleTest.runTestForLanguage(PatternRuleTest.java:157)
        at org.languagetool.rules.patterns.PatternRuleTest.runGrammarRulesFromXmlTestIgnoringLanguages(PatternRuleTest.java:152)
        at org.languagetool.rules.patterns.PatternRuleTest.main(PatternRuleTest.java:683)
Caused by: org.xml.sax.SAXParseException; lineNumber: 30749; columnNumber: 16; cvc-complex-type.2.4.a: Invalid content was found starting with element 'suggestion'. One of '{example}' is expected.
        at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source)
        at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.error(Unknown Source)
        at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(Unknown Source)
        at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(Unknown Source)
        at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(Unknown Source)
        at com.sun.org.apache.xerces.internal.impl.xs.XMLSchemaValidator$XSIErrorReporter.reportError(Unknown Source)
        at com.sun.org.apache.xerces.internal.impl.xs.XMLSchemaValidator.reportSchemaError(Unknown Source)
        at com.sun.org.apache.xerces.internal.impl.xs.XMLSchemaValidator.handleStartElement(Unknown Source)
        at com.sun.org.apache.xerces.internal.impl.xs.XMLSchemaValidator.startElement(Unknown Source)
        at com.sun.org.apache.xerces.internal.impl.dtd.XMLDTDValidator.startElement(Unknown Source)
        at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartElement(Unknown Source)
        at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(Unknown Source)
        at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown Source)
        at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(Unknown Source)
        at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
        at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
        at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
        at com.sun.org.apache.xerces.internal.jaxp.validation.StreamValidatorHelper.validate(Unknown Source)
        at com.sun.org.apache.xerces.internal.jaxp.validation.ValidatorImpl.validate(Unknown Source)
        at javax.xml.validation.Validator.validate(Unknown Source)
        at org.languagetool.XMLValidator.validateInternal(XMLValidator.java:204)
        at org.languagetool.XMLValidator.validateWithXmlSchema(XMLValidator.java:107)
        ... 5 more
Running disambiguator rule tests...
Running disambiguation tests for Portuguese...
371 rules tested (274ms)
Tests successful.
Running XML bitext pattern tests...
Tests successful.
Validating false-friends.xml...
Validation successfully finished.

tiff commented 4 years ago

Basically this is what it says:

Invalid content was found starting with element 'suggestion'. One of '{example}' is expected.

This means the order of XML tags is incorrect. Try:

    <rule>
      <pattern>
          <token inflected='yes'>só</token>
          <token>que</token>
      </pattern>
      <message>Esta é uma expressão oral. Reveja.</message>
      <suggestion>contudo</suggestion> <!-- Fixed by MARCOAGPINTO - 2020-06-02 -->
      <suggestion>embora</suggestion> <!-- Fixed by MARCOAGPINTO - 2020-06-02 -->
      <suggestion>entretanto</suggestion> <!-- Fixed by MARCOAGPINTO - 2020-06-02 -->
      <suggestion>mas</suggestion> <!-- Fixed by MARCOAGPINTO - 2020-06-02 -->
      <suggestion>porém</suggestion> <!-- Fixed by MARCOAGPINTO - 2020-06-02 -->
      <suggestion>todavia</suggestion> <!-- Fixed by MARCOAGPINTO - 2020-06-02 -->
      <url>https://pt.wiktionary.org/wiki/só_que</url>
      <short>Coloquialismo</short>
      <example correction='contudo|embora|entretanto|mas|porém|todavia'>Queria comprar a casa, <marker>só que</marker> não é possível.</example> <!-- Fixed by MARCOAGPINTO - 2020-06-02 -->
    </rule>
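
The ordering problem can be sketched as a small stand-alone check. This is only an illustration, not how LanguageTool validates rules: the order in `ASSUMED_ORDER` is inferred from the validator message and the fix in this thread, not read from the real schema, and `first_out_of_order` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Assumed child order, inferred from this thread only (not taken from the real schema).
ASSUMED_ORDER = ["pattern", "message", "suggestion", "url", "short", "example"]


def first_out_of_order(rule_xml):
    """Return the first child tag that appears after a later-ranked tag, or None."""
    rule = ET.fromstring(rule_xml)
    highest = -1
    for child in rule:
        if child.tag not in ASSUMED_ORDER:
            continue  # tags unknown to this sketch are ignored
        rank = ASSUMED_ORDER.index(child.tag)
        if rank < highest:
            return child.tag
        highest = rank
    return None


# Shortened version of the first attempt: <url> and <short> precede <suggestion>.
BAD = """<rule>
  <pattern><token>só</token><token>que</token></pattern>
  <message>Esta é uma expressão oral. Reveja.</message>
  <url>https://pt.wiktionary.org/wiki/só_que</url>
  <short>Coloquialismo</short>
  <suggestion>mas</suggestion>
  <example correction='mas'>Queria comprar a casa, <marker>só que</marker> não é possível.</example>
</rule>"""

# Same rule with the suggestion moved directly after <message>.
GOOD = """<rule>
  <pattern><token>só</token><token>que</token></pattern>
  <message>Esta é uma expressão oral. Reveja.</message>
  <suggestion>mas</suggestion>
  <url>https://pt.wiktionary.org/wiki/só_que</url>
  <short>Coloquialismo</short>
  <example correction='mas'>Queria comprar a casa, <marker>só que</marker> não é possível.</example>
</rule>"""

if __name__ == "__main__":
    print(first_out_of_order(BAD))
    print(first_out_of_order(GOOD))
```

On the first attempt the check stops at `suggestion`, the same element the validator complained about.
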
marcoagpinto commented 4 years ago

@tiff

2524 rules tested.
Exception in thread "main" org.languagetool.rules.patterns.PatternRuleTest$PatternRuleTestFailure: Test failure for rule INFORMALITIES[114] in file /org/languagetool/rules/pt/grammar.xml: Incorrect suggestions: contudo|embora|entretanto|mas|porém|todavia != contudo|entretanto|mas|porém|todavia on input: Queria comprar a casa, só que n?o é possível.
        at org.languagetool.rules.patterns.PatternRuleTest.assertSuggestions(PatternRuleTest.java:525)
        at org.languagetool.rules.patterns.PatternRuleTest.testBadSentences(PatternRuleTest.java:417)
        at org.languagetool.rules.patterns.PatternRuleTest.testGrammarRulesFromXML(PatternRuleTest.java:318)
        at org.languagetool.rules.patterns.PatternRuleTest.runTestForLanguage(PatternRuleTest.java:169)
        at org.languagetool.rules.patterns.PatternRuleTest.runGrammarRulesFromXmlTestIgnoringLanguages(PatternRuleTest.java:152)
        at org.languagetool.rules.patterns.PatternRuleTest.main(PatternRuleTest.java:683)
Running disambiguator rule tests...
Running disambiguation tests for Portuguese...
371 rules tested (249ms)
Tests successful.
Running XML bitext pattern tests...
Tests successful.
Validating false-friends.xml...
Validation successfully finished.

It still gives errors.

tiff commented 4 years ago

See error message:

Exception in thread "main" org.languagetool.rules.patterns.PatternRuleTest$PatternRuleTestFailure: Test failure for rule INFORMALITIES[114] in file /org/languagetool/rules/pt/grammar.xml: Incorrect suggestions: contudo|embora|entretanto|mas|porém|todavia != contudo|entretanto|mas|porém|todavia on input: Queria comprar a casa, só que n?o é possível.

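The suggestions the rule produces don't match the ones expected in the example's correction attribute: "embora" is listed in the example but is not among the rule's suggestions.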
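
A rough way to spot this kind of mismatch before running the full test suite is to compare the two lists directly. The sketch below is an assumption-laden illustration: it only compares the static text of the `<suggestion>` elements with the example's `correction` attribute, whereas the real test applies the rule to the sentence. `missing_from_suggestions` is a hypothetical helper, and `RULE` reproduces the state described by the error message (six corrections in the example, five suggestions).

```python
import xml.etree.ElementTree as ET


def missing_from_suggestions(rule_xml):
    """List corrections named in the example that no <suggestion> element provides."""
    rule = ET.fromstring(rule_xml)
    actual = [s.text for s in rule.findall("suggestion")]
    expected = rule.find("example").get("correction", "").split("|")
    return [word for word in expected if word not in actual]


# The rule as it stood when the test failed: "embora" remains only in the example.
RULE = """<rule>
  <pattern><token inflected='yes'>só</token><token>que</token></pattern>
  <message>Esta é uma expressão oral. Reveja.</message>
  <suggestion>contudo</suggestion>
  <suggestion>entretanto</suggestion>
  <suggestion>mas</suggestion>
  <suggestion>porém</suggestion>
  <suggestion>todavia</suggestion>
  <url>https://pt.wiktionary.org/wiki/só_que</url>
  <short>Coloquialismo</short>
  <example correction='contudo|embora|entretanto|mas|porém|todavia'>Queria comprar a casa, <marker>só que</marker> não é possível.</example>
</rule>"""

if __name__ == "__main__":
    print(missing_from_suggestions(RULE))
```
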

marcoagpinto commented 4 years ago

@tiff

Sorry... it works now.

Moments ago I removed "embora" from the suggestions and forgot to remove it from the example.

About to commit it.

Thanks!

marcoagpinto commented 4 years ago

Fixed:

https://github.com/languagetool-org/languagetool/commit/5b7bc1b9cbad9fa596f4fd92d2b360cab4c9e241