yegor256 / qulice

Quality Police for Java projects: aggregator of Checkstyle and PMD
https://www.qulice.com

PMD exception in Java source with variables or methods named with unicode characters for Windows workflow #1267

Closed c71n93 closed 6 months ago

c71n93 commented 6 months ago

PMD fails to analyze Java sources that contain variables or methods named with Unicode characters. The exception appears only in the Windows workflow. PMD failed in this PR (https://github.com/objectionary/eo/pull/3171), in this workflow run (https://github.com/objectionary/eo/actions/runs/8973844023/job/24644892711?pr=3171).

Example of error message:

2024-05-06T18:35:20.8319792Z [INFO] PMD: D:\a\eo\eo\eo-runtime\src\main\java\org\eolang\AtComposite.java[unknown]: PMDException: Error while processing D:\a\eo\eo\eo-runtime\src\main\java\org\eolang\AtComposite.java: net.sourceforge.pmd.PMDException: Error while processing D:\a\eo\eo\eo-runtime\src\main\java\org\eolang\AtComposite.java
2024-05-06T18:35:20.8325885Z    at net.sourceforge.pmd.SourceCodeProcessor.processSourceCodeWithoutCache(SourceCodeProcessor.java:128)
2024-05-06T18:35:20.8327642Z    at net.sourceforge.pmd.SourceCodeProcessor.processSourceCode(SourceCodeProcessor.java:100)
2024-05-06T18:35:20.8329641Z    at net.sourceforge.pmd.SourceCodeProcessor.processSourceCode(SourceCodeProcessor.java:62)
2024-05-06T18:35:20.8333094Z    at net.sourceforge.pmd.processor.PmdRunnable.call(PmdRunnable.java:89)
2024-05-06T18:35:20.8336483Z    at net.sourceforge.pmd.processor.MonoThreadProcessor.runAnalysis(MonoThreadProcessor.java:32)
2024-05-06T18:35:20.8341693Z    at net.sourceforge.pmd.processor.AbstractPMDProcessor.processFiles(AbstractPMDProcessor.java:143)
2024-05-06T18:35:20.8344982Z    at net.sourceforge.pmd.processor.AbstractPMDProcessor.processFiles(AbstractPMDProcessor.java:123)
2024-05-06T18:35:20.8347679Z    at net.sourceforge.pmd.PMD.processFiles(PMD.java:322)
2024-05-06T18:35:20.8349328Z    at com.qulice.pmd.SourceValidator.validateOne(SourceValidator.java:130)
2024-05-06T18:35:20.8351728Z    at com.qulice.pmd.SourceValidator.validate(SourceValidator.java:105)
2024-05-06T18:35:20.8353447Z    at com.qulice.pmd.PmdValidator.validate(PmdValidator.java:67)
2024-05-06T18:35:20.8355075Z    at com.qulice.maven.CheckMojo$ValidatorCallable.call(CheckMojo.java:237)
2024-05-06T18:35:20.8357390Z    at com.qulice.maven.CheckMojo$ValidatorCallable.call(CheckMojo.java:205)
2024-05-06T18:35:20.8359571Z    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
2024-05-06T18:35:20.8361442Z    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
2024-05-06T18:35:20.8364028Z    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
2024-05-06T18:35:20.8365987Z    at java.base/java.lang.Thread.run(Thread.java:829)
2024-05-06T18:35:20.8369028Z Caused by: net.sourceforge.pmd.lang.ast.TokenMgrError: Lexical error at line 61, column 22.  Encountered: "\u2020" (8224), after : "" (in lexical state 0)
2024-05-06T18:35:20.8373051Z    at net.sourceforge.pmd.lang.java.ast.JavaParserTokenManager.getNextToken(JavaParserTokenManager.java:2534)
2024-05-06T18:35:20.8375895Z    at net.sourceforge.pmd.lang.java.ast.JavaParser.jj_consume_token(JavaParser.java:14281)
2024-05-06T18:35:20.8379388Z    at net.sourceforge.pmd.lang.java.ast.JavaParser.PrimarySuffix(JavaParser.java:5270)
2024-05-06T18:35:20.8382917Z    at net.sourceforge.pmd.lang.java.ast.JavaParser.PrimaryExpression(JavaParser.java:4680)
2024-05-06T18:35:20.8385610Z    at net.sourceforge.pmd.lang.java.ast.JavaParser.PostfixExpression(JavaParser.java:4494)
2024-05-06T18:35:20.8388757Z    at net.sourceforge.pmd.lang.java.ast.JavaParser.UnaryExpressionNotPlusMinus(JavaParser.java:4392)
2024-05-06T18:35:20.8391888Z    at net.sourceforge.pmd.lang.java.ast.JavaParser.UnaryExpression(JavaParser.java:4269)
2024-05-06T18:35:20.8394940Z    at net.sourceforge.pmd.lang.java.ast.JavaParser.MultiplicativeExpression(JavaParser.java:4184)
2024-05-06T18:35:20.8397995Z    at net.sourceforge.pmd.lang.java.ast.JavaParser.AdditiveExpression(JavaParser.java:4131)
2024-05-06T18:35:20.8400882Z    at net.sourceforge.pmd.lang.java.ast.JavaParser.ShiftExpression(JavaParser.java:4074)
2024-05-06T18:35:20.8403983Z    at net.sourceforge.pmd.lang.java.ast.JavaParser.RelationalExpression(JavaParser.java:4013)
2024-05-06T18:35:20.8406696Z    at net.sourceforge.pmd.lang.java.ast.JavaParser.InstanceOfExpression(JavaParser.java:3941)
2024-05-06T18:35:20.8409813Z    at net.sourceforge.pmd.lang.java.ast.JavaParser.EqualityExpression(JavaParser.java:3686)
2024-05-06T18:35:20.8412868Z    at net.sourceforge.pmd.lang.java.ast.JavaParser.AndExpression(JavaParser.java:3646)
2024-05-06T18:35:20.8415768Z    at net.sourceforge.pmd.lang.java.ast.JavaParser.ExclusiveOrExpression(JavaParser.java:3606)
2024-05-06T18:35:20.8418568Z    at net.sourceforge.pmd.lang.java.ast.JavaParser.InclusiveOrExpression(JavaParser.java:3566)
2024-05-06T18:35:20.8421940Z    at net.sourceforge.pmd.lang.java.ast.JavaParser.ConditionalAndExpression(JavaParser.java:3526)
2024-05-06T18:35:20.8424760Z    at net.sourceforge.pmd.lang.java.ast.JavaParser.ConditionalOrExpression(JavaParser.java:3486)
2024-05-06T18:35:20.8427832Z    at net.sourceforge.pmd.lang.java.ast.JavaParser.ConditionalExpression(JavaParser.java:3448)
2024-05-06T18:35:20.8430697Z    at net.sourceforge.pmd.lang.java.ast.JavaParser.Expression(JavaParser.java:3307)
2024-05-06T18:35:20.8435868Z    at net.sourceforge.pmd.lang.java.ast.JavaParser.ReturnStatement(JavaParser.java:6990)
2024-05-06T18:35:20.8446583Z    at net.sourceforge.pmd.lang.java.ast.JavaParser.Statement(JavaParser.java:5799)
2024-05-06T18:35:20.8450644Z    at net.sourceforge.pmd.lang.java.ast.JavaParser.BlockStatement(JavaParser.java:5971)
2024-05-06T18:35:20.8453924Z    at net.sourceforge.pmd.lang.java.ast.JavaParser.Block(JavaParser.java:5888)
2024-05-06T18:35:20.8458729Z    at net.sourceforge.pmd.lang.java.ast.JavaParser.MethodDeclaration(JavaParser.java:2201)
2024-05-06T18:35:20.8462293Z    at net.sourceforge.pmd.lang.java.ast.JavaParser.ClassOrInterfaceBodyDeclaration(JavaParser.java:1855)
2024-05-06T18:35:20.8465229Z    at net.sourceforge.pmd.lang.java.ast.JavaParser.ClassOrInterfaceBody(JavaParser.java:1808)
2024-05-06T18:35:20.8468312Z    at net.sourceforge.pmd.lang.java.ast.JavaParser.ClassOrInterfaceDeclaration(JavaParser.java:936)
2024-05-06T18:35:20.8471294Z    at net.sourceforge.pmd.lang.java.ast.JavaParser.TypeDeclaration(JavaParser.java:838)
2024-05-06T18:35:20.8474144Z    at net.sourceforge.pmd.lang.java.ast.JavaParser.CompilationUnit(JavaParser.java:558)
2024-05-06T18:35:20.8477198Z    at net.sourceforge.pmd.lang.java.AbstractJavaParser.parse(AbstractJavaParser.java:62)
2024-05-06T18:35:20.8479692Z    at net.sourceforge.pmd.lang.AbstractParser.doParse(AbstractParser.java:45)
2024-05-06T18:35:20.8484630Z    at net.sourceforge.pmd.SourceCodeProcessor.parse(SourceCodeProcessor.java:136)
2024-05-06T18:35:20.8487198Z    at net.sourceforge.pmd.SourceCodeProcessor.processSource(SourceCodeProcessor.java:200)
2024-05-06T18:35:20.8489778Z    at net.sourceforge.pmd.SourceCodeProcessor.processSourceCodeWithoutCache(SourceCodeProcessor.java:118)
2024-05-06T18:35:20.8491860Z    ... 16 more
2024-05-06T18:35:20.8492229Z  (ProcessingError)

Part of the Java source file (eo\eo-runtime\src\main\java\org\eolang\AtComposite.java) related to this particular error:

...
    @Override
    public String toString() {
        return this.φTerm(); // line: 61
    }
...

There were similar issues in PMD (https://github.com/pmd/pmd/issues/3423), but as I understand it, they were resolved in the PMD version that qulice currently uses.

@yegor256 I was hoping to see why the Windows CI doesn't crash with the same error in this project, but I found that there are no tests in qulice-pmd that contain Unicode characters. I think it's worth adding such tests to understand the cause of the error described above.

github-actions[bot] commented 6 months ago

@c71n93 thanks for the report, here is some feedback:

Problems

I would recommend including clear, step-by-step instructions to reproduce the error in the bug report.

Please fix the bug report so that it gets resolved faster. Analyzed with gpt-4

c71n93 commented 6 months ago

@yegor256 As expected, this problem occurred in this project too: in this PR (#1268), in the Windows workflow (https://github.com/yegor256/qulice/actions/runs/9003058334/job/24732811660?pr=1268).

yegor256 commented 6 months ago

@pnatashap maybe you can take a look?

pnatashap commented 6 months ago

What is strange is that the error is the same: "Lexical error at line 61, column 22. Encountered: "\u2020" (8224), after : "" (in lexical state 0)", raised at net.sourceforge.pmd.lang.java.ast.JavaParserTokenManager.getNextToken(JavaParserTokenManager.java:2534). But φ is not the '\u2020' symbol, and '\u2020' is not accepted by PMD as a valid symbol here, which is why we get an error. It looks like something is wrong with the encoding; I will check in a Windows environment.

pnatashap commented 6 months ago

@c71n93 the reason is in PMD: it uses the system property file.encoding to get the default encoding, and in the Windows environment that is Cp1251 (see https://github.com/pmd/pmd/blob/master/pmd-core/src/main/java/net/sourceforge/pmd/AbstractConfiguration.java#L38; it is still the same in the current version). So you need to set it explicitly, at least for now. Example: https://github.com/pnatashap/qulice/pull/9
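
The mismatch can be reproduced without PMD. Below is a minimal sketch, assuming the source file is stored as UTF-8 and is read with a Windows single-byte code page. The class and method names are illustrative only and are not part of PMD or qulice. In UTF-8, φ (U+03C6) is the two bytes 0xCF 0x86, and windows-1251 maps 0x86 to U+2020, the character reported in the lexical error.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Illustrative only: not part of PMD or qulice.
public final class MojibakeDemo {
    // Encodes the text as UTF-8 (as the file is stored on disk) and decodes
    // the bytes with another charset (as a misconfigured reader would).
    public static String misread(final String text, final String charset) {
        final byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, Charset.forName(charset));
    }

    public static void main(final String[] args) {
        // The escape below is the Greek small letter phi (U+03C6); an escape
        // is used so that this file compiles under any source encoding.
        final String broken = MojibakeDemo.misread("\u03C6", "windows-1251");
        for (final char chr : broken.toCharArray()) {
            System.out.printf("U+%04X%n", (int) chr);
        }
    }
}
```

This prints U+041F and then U+2020. The first character is a Cyrillic letter, which is legal at the start of an identifier, so the lexer only stops at the second one. That is consistent with the reported position: "        return this." takes 20 columns, so the second character of the misread name falls on column 22.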

@yegor256 There are two ways to fix this: 1. Keep the code as is, because it can be fixed using system variables. 2. Add a parameter to set the encoding explicitly (the Maven parameter project.build.sourceEncoding can also be used).

yegor256 commented 6 months ago

@pnatashap I believe sourceEncoding is set correctly in the pom.xml of the project where the problem was reported. @c71n93 am I right?

c71n93 commented 6 months ago

@yegor256 as far as I understand, no, sourceEncoding is not set in the pom.xml anywhere in eo.

yegor256 commented 6 months ago

@c71n93 the eo project uses jcabi-parent as its parent POM, where this is set: https://github.com/jcabi/jcabi-parent/blob/master/pom.xml#L96-L97

pnatashap commented 6 months ago

@yegor256 @c71n93 sourceEncoding doesn't help for now; only environment variables will. It is possible to set the encoding explicitly for PMD via context using this method (taking the value from sourceEncoding): https://github.com/pmd/pmd/blob/ef3455348603aa25f86894b9930f05f141f44d20/pmd-core/src/main/java/net/sourceforge/pmd/AbstractConfiguration.java#L41-L43

For now, the workaround is to set:
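
A minimal sketch of how the explicit setting could be resolved, assuming the plugin receives the value of project.build.sourceEncoding as a string that may be empty. `EncodingResolver` and `resolve` are hypothetical names, not existing qulice or PMD code, and the fallback to UTF-8 is an assumption rather than an agreed design. The resolved charset would then be passed to PMD through the setter linked above.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: not existing qulice or PMD code.
public final class EncodingResolver {
    // Returns the configured charset, or UTF-8 when nothing is configured
    // (an assumed fallback), so the result never depends on file.encoding.
    public static Charset resolve(final String configured) {
        if (configured == null || configured.trim().isEmpty()) {
            return StandardCharsets.UTF_8;
        }
        return Charset.forName(configured.trim());
    }

    public static void main(final String[] args) {
        System.out.println(EncodingResolver.resolve("windows-1251"));
        System.out.println(EncodingResolver.resolve(""));
    }
}
```

The point of the sketch is that the charset comes from the build configuration rather than from the platform default, which is what differs between the Windows and non-Windows workflows.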

JAVA_OPTS: "%JAVA_OPTS% -Dfile.encoding=UTF-8"
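
A quick way to see which encoding a JVM actually picked up is to print the system property next to the effective default charset. This is a sketch only; `DefaultEncodingCheck` is a hypothetical name, and it has to run in a JVM started with the same options as the build for the result to be meaningful.

```java
import java.nio.charset.Charset;

// Hypothetical diagnostic: not part of qulice or PMD.
public final class DefaultEncodingCheck {
    // Reports the file.encoding system property and the charset the JVM
    // uses by default when none is given explicitly.
    public static String describe() {
        return String.format(
            "file.encoding=%s, defaultCharset=%s",
            System.getProperty("file.encoding"),
            Charset.defaultCharset().name()
        );
    }

    public static void main(final String[] args) {
        System.out.println(DefaultEncodingCheck.describe());
    }
}
```
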

pnatashap commented 6 months ago

@c71n93 please check PR #1273 on your project