jshiell / checkstyle-idea

CheckStyle plug-in for IntelliJ IDEA
https://plugins.jetbrains.com/plugin/1065-checkstyle-idea

Parsing Error with Character Literals in Checkstyle #625

Closed: whustedt closed this issue 7 months ago

whustedt commented 11 months ago

I'm experiencing a parsing error in Checkstyle with the message: "The source file could not be parsed by Checkstyle." This issue arises when using character literals like 'Ä'. Replacing these with string literals (e.g., "Ä") resolves the error.

Error Message:

The source file could not be parsed by Checkstyle.

Original Code Causing Issue:

import org.apache.commons.lang3.RegExUtils;

/**
* Checkstyle Test
*/
public class Example {
    private String austauschenUmlaute(String text) {
        if (text.indexOf('Ä') > -1) {
            text = RegExUtils.replaceAll(text, "Ä", "Ae");
        }
        if (text.indexOf('ä') > -1) {
            text = RegExUtils.replaceAll(text, "ä", "ae");
        }
        if (text.indexOf('Ö') > -1) {
            text = RegExUtils.replaceAll(text, "Ö", "Oe");
        }
        if (text.indexOf('ö') > -1) {
            text = RegExUtils.replaceAll(text, "ö", "oe");
        }
        if (text.indexOf('Ü') > -1) {
            text = RegExUtils.replaceAll(text, "Ü", "Ue");
        }
        if (text.indexOf('ü') > -1) {
            text = RegExUtils.replaceAll(text, "ü", "ue");
        }
        if (text.indexOf('ß') > -1) {
            text = RegExUtils.replaceAll(text, "ß", "ss");
        }
        return text;
    }
}

Seeking guidance or a workaround for this issue.

Thank you.

jshiell commented 11 months ago

We've seen this in the past where something hasn't been set to UTF-8. You're not on Windows, are you?

whustedt commented 11 months ago

Hello,

Thank you for your prompt response. Yes, I am currently testing on Windows. I have verified that the Java file, the Checkstyle XML configuration, and the suppressions file are all encoded in UTF-8 format.

Could you please confirm if the provided Java class parses correctly in your environment?

Thank you for your assistance.

jshiell commented 11 months ago

I'm afraid I couldn't reproduce the error on Mac OS (14.1.2, latest Liberica JDK 11, IDEA 2023.3, latest plugin release).

One difference is that on Mac OS/Linux the JVM builds normally default to UTF-8, e.g.

$ java -XshowSettings:properties -version 2>&1 | grep encoding 
    file.encoding = UTF-8
    native.encoding = UTF-8
    stderr.encoding = UTF-8
    stdout.encoding = UTF-8
    sun.io.unicode.encoding = UnicodeBig
    sun.jnu.encoding = UTF-8
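
The same defaults can also be read from inside a running JVM. This is only a minimal sketch: the class name ShowDefaultEncoding and the helper defaultCharsetName are made up for illustration, and the values printed depend on the JVM and OS it runs on.

```java
import java.nio.charset.Charset;

// Hypothetical helper class, not part of the plugin or of Checkstyle.
public class ShowDefaultEncoding {

    /** Returns the name of the charset the JVM uses when none is given explicitly. */
    static String defaultCharsetName() {
        return Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        // The charset used by APIs that are called without an explicit charset.
        System.out.println("default charset  = " + defaultCharsetName());
        // The system properties shown in the -XshowSettings output; may be null on some JVMs.
        System.out.println("file.encoding    = " + System.getProperty("file.encoding"));
        System.out.println("sun.jnu.encoding = " + System.getProperty("sun.jnu.encoding"));
    }
}
```

To be meaningful for this issue, it would have to run on the same JVM and with the same VM options as the IDE, since that is the process the plugin runs in.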

Which JVM are you using, and do you know what the defaults are for it? One possibility is that somewhere in the pipeline the default charset is being used instead of UTF-8, which may break parsing.
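
To illustrate why a wrong default could break character literals but not string literals, here is a rough sketch. It assumes windows-1252 as the non-UTF-8 default (a common one on Windows, but not confirmed for this report), and the class and method names are hypothetical. It does not show where in the pipeline the decoding actually happens.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the plugin or of Checkstyle.
public class CharLiteralMojibake {

    /** Encodes the text as UTF-8, then decodes those bytes with another charset. */
    static String decodeUtf8BytesAs(String text, Charset wrongCharset) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, wrongCharset);
    }

    public static void main(String[] args) {
        // Assumption: windows-1252 stands in for the Windows default charset.
        Charset windows1252 = Charset.forName("windows-1252");

        // \u00C4 is 'Ä'; escapes keep this file independent of its own encoding.
        String charLiteral = "'\u00C4'";
        String stringLiteral = "\"\u00C4\"";

        String garbledChar = decodeUtf8BytesAs(charLiteral, windows1252);
        String garbledString = decodeUtf8BytesAs(stringLiteral, windows1252);

        // The two UTF-8 bytes of 'Ä' (0xC3 0x84) decode to two separate characters,
        // so the quotes now enclose two characters, which is not a valid char literal.
        System.out.println(garbledChar + " has " + (garbledChar.length() - 2) + " characters between the quotes");
        // The string literal is garbled too, but it is still a syntactically valid string.
        System.out.println(garbledString + " has " + (garbledString.length() - 2) + " characters between the quotes");
    }
}
```

If this is what happens, it would match the report: the string literal "Ä" still parses after being garbled, while the character literal 'Ä' no longer does.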

Also, I presume using Checkstyle directly (e.g. on the command line) works, just to rule out internal Checkstyle issues?

whustedt commented 11 months ago

Hello,

Thank you for your suggestion. Adjusting the "Custom VM Options" in IntelliJ IDEA resolved the issue. Adding the line -Dfile.encoding=UTF8 allowed me to successfully parse the Java file with Checkstyle. Your advice to check the VM settings was key to solving this problem. Thank you again for your help!

jshiell commented 11 months ago

My pleasure - glad we got it working for you!