Closed leveretconey closed 8 months ago
Thanks for calling this out @leveretconey ! Sounds like an oversight; adding a bit of context here already.
We indeed read project.build.sourceEncoding
in MavenMojoProjectParser.listSourceFiles
and use that to parse source files for the project.
https://github.com/openrewrite/rewrite-maven-plugin/blob/2a2d1f625843bd871b20b6ebc2e21d386159a58b/src/main/java/org/openrewrite/maven/MavenMojoProjectParser.java#L162-L178
At the point where we read project.build.sourceEncoding
we have already parsed the Maven files just before. I don't see any reference to sourceEncoding
in other projects than rewrite-maven-plugin
; that makes me think the pom.xml
is parsed without taking sourceEncoding
into account. What files are affected by the problem you're reporting? Is it the pom.xml
files themselves? Files inside src/
? Or files outside of src/
?
If the problem you're described only affects files outside src/
then this might be a first step towards seeing that resolved:
Would you be able to give that a go and see if that helps you there?
Thanks for your reply @timtebeek !
Let me add how we are currently using the MavenMojoProjectParser: Currently we set skipMavenParsing=true
to skip the built-in POM parsing (otherwise sometimes it hangs, and we have never figured out the reason, which might be related to the Great Firewall), then we set parseAdditionalResources=true
to parse the pom.xml as a regular XML file and then implement own recipes to modify pom.xml. In addition, we have implemented some recipes to modify other configuration files that are outside of the src/
directory. Currently, it appears that these configuration files may also be affected by project.build.sourceEncoding
, which could lead to incorrect encoding.
I have tested #735 and that does solve my issue.
Hi @leveretconey ; did you you have a chance to try out the latest snapshot already? I'm hoping this is now fixed for you, but would like to be sure.
What version of OpenRewrite are you using?
I am using
How are you running OpenRewrite?
pom.xml of the project:
In a legacy application that java source files are using gbk encoding while pom.xml is using utf-8 encoding, OpenRewrite has recognized the file encoding of pom.xml as gbk. This leads to some problem in our scenerio.
After some debugging, I noticed that OpenRewrite reads
project.build.sourceEncoding
property from pom and sets it as the encoding of all files in this project. I think this behavior is not correct because this property is only applicable to source code files by Maven's design.What did you expect to see?
pom.xml should be detected as using utf-8 encoding
What did you see instead?
pom.xml is detected as using gbk encoding
Are you interested in contributing a fix to OpenRewrite?
I won’t contribute to this because the code changes may be huge and involve multiple code repositories.