Open Schaka opened 1 month ago
Did you try with the `-J` flag to `native-image`?
`-J`: pass directly to the JVM running the image generator
So if you're setting additional args like ...
"BP_NATIVE_IMAGE_BUILD_ARGUMENTS" to """ -march=compatibility -H:+AddAllCharsets -Dsun.jnu.encoding=UTF-8 -Dfile.encoding=UTF-8
Then try `-J-Dsun.jnu.encoding=UTF-8` and `-J-Dfile.encoding=UTF-8` (instead of those args without the `-J`). That should pass those args through to the JVM at build time. I see some evidence of that helping similar issues here.
In general, what I suggest with something like this is to get it working without buildpacks. So make it build and work when calling `native-image` directly (or with their Gradle tools) on your machine. When it's working well that way, look at the flags you had to pass to `native-image` and then update `BP_NATIVE_IMAGE_BUILD_ARGUMENTS` accordingly.
The buildpack is just installing & running `native-image` for you. It attempts to add some basic arguments that you'll need, but beyond that, it's up to you to pass additional arguments through.
I did test it without buildpacks about an hour ago and made a small sample project. I was about to update the issue and figured it may be buildpack related. I've also considered it may be a problem with Adoptium or Bellsoft.
Here's a small test project: graalvm-test.zip
If you replace the base image in the Dockerfile with 21 instead of 23, you can start it with any combination of parameters. They end up not mattering and you always get the same error. With 23 it just works and you don't have to set encoding parameters at all.
I've given the `-J` flag a try and had no success.
I've managed to set `-Dsun.jnu.encoding=UTF-8` and read it from within the image as such by adjusting the CMD (according to docs), but the actual value seemed to get completely ignored.
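For anyone who wants to repeat the "read it from within the image" check, here is a minimal sketch of such a diagnostic. The class name `EncodingReport` is made up for illustration and is not part of the sample project. Note that seeing `UTF-8` printed here only shows that the property is visible to application code; as described above, that does not necessarily mean path encoding honors it.

```java
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not taken from the sample project.
class EncodingReport {
    // Collect the encoding-related settings the running process actually sees.
    static Map<String, String> collect() {
        Map<String, String> report = new LinkedHashMap<>();
        for (String key : new String[] {"sun.jnu.encoding", "file.encoding", "native.encoding"}) {
            // A missing property is reported as the string "null".
            report.put(key, String.valueOf(System.getProperty(key)));
        }
        report.put("Charset.defaultCharset()", Charset.defaultCharset().name());
        return report;
    }

    public static void main(String[] args) {
        collect().forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

Running this once on a regular JVM and once inside the native image makes it easy to compare the two sets of values side by side.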
This doesn't seem like something that's specific to buildpacks, if I'm following your tests here. If you can reproduce it using the standard Dockerfiles, that's a behavior with native-image itself.
All I can suggest is that we do have Java 23 available in buildpacks: https://github.com/paketo-buildpacks/bellsoft-liberica/releases/tag/v10.9.0 has Java 23, and that was pulled into last week's release, https://github.com/paketo-buildpacks/java/releases/tag/v16.1.0. So if using Java 23 works with your Dockerfile sample, I'd bet it works with buildpacks too.
So I had already been using Java 23 for a while. This is my log output in that regard:
$JAVA_TOOL_OPTIONS the JVM launch flags
[creator] Using Java version 23 from BP_JVM_VERSION
[creator] BellSoft Liberica NIK 23.0.0: Contributing to layer
I now added "paketobuildpacks/oracle". I was hoping there was a way to use Oracle's GraalVM as a base image directly and this seems to be it.
Unfortunately, the result is still the same. The resulting image cannot use `Path.of("/umläut")` without an exception being thrown.
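A minimal sketch of a reproducer for this, in case it helps others test their own images. The class name `UmlautPathCheck` is an assumption for illustration and is not the code from the attached project. It catches `InvalidPathException` so the outcome can be logged instead of crashing the application.

```java
import java.nio.file.InvalidPathException;
import java.nio.file.Path;

// Hypothetical reproducer, not the code from the attached sample project.
class UmlautPathCheck {
    // Returns a one-line result instead of throwing, so it can be logged.
    static String check(String raw) {
        try {
            Path p = Path.of(raw);
            return "OK: " + p;
        } catch (InvalidPathException e) {
            return "FAILED: " + e.getReason();
        }
    }

    public static void main(String[] args) {
        // "\u00e4" is the letter a-umlaut; the escape keeps this source file ASCII-only.
        System.out.println(check("/uml\u00e4ut"));
    }
}
```

Which of the two results you get depends on the environment, so the same binary can behave differently on the host and inside the container.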
If I can recreate it in a sample project specifically built around paketo's buildpacks, will you look into it?
What we'd need to look into this is something like your sample project from before that works when you build with a Dockerfile or just on the local machine, but does not work when building the same source code with buildpacks. If you have a sample like that, I can take a look.
If you do `gradle bootBuildImage` and `docker run native-image-error`, you'll see the error.
It will not error if you just run it locally, only in the image produced by buildpacks.
You can even try `docker run native-image-error -Dsun.jnu.encoding=UTF-8`.
I think it's some form of base image that doesn't accept or delegate the `LANG` or `LC_ALL` env variables, or the base image is one where GraalVM doesn't interpret them correctly at build time.
As you can see in my previous example, it works just fine using Oracle's base image with `javac` and `native-image`. I'm at a bit of a loss trying to figure out what's different.
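One way to narrow this down is to log what the process sees at run time. The following is only a sketch: the class name `LocaleEnv` is hypothetical, and `LC_CTYPE` is an extra variable I added that is not mentioned above. It cannot show what was set at build time, only what the container passes to the binary.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for logging locale-related environment variables.
class LocaleEnv {
    static Map<String, String> read() {
        Map<String, String> out = new LinkedHashMap<>();
        // LC_CTYPE is an addition for completeness; the thread only names LANG and LC_ALL.
        for (String name : new String[] {"LANG", "LC_ALL", "LC_CTYPE"}) {
            String value = System.getenv(name);
            out.put(name, value == null ? "(unset)" : value);
        }
        return out;
    }

    public static void main(String[] args) {
        read().forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```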
You guys do amazing work, so I wouldn't be surprised if I fucked up somehow.
A couple of observations...
ghcr.io/graalvm/graalvm-community:21 doesn't seem to have locales installed correctly; bash in that image can't even display a filename with an umlaut correctly. Conversely, ghcr.io/graalvm/graalvm-community:23 does, and bash works correctly there. I don't think this is a difference in native-image, just the container image.
See https://github.com/oracle/graal/issues/9504: `JAVA_TOOL_OPTIONS` won't work when you run your image, because a native-image binary doesn't look at this env variable. A native-image binary does support reading system properties from arguments to the application though, so that's why it picks them up when you run your application image passing the system props as args to the binary.
As far as I can tell the Paketo base/run images have locale set up correctly. At least with the base & full images, I can exec into the container, run `locale` and see that it's reporting `en_US.UTF-8`. I can use UTF-8 codes in file names with bash and everything seems to work OK there. It also has been reported to work with a JVM and other language runtimes. It is just with the native-image app where I have trouble with `Path.of`. If we are missing a package to make this work, let me know and we can look at adding it. I would just need to know the Ubuntu package to install (or possibly more details on what GraalVM needs, and I could try to find out what Ubuntu package provides that).
I tried with Oracle GraalVM 21 & 23. Same results. No difference there.
I've also created an issue over at GraalVM. The package they use seems to be `glibc-all-langpacks`.
I can't find the Ubuntu/Debian equivalent. Maybe they can shine a light on it.
The `locales` package is available on Ubuntu 22, but that should be installed by default.
Having a quick look, at most I can tell that Oracle Linux 9 supplies glibc-all-langpacks 2.34, whereas Ubuntu's locale uses 2.35. Hopefully that one minor version isn't what breaks it here.
Edit: Out of curiosity, I've created a builder from a build and run image using graalvm-community:23, specifically added `glibc-all-langpacks`, and it still results in the same error. I'm guessing I'd have to adjust the buildpacks too.
I ended up using `--patch-module`: I recompiled the `sun.nio.fs.Util` class myself with UTF-8 forced and copied it into a folder structure that's accepted by the compiler (hacky), and it got accepted by the native image without issue.
It's incredibly hacky but may solve this for anyone else that needs a temporary (haha) solution. The real problem still needs fixing, but I suspect it'll be a while before the GraalVM team gives a definitive answer as to what is actually required and how the system property is set under the hood.
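To illustrate why forcing UTF-8 makes a difference, here is a small standalone sketch. It is not the patched `sun.nio.fs.Util` and not JDK code; the class name `StrictEncodeDemo` is made up, and US-ASCII is only an assumed stand-in for whatever non-UTF-8 encoding the image ends up with, which this thread has not established. It shows that strictly encoding a name with an umlaut fails for a charset that cannot represent it and succeeds for UTF-8.

```java
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Illustration only; this is not the JDK's path-encoding code.
class StrictEncodeDemo {
    // Encode strictly: unmappable characters raise an error instead of becoming '?'.
    static int encodedLength(String s, Charset cs) throws CharacterCodingException {
        return cs.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .encode(CharBuffer.wrap(s))
                .remaining();
    }

    static boolean canEncode(String s, Charset cs) {
        try {
            encodedLength(s, cs);
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String name = "/uml\u00e4ut"; // "\u00e4" is the letter a-umlaut
        // US-ASCII is an assumed example of an encoding without umlauts.
        System.out.println("US-ASCII: " + canEncode(name, StandardCharsets.US_ASCII));
        System.out.println("UTF-8:    " + canEncode(name, StandardCharsets.UTF_8));
    }
}
```

If the encoding used for file names inside the image really is something like US-ASCII, a failure of this kind would be consistent with the exception from `Path.of`, but that remains a guess until the GraalVM issue is answered.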
Thanks for sharing! I'm subscribed on the GraalVM issue, so I'll keep an eye on what they say and if there's anything we can do to make this just work, I'll open up issues.
I hope I'm in the right place and this isn't directly related to GraalVM. So please excuse me if I'm wasting your time. You can find all the code I'm talking about right here: https://github.com/Schaka/janitorr/tree/bazarr-support
The image is built using the Spring Boot `bootBuildImage` step via Gradle and I'm passing these ENV variables.
My host (Debian 12) has LANG set correctly and LC_ALL not set at all. According to the docs, I also passed these arguments to Docker via compose.yml
According to the docs, this would not print correctly to console (docker logs) otherwise, but definitely seems to. Granted, I use logback and not any direct prints, so there is a chance this fixes things magically.
Yet, the second I use Path.of("a path with an ümläüt"), I run into the following exception:
Is there something I'm missing here, or could this be a bug in GraalVM somehow? Looking at the code, UnixFileSystem definitely reads `sun.jnu.encoding`. The filepath is received as a valid UTF-8 string via REST. Logging from within the image provides: