mageddo / dns-proxy-server

Solve your DNS hosts from your docker containers, then from your local configuration, then from the internet
http://mageddo.github.io/dns-proxy-server/
Apache License 2.0

DPS binary doesn't work on non 4k page size linux arm64 kernels #473

Closed — pschiffe closed this 1 day ago

pschiffe commented 3 weeks ago

What is Happening

The DPS binary, distributed either as a Docker image or via the dns-proxy-server-linux-aarch64-3.19.3-snapshot.tgz archive, doesn't work on Linux arm64 kernels with a page size other than 4k.

RHEL 8 variants for arm64 have only a 64k page size kernel:

$ getconf PAGESIZE
65536

RHEL 9 variants for arm64 offer two kernel builds (4k and 64k page size) and you can choose between them. DPS works on the 4k build, but not on the 64k one.

I haven't tested it, but this will probably also be an issue when running Linux on Apple's ARM hardware, as that kernel has a 16k page size.

When you try to run the binary from the release archive or from Docker, it won't start; you just get the error: Fatal error: Failed to create the main Isolate. (code 8)
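A quick way to tell whether a host is affected is to print the kernel's page size, as in this sketch (the failure mode described above is a native-image/page-size mismatch; the exact threshold is the page size the image was built for):

```shell
# Print the kernel page size. A GraalVM native image built for a 4 KiB page
# size aborts with "Failed to create the main Isolate" on kernels that use a
# larger one (64 KiB on RHEL 8 aarch64, 16 KiB on Apple Silicon Linux).
PAGE_SIZE=$(getconf PAGESIZE)
echo "kernel page size: ${PAGE_SIZE} bytes"
```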

When I run DPS as a jar instead, it works:

$ java -version
openjdk version "22.0.1" 2024-04-16
OpenJDK Runtime Environment (Red_Hat-22.0.1.0.8-1) (build 22.0.1+8)
OpenJDK 64-Bit Server VM (Red_Hat-22.0.1.0.8-1) (build 22.0.1+8, mixed mode)

$ java -jar ./dns-proxy-server.jar 
20:42:31.229 [main           ] INF c.m.dnsproxyserver.config.dataprovider.JsonConfigs l=75   m=createDefault                   status=createdDefaultConfigFile, path=/root/conf/config.json
20:42:31.308 [main           ] DEB c.m.d.config.dataprovider.ConfigDAOJson           l=39   m=find                            configPath=/root/conf/config.json
20:42:31.657 [main           ] INF c.m.d.s.d.a.DpsDockerEnvironmentSetupService      l=32   m=setup                           status=binding-docker-events, connectedToDocker=true
20:42:31.657 [main           ] INF c.m.d.s.d.a.DpsDockerEnvironmentSetupService      l=44   m=setupNetwork                    status=dpsNetwork, active=false
20:42:31.657 [main           ] INF c.m.d.s.docker.application.DpsContainerService    l=102  m=tRunningContainersToDpsNetwork  status=autoConnectDpsNetworkDisabled, dpsNetwork=false, dpsNetworkAutoConnect=false
20:42:31.657 [main           ] INF c.m.d.solver.docker.entrypoint.EventListener      l=32   m=onStart                         status=containerAutoConnectToDpsNetworkDisabled
20:42:31.660 [main           ] INF com.mageddo.dnsserver.UDPServerPool               l=31   m=start                           Starting UDP server, addresses=/0.0.0.0:53
....

I think this is some Java issue, but I'm not sure how to fix it. Google shows similar reports for other Java software.

Specs

Server: Docker Engine - Community
 Engine:
  Version:          26.1.3
  API version:      1.45 (minimum version 1.24)
  Go version:       go1.21.10
  Git commit:       8e96db1
  Built:            Thu May 16 08:33:12 2024
  OS/Arch:          linux/arm64
  Experimental:     true
 containerd:
  Version:          1.6.32
  GitCommit:        8b3b7ca2e5ce38e8f31a34f35b2b68ceb8470d89
 runc:
  Version:          1.1.12
  GitCommit:        v1.1.12-0-g51d5e94
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

* DPS Version: `defreitas/dns-proxy-server:3.19.3-snapshot-aarch64`
* DPS log file:

Fatal error: Failed to create the main Isolate. (code 8)


* OS: `Rocky Linux 8.10 (Green Obsidian)`
mageddo commented 2 weeks ago

Hey @pschiffe, thanks for your report. I have no idea how to fix it yet; I will look into how to improve this. Any help is welcome.

It is probably related to the Dockerfile:

https://github.com/mageddo/dns-proxy-server/blob/2dd529010496f7ae75cfade2dcbe5aa9b9d812ad/Dockerfile.builder.linux-aarch64#L1

or to the deploy specs:

https://github.com/mageddo/dns-proxy-server/blob/2dd529010496f7ae75cfade2dcbe5aa9b9d812ad/.github/workflows/actions-deploy.yml#L69

mageddo commented 2 weeks ago

Maybe it's related to this one: https://github.com/oracle/graal/issues/7513. Looks like I will have to upgrade the QEMU version, and maybe the GraalVM version too.
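The linked Graal issue concerns the page size the native image assumes at build time. If upgrading alone doesn't help, GraalVM's native-image also accepts a hosted option to target a larger page size explicitly; the sketch below shows the flag, and wiring it into DPS's actual build scripts is an assumption on my part:

```shell
# Sketch (assumption: extra flags can be passed to DPS's native-image step).
# Build the image for a 64 KiB page size so it also starts on 64k-page kernels:
native-image -H:PageSize=65536 -jar dns-proxy-server.jar dns-proxy-server
```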

mageddo commented 1 week ago

Hey @pschiffe, I've upgraded the GraalVM and QEMU versions. Can you check whether DPS 3.22.0-snapshot fixes your use case?

pschiffe commented 6 days ago

Hi @mageddo, so far no joy, though the error code is a little bit different (and the .jar stopped working):

# cat /etc/os-release | grep PRETTY
PRETTY_NAME="Rocky Linux 8.10 (Green Obsidian)"

# getconf PAGESIZE
65536

# uname -r
4.18.0-553.5.1.el8_10.aarch64

# docker version | grep -A 2 Server
Server: Docker Engine - Community
 Engine:
  Version:          26.1.3

# docker run -d defreitas/dns-proxy-server:3.22.0-snapshot-aarch64
# docker logs d6928c2b6735
Fatal error: Failed to create the main Isolate. (code 24)

When trying the binary from dns-proxy-server-linux-aarch64-3.22.0-snapshot.tgz:

# ./dns-proxy-server 
./dns-proxy-server: /lib64/libc.so.6: version `GLIBC_2.32' not found (required by ./dns-proxy-server)
./dns-proxy-server: /lib64/libc.so.6: version `GLIBC_2.34' not found (required by ./dns-proxy-server)

# rpm -q glibc
glibc-2.28-251.el8_10.2.aarch64
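The `version GLIBC_2.32 not found` errors mean the binary was linked against a newer glibc than the host provides. A quick way to compare the two is sketched below (the `./dns-proxy-server` path refers to the extracted release binary and is illustrative):

```shell
# Host glibc version (RHEL 8 ships 2.28):
getconf GNU_LIBC_VERSION
# Symbol versions the binary requires; the highest must not exceed the host's.
# grep -a treats the binary as text; the path is illustrative.
grep -aoh 'GLIBC_2\.[0-9]*' ./dns-proxy-server 2>/dev/null | sort -u
```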

Trying .jar file:

# java -version
openjdk version "21.0.3" 2024-04-16 LTS
OpenJDK Runtime Environment (Red_Hat-21.0.3.0.9-1) (build 21.0.3+9-LTS)
OpenJDK 64-Bit Server VM (Red_Hat-21.0.3.0.9-1) (build 21.0.3+9-LTS, mixed mode, sharing)

# java -jar ./dns-proxy-server.jar 
Exception in thread "main" java.lang.NoClassDefFoundError: org/graalvm/nativeimage/ImageInfo
    at com.mageddo.utils.Runtime.getRunningDir(Runtime.java:34)
    at com.mageddo.dnsproxyserver.config.dataprovider.ConfigDAOJson.buildConfigPath(ConfigDAOJson.java:59)
    at com.mageddo.dnsproxyserver.config.dataprovider.ConfigDAOJson.find(ConfigDAOJson.java:33)
    at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:197)
    at java.base/java.util.stream.SortedOps$SizedRefSortingSink.end(SortedOps.java:357)
    at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:510)
    at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499)
    at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:575)
    at java.base/java.util.stream.AbstractPipeline.evaluateToArrayNode(AbstractPipeline.java:260)
    at java.base/java.util.stream.ReferencePipeline.toArray(ReferencePipeline.java:616)
    at java.base/java.util.stream.ReferencePipeline.toArray(ReferencePipeline.java:622)
    at java.base/java.util.stream.ReferencePipeline.toList(ReferencePipeline.java:627)
    at com.mageddo.dnsproxyserver.config.application.ConfigService.findConfigs(ConfigService.java:35)
    at com.mageddo.dnsproxyserver.config.application.ConfigService.findCurrentConfig(ConfigService.java:29)
    at com.mageddo.dnsproxyserver.config.application.Configs.lambda$getInstance$0(Configs.java:19)
    at com.mageddo.commons.lang.Singletons.lambda$createOrGet$0(Singletons.java:19)
    at java.base/java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1708)
    at com.mageddo.commons.lang.Singletons.createOrGet(Singletons.java:19)
    at com.mageddo.commons.lang.Singletons.createOrGet(Singletons.java:15)
    at com.mageddo.dnsproxyserver.config.application.Configs.getInstance(Configs.java:18)
    at com.mageddo.dnsproxyserver.App.findConfig(App.java:53)
    at com.mageddo.dnsproxyserver.App.start(App.java:36)
    at com.mageddo.dnsproxyserver.App.main(App.java:25)
Caused by: java.lang.ClassNotFoundException: org.graalvm.nativeimage.ImageInfo
    at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:641)
    at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:188)
    at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:526)
    ... 23 more
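The NoClassDefFoundError suggests the GraalVM SDK classes (org.graalvm.nativeimage.ImageInfo) were left out of the runnable jar in this snapshot. Until the jar is fixed, a possible workaround is to supply the SDK jar on the classpath yourself; the artifact version below is illustrative, and the main class is taken from the stack trace:

```shell
# Assumption: graal-sdk-<version>.jar (org.graalvm.sdk:graal-sdk) provides the
# missing ImageInfo class. Launching the main class directly bypasses the
# jar's manifest:
java -cp dns-proxy-server.jar:graal-sdk-23.1.0.jar com.mageddo.dnsproxyserver.App
```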
mageddo commented 6 days ago

Alright, I will have to try it out.

mageddo commented 2 days ago

Hey @pschiffe, can you check whether DPS 3.24.0-snapshot fixes your use case? The jar should also be working again.

pschiffe commented 1 day ago

Hi @mageddo, awesome job! This seems to be fixed in all my use cases. Here's what I've tested (with 3.24.0-snapshot):

It's perfect, thank you!

Regarding the glibc version: it's usually not possible to upgrade glibc within a distribution. RHEL 8 and its derivatives currently ship glibc-2.28 and will be supported until 2029. Anything else still supported is probably running newer versions, so you don't need to target anything older than that. For now, supporting glibc-2.28 and newer is perfect.
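A common way to meet that bar is to link the release binary on the oldest glibc you intend to support, for example by building inside a RHEL 8-era container. This is only a sketch of the idea; the image choice and the `./build-native.sh` script are assumptions, not DPS's actual pipeline:

```shell
# Building inside Rocky Linux 8 links against glibc 2.28, so the resulting
# binary runs on any distro shipping glibc >= 2.28.
docker run --rm -v "$PWD:/work" -w /work rockylinux:8 \
  bash -c 'dnf install -y gcc glibc-devel && ./build-native.sh'
```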

This issue can be closed from my side, thanks again.

mageddo commented 1 day ago

Cheers, thanks for your help.