Closed: Great-Stone closed this 4 months ago
Hi @Great-Stone! I'm pretty sure the error "function not implemented" is coming from the kernel. You could verify this by running under strace (or similar) and checking to see if you get that error from one of the syscalls. When you built on Alpine under UTM, you may have had a different kernel build than is available on the target. Given that it's happening from fork/exec, I'd hypothesize that it's a missing cgroup-related kernel configuration option.
If you're building the kernel for the 5G router, you may want to diff the configuration between the router and the UTM VM to see if you can spot the missing component. If you're not building the kernel for the 5G router and/or can't update it for whatever reason, you unfortunately may be out of luck here without creating your own task driver that avoids anything cgroup related (or whatever the missing piece is).
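As a rough sketch of the config comparison suggested above: the helper below lists options that are enabled in a known-good kernel config but are not set identically in the target's config. The function name `missing_kconfig`, the file names, and the default `CGROUP` filter are illustrative assumptions, and cgroups are only a hypothesis at this point, so pass a different pattern if the trace points elsewhere.

```shell
# missing_kconfig WORKING_CONFIG TARGET_CONFIG [PATTERN]
# Print CONFIG_* options whose name contains PATTERN (default: CGROUP) that
# are enabled (=y or =m) in WORKING_CONFIG but have no identical line in
# TARGET_CONFIG. The function name and arguments are hypothetical.
missing_kconfig() {
  pattern="${3:-CGROUP}"
  grep -E "^CONFIG_[A-Za-z0-9_]*${pattern}[A-Za-z0-9_]*=[ym]\$" "$1" \
    | grep -vxFf "$2" \
    | LC_ALL=C sort
}

# Example usage (paths are placeholders). /proc/config.gz only exists when
# the kernel was built with CONFIG_IKCONFIG_PROC; otherwise the config has
# to come from the build tree.
#   zcat /proc/config.gz > router.config    # on the router
#   zcat /proc/config.gz > vm.config        # on the UTM VM
#   missing_kconfig vm.config router.config
```

An option reported here is not necessarily the cause; it only narrows down what to check against the failing syscall.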
Hi @tgross, thank you for letting us know about the problem. Could you please confirm what is required for cgroups (e.g. specific packages or settings)? I would like to pass this on to the person responsible for building the OS.
@Great-Stone as I noted above, a missing cgroups configuration value is only a hypothesis. You need to run Nomad under strace (or similar) to determine what the failing syscall is in order to further diagnose the problem.
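A minimal sketch of that diagnostic step, assuming strace is available on the router: capture a trace with `strace -f -o FILE` (follow child processes, write to a file), then pull out the syscalls that returned ENOSYS, which is the errno behind "function not implemented". The helper name `find_enosys` and the paths are hypothetical.

```shell
# find_enosys FILE
# Print the distinct syscall names in a strace log that returned ENOSYS
# ("Function not implemented"). Handles plain lines, lines prefixed with a
# PID (as written by `strace -f -o FILE`), and "<... name resumed>" lines.
find_enosys() {
  sed -nE \
    -e 's/^([0-9]+ +)?([a-z0-9_]+)\(.*ENOSYS.*/\2/p' \
    -e 's/^([0-9]+ +)?<\.\.\. ([a-z0-9_]+) resumed>.*ENOSYS.*/\2/p' \
    "$1" | LC_ALL=C sort -u
}

# Assumed capture step (the config path and output path are placeholders):
#   strace -f -o /tmp/nomad.strace nomad agent -config /etc/nomad.d
#   ... run the failing raw_exec job, then stop the agent ...
#   find_enosys /tmp/nomad.strace
```

Not every ENOSYS is fatal, since programs often probe for a newer syscall and fall back to an older one, so the interesting entry is the one closest to the failing fork/exec.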
Posting an update.
The machine's OS turned out to be OpenWrt 15.05.1, so I tried again, but it didn't work. Presumably this is a Linux environment issue, since it also doesn't work on older versions of RHEL.
mxrroot@5Gax:/# cat /etc/openwrt_version
15.05.1
mxrroot@5Gax:/# cat /etc/openwrt_release
DISTRIB_ID='OpenWrt'
DISTRIB_RELEASE='Chaos Calmer'
DISTRIB_REVISION='ca0ec2bfb0ef+r49254'
DISTRIB_CODENAME='chaos_calmer'
DISTRIB_TARGET='ipq/ipq807x_64'
DISTRIB_DESCRIPTION='OpenWrt Chaos Calmer 15.05.1'
DISTRIB_TAINTS='no-all busybox override'
From OpenWrt 18.06.0 onward, it works fine.
Ok, so this isn't Alpine Linux, it's OpenWrt. They both use musl libc but that doesn't make them the same distribution. OpenWrt uses buildroot. So whoever is configuring the kconfig for buildroot should be able to verify whether there's a missing kernel configuration value (possibly but not necessarily for cgroups), and the operator should be able to verify specifically what's happening with strace. This isn't a well-supported configuration, and not present in our official release builds, so without the user doing more diagnostics as I've described here on their end, there's very little we can do to help.
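One quick runtime check for the cgroup part of that hypothesis, sketched under the assumption that /proc is mounted on the router: a kernel built with cgroup support registers a `cgroup` (v1) and/or `cgroup2` (v2) filesystem type in /proc/filesystems. The helper name `cgroup_fs_types` is hypothetical, and the optional file argument exists only so the function can be pointed at a saved copy.

```shell
# cgroup_fs_types [FILE]
# List the cgroup filesystem types the kernel has registered, reading
# /proc/filesystems by default. "cgroup" is v1, "cgroup2" is v2.
# Empty output suggests the kernel was built without cgroup support.
cgroup_fs_types() {
  awk '$NF ~ /^cgroup2?$/ { print $NF }' "${1:-/proc/filesystems}" \
    | LC_ALL=C sort -u
}

# Example usage on the router:
#   cgroup_fs_types
#   cat /proc/cgroups    # lists the v1 controllers, if the file exists
```

This only shows whether cgroups are compiled in at all; it does not replace the strace run, which identifies the syscall that actually fails.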
Nomad version
Nomad v1.7.7-dev BuildDate 2024-04-02T19:24:51Z Revision cf25cf5cd5dbade28a38fc41398fc8b584f3b643
Operating system and Environment details
Alpine Linux for 5G router
Issue
Nomad itself runs fine, but raw_exec jobs fail. Since there are no binaries for Alpine Linux, I built Nomad myself and tested it. The 5G router runs a lighter version of Linux, albeit one based on Alpine Linux. The machine has very little storage (512MB), so building on it was not possible. I installed Alpine Linux arm64 with UTM on my MacBook (M2) and built from there. Here is the build procedure (I referenced the build from the nomad-enterprise git repo).
Reproduction steps
The command we built looks like this
Expected Result
On Alpine Linux on macOS (M2), where we ran the build, both Nomad itself and raw_exec jobs run fine. However, on the Linux running on the 5G router, running raw_exec causes an error. I would like to know whether I am missing something that is required on that Linux.
When I search the logs for related errors, I see many references to cgroups, so I am posting the cgroup-related listing first.
cgroups dir
Actual Result
Job file (if appropriate)
Nomad Server logs (if appropriate)
Nomad Client logs (if appropriate)
Running Log
Error Log