dbhi / qus

qemu-user-static (qus) and containers, non-invasive minimal working setups
https://dbhi.github.io/qus

Better documentation on how does it work #25

Open ElDavoo opened 5 months ago

ElDavoo commented 5 months ago

Hello, I'm trying to understand how the mechanism works, but I'm stuck.
My ultimate goal is to use box64 instead of qemu to get more performance on my Raspi 4, so I'd like to understand how qus registers qemu, but there is one thing I can't figure out.

root@host:/# cat /proc/sys/fs/binfmt_misc/qemu-x86_64 
enabled
interpreter /qus/bin/qemu-x86_64-static
flags: F
offset 0
magic 7f454c4602010100000000000000000002003e00
mask fffffffffffefe00fffffffffffffffffeffffff

Where is the binary?
1) It's not in the host:

root@host:/# ls -l /qus
ls: cannot access '/qus': No such file or directory

2) It's not in the final emulated container:

pi@host:~ $ DOCKER_DEFAULT_PLATFORM=amd64 docker run -it --rm ubuntu ls -l /qus
/usr/bin/ls: cannot access '/qus': No such file or directory

docker image ls -a does not show anything related to qus.
I found the binary in docker internal folders:

root@host:/# find -mount -type f -name qemu-x86_64-static
./var/lib/docker/volumes/ff09c47b39161ce62a81f38fd8fc0ff10da5abf09b81dfeb82245d368c433b90/_data/bin/qemu-x86_64-static
./var/lib/docker/volumes/1329b9dcf2cf270f77b42c5a9360625e6babbca52cdba51ddac87d17c3a670c1/_data/bin/qemu-x86_64-static
./var/lib/docker/overlay2/16bd99024d081358190427da12fa97740fc5890b8b600ac24bea3c873da94ab3/diff/qus/bin/qemu-x86_64-static

I tried creating a container and inspecting it, but no volumes are used and I couldn't see anything relevant.

So I do not understand: how does docker "apply" /qus/bin/qemu-x86_64-static in such a way that the binfmt_misc system can read from that path, while the file apparently does not exist anywhere? Because (supposedly) the kernel does not know anything about docker, it should just try to read it from the host path (?). Or, if it tries to read it from the namespaced path, then the emulator should exist inside the container.

What am I missing?

(How can I make docker use my binary (box64) to transparently run x86_64 containers on arm64, like qus does?)

umarcor commented 5 months ago

What am I missing?

The point you are missing is that binfmt_misc can register an interpreter so that the kernel opens the binary at registration time and keeps it available afterwards. This is the F (fix binary) flag, shown as "flags: F" in your output; QEMU's registration script enables it with -p (persistent). In that case, the binary is not required to exist at that path later. So, a binary must be available when the interpreter is registered, but afterwards any other binary running on the host or within any container can have its instructions translated without access to the interpreter on its own filesystem, because the kernel keeps the already-opened interpreter until the entry is removed or the system is restarted.
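As a quick check of the above, the following sketch reads a binfmt_misc entry and reports whether it was registered with the F flag. The function name has_fix_binary_flag is made up for illustration; the entry path is the one from the question and may not exist on other systems.

```shell
#!/usr/bin/env bash
# Sketch: detect whether a binfmt_misc entry carries the F (fix binary) flag.
# has_fix_binary_flag is a hypothetical helper, not part of qus or QEMU.

# Reads the entry text on stdin; succeeds if the "flags:" line contains F.
has_fix_binary_flag() {
  local line
  while IFS= read -r line; do
    case "$line" in
      flags:*F*) return 0 ;;
    esac
  done
  return 1
}

# Entry name taken from the question; adjust it for other architectures.
entry=/proc/sys/fs/binfmt_misc/qemu-x86_64
if [ -r "$entry" ]; then
  if has_fix_binary_flag < "$entry"; then
    echo "$entry: interpreter was opened at registration time (F flag set)"
  else
    echo "$entry: interpreter is looked up by path on each execution"
  fi
fi
```

When the F flag is absent, the interpreter path is resolved at execution time, so the binary would have to be present in the filesystem seen by the process being executed.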

This repository showcases multiple ways to register QEMU interpreters; most of them use the -p (persistent) option. See section Tests of the documentation; see also https://dbhi.github.io/qus/faq.html#do-i-need-to-install-qemu-static-on-the-host-even-though-it-is-only-needed-by-the-containers. As you can see, examples f, F, c, C, v, V, r, R, s, S, h and H register the interpreter using a binary available on the host, while examples i, I, d and D register the interpreter using a binary available in the container. So, when your interpreter is shown as "interpreter /qus/bin/qemu-x86_64-static", that means it was registered using a "qus" container, and the binary existed at that location within that container when it was run. It also means it was registered persistently because, as you already guessed, that binary does not exist on the host and the container where it existed was removed.

Find further references in https://dbhi.github.io/qus/references.html, e.g. https://lwn.net/Articles/679308/ and https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=948b701a607f123df92ed29084413e5dd8cda2ed.

My ultimate goal is to use box64 instead of qemu to get more performance on my Raspi 4, so I'd like to understand how qus registers qemu

See https://github.com/qemu/qemu/blob/master/scripts/qemu-binfmt-conf.sh. That's the script used to register QEMU interpreters through binfmt_misc. In essence, it just writes a magic string to /proc/sys/fs/binfmt_misc/register (see https://github.com/qemu/qemu/blob/master/scripts/qemu-binfmt-conf.sh#L273-L291). Note that this repo does not use the upstream shell script but a variant: https://github.com/umarcor/qemu/blob/series-qemu-binfmt-conf/scripts/qemu-binfmt-conf.sh; that should not be relevant for your goal, though.
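To make the "magic string" concrete, here is a minimal sketch that assembles a registration line in the ":name:type:offset:magic:mask:interpreter:flags" format from the magic and mask shown in the question. The helper names, the entry name box64-x86_64 and the path /usr/local/bin/box64 are assumptions for illustration only; this is not the code from either script linked above.

```shell
#!/usr/bin/env bash
# Sketch: build a binfmt_misc registration string from hex magic/mask values.
# hex_to_escaped and build_binfmt_entry are hypothetical helper names.

# Convert plain hex ("7f45") into \x-escaped bytes ("\x7f\x45"),
# which is the form accepted for the magic and mask fields.
hex_to_escaped() {
  local hex="$1" out="" i
  for ((i = 0; i < ${#hex}; i += 2)); do
    out+="\\x${hex:i:2}"
  done
  printf '%s\n' "$out"
}

# Assemble ":name:M:offset:magic:mask:interpreter:flags" (empty offset means 0).
build_binfmt_entry() {
  local name="$1" magic_hex="$2" mask_hex="$3" interpreter="$4" flags="$5"
  printf ':%s:M::%s:%s:%s:%s\n' "$name" \
    "$(hex_to_escaped "$magic_hex")" "$(hex_to_escaped "$mask_hex")" \
    "$interpreter" "$flags"
}

# Magic and mask copied from the qemu-x86_64 entry shown in the question.
MAGIC=7f454c4602010100000000000000000002003e00
MASK=fffffffffffefe00fffffffffffffffffeffffff

# Assumed interpreter path and entry name; F asks the kernel to open the
# interpreter at registration time.
entry_line="$(build_binfmt_entry box64-x86_64 "$MAGIC" "$MASK" /usr/local/bin/box64 F)"
printf '%s\n' "$entry_line"

# Registering requires root and is left commented out on purpose:
# echo -1 > /proc/sys/fs/binfmt_misc/qemu-x86_64   # remove the existing entry first
# printf '%s\n' "$entry_line" > /proc/sys/fs/binfmt_misc/register
```

One caveat, stated as an assumption rather than a tested result: the F flag only keeps the interpreter binary itself open, so an interpreter that is dynamically linked may still fail inside a container that lacks its loader and libraries. QEMU avoids this by being built statically.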