containers / podman

Podman: A tool for managing OCI containers and pods.
https://podman.io
Apache License 2.0

Support Qemu accelerated podman machine on Windows hosts #13006

Open arixmkii opened 2 years ago

arixmkii commented 2 years ago

/kind feature

Description

Support Qemu accelerated podman machine on Windows hosts.

I'm pretty new to podman machine, so I might not be aware of other external components needed to run it.

I saw that support for running podman machine in WSL2 is being merged for 4.0.0 RC2. I think this idea still makes sense nonetheless.

The downside compared to the WSL2 option would be less efficient host resource management, but in some cases that is not a concern.

What is missing right now is the command line support for choosing between multiple available vendors for podman machine on a single platform, which I don't think is a thing already (at least from my studying of the source tree).

rhatdan commented 2 years ago

I am fine with this, but we need community to work on it. If you want to implement this and it is simply a CLI change, I would be all for it.

afbjorklund commented 2 years ago

@arixmkii

The main problem was the lack of unix sockets for the Windows build. Then there were a lot of other "little things" to fix.

Here is some old code, if you want to continue with it: https://github.com/afbjorklund/podman/commits/machine-windows

Things like missing "xzcat.exe" and missing $HOME, and so on and so forth.

The official Podman support would be for WSL2, so it (Windows) was abandoned.

Like you say, qemu itself should support windows and co-exist with Hyper-V.

arixmkii commented 2 years ago

@afbjorklund Thank you for your input! I am interested in at least trying to make it work, but I need to study the code in more detail, which will take time.

> The official Podman support would be for WSL2

Feel free to close this issue then with this resolution, it could be reopened/recreated later if there is some actual work to base it on.

afbjorklund commented 2 years ago

There was windows support in podman-machine (v1) but it mostly used VirtualBox so the QEMU driver was bare-bones...

It should be possible to use "fifo files" instead of "unix sockets", but it will take some adaptation of the Go code and the qemu flags.

albertdb commented 2 years ago

This would be really nice to have. Currently the only alternative, where WSL2 is not feasible, is using Minikube, which at the current version includes Podman v2.

afbjorklund commented 2 years ago

You could also DIY, using something like Vagrant for the provisioning. But otherwise I think "everyone" is using WSL2 now.

https://docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/about/supported-guest-os

arixmkii commented 2 years ago

> The main problem was the lack of unix sockets

Go has built in support for unix sockets on windows https://go-review.googlesource.com/c/go/+/125456/ which I expect could work just fine for gvproxy and podman binaries.

Mingw added afunix.h last summer (pretty fresh, so it will probably have to be built from source to get it into the toolchain): https://github.com/mingw-w64/mingw-w64/blob/43e87a27fdb97ae562f7c1e2017c8ce58fef9ee1/mingw-w64-headers/include/afunix.h I believe the Qemu sources could then be patched to allow unix sockets in the Windows build, at least behind a conditional switch.

Windows lacks datagram support for unix sockets, but I can't yet say whether this will be a blocker.
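To illustrate the point about Go's built-in unix socket support, here is a minimal sketch of a stream-type unix socket round trip using only the standard `net` package. It is not code from podman or gvproxy; the socket path and the `roundTrip` helper are made up for this example, and whether it behaves identically on a given Windows build is an assumption to verify.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"os"
	"path/filepath"
)

// roundTrip starts a one-shot echo server on a stream-type unix socket,
// sends msg through it and returns what comes back.
// (Hypothetical helper, for illustration only.)
func roundTrip(msg string) (string, error) {
	dir, err := os.MkdirTemp("", "afunix-demo")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	sock := filepath.Join(dir, "demo.sock") // hypothetical socket path

	// "unix" is the stream flavour. "unixgram" would be the datagram
	// flavour, which the comment above reports as missing on Windows.
	ln, err := net.Listen("unix", sock)
	if err != nil {
		return "", err
	}
	defer ln.Close()

	go func() {
		conn, err := ln.Accept()
		if err != nil {
			return
		}
		defer conn.Close()
		io.Copy(conn, conn) // echo everything back to the sender
	}()

	conn, err := net.Dial("unix", sock)
	if err != nil {
		return "", err
	}
	defer conn.Close()

	if _, err := conn.Write([]byte(msg)); err != nil {
		return "", err
	}
	buf := make([]byte, len(msg))
	if _, err := io.ReadFull(conn, buf); err != nil {
		return "", err
	}
	return string(buf), nil
}

func main() {
	reply, err := roundTrip("ping")
	if err != nil {
		fmt.Println("round trip failed:", err)
		return
	}
	fmt.Println("echoed:", reply)
}
```

The sketch deliberately avoids passing file descriptors around, since that is the part a later comment in this thread describes as limited on Windows.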

> missing "xzcat.exe"

Download the latest Windows build of XZ Utils from https://tukaani.org/xz/ and copy or rename xz.exe to xzcat.exe; it will then behave as xzcat, because it checks the first command-line argument (the program name) to choose its behavior. Symlinks would be more natural, but this way it works on Windows. I checked the SHA256 of the processed qcow2.xz on Windows and macOS hosts.
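The "behaviour depends on the name the binary was started under" trick can be sketched in Go as follows. This is only an illustration of the general multi-call pattern, not the actual xz source; the `modeFor` function and the mode labels are invented for the example.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// modeFor maps the name a program was invoked under to a behaviour.
// The returned labels are illustrative, not xz internals.
func modeFor(argv0 string) string {
	name := argv0
	// Drop any directory part. Both separators are accepted so that
	// Windows-style paths are handled on any host.
	if i := strings.LastIndexAny(name, `/\`); i >= 0 {
		name = name[i+1:]
	}
	name = strings.ToLower(name)
	name = strings.TrimSuffix(name, ".exe")

	switch name {
	case "xzcat":
		return "decompress-to-stdout"
	case "unxz":
		return "decompress"
	default:
		return "compress"
	}
}

func main() {
	// os.Args[0] is the name this binary was started as, so a copy
	// renamed to xzcat.exe would select the xzcat behaviour.
	fmt.Println(modeFor(os.Args[0]))
}
```

This is why a plain copy works where symlinks are awkward: the decision depends only on the file name, not on how the file came to have it.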

github-actions[bot] commented 2 years ago

A friendly reminder that this issue had no activity for 30 days.

arixmkii commented 2 years ago

There is a patch submitted to QEMU for enabling builds with af_unix support on Windows https://lists.gnu.org/archive/html/qemu-devel/2022-08/msg00221.html

arixmkii commented 2 years ago

This also looks useful: https://lists.gnu.org/archive/html/qemu-devel/2022-07/msg04098.html, as the functionality of file descriptors for unix sockets on Windows seems to be limited in the Go language/runtime (probably due to platform limitations).

It will be needed in order to replace the gvproxy qemu_wrapper pattern used by podman.

arixmkii commented 2 years ago

Current progress:

Nginx arm version running on 64-bit Win 11. Works for core and root users. The root version allows forwarding privileged ports to the host (80 -> 80 for nginx). [Screenshot, 2022-08-18]

Using

Had to switch it to cni as netavark doesn't work for me for some reason. Don't have enough knowledge to troubleshoot netavark issues now, but will look into it later.

arixmkii commented 2 years ago

Noticed that there is some work for Apple virtualization for Apple Silicon in the source tree, but didn't manage to find any code that allows switching between virtualization providers (Apple vs Qemu) for this platform. Wanted to use something ready for my code, but it seems that this part is not ready yet.

rhatdan commented 2 years ago

Yes it is planned for the future.

gbraad commented 2 years ago

@rhatdan Please note that this is not something we can easily deliver as an MSI install, as this relies on a third-party build of Qemu. The whpx support is not a standard build flag, and just like WSL2 relies on Hyper-V. Why not add Hyper-V support instead?

@arixmkii we are currently looking at vfkit as was added to crc. @cfergeau and @baude are working on this.

arixmkii commented 2 years ago

> as this relies on a third-party build of Qemu

This is temporary. It is built from master with only single patchset applied (where 1 of 4 patches is already in master, so, I believe it is on its way in).

> The whpx support is not a standard build flag

It looks like it is default at least for msys/cygwin builds. I didn't try cross compile. And it is enabled in windows builds advertised on Qemu home page.

> WSL2 relies on Hyper-V

It should be possible to use Intel HAXM as well. I have plans for this experiment.

> Why not add Hyper-V support instead?

Hyper-V is already supported in a form of WSL2, but it relies on modified kernel and can't run coreos due to some limitations (systemd? cgroups?).
Raw Hyper-V is an interesting topic to investigate, but the point here was to bring as similar as possible "podman machine" experience to all major platforms - Windows, MacOS, Linux (I haven't tested Qemu machine on Linux, but from code perspective it is there).

@gbraad Thank you for the heads-up on vfkit! Nice to see something friendlier than vf API. My interest here was if there is already a CLI support for having multiple providers for the same target and it seems there is nothing yet (not a blocker right now).

gbraad commented 2 years ago

We would have to maintain our own builds of Qemu, just like we do for macOS. Otherwise it would be hard to provide 'support' or any form of consistent behaviour. For macOS we keep a build in the containers organization. Perhaps the same is needed for Windows.

gbraad commented 2 years ago

HyperV and WSL2 work quite differently. For WSL2 we use a prepared image that is in the podman-WSL2-Fedora subproject, but this is a modified container image. For HyperV we would have a full VM and more control over the network stack and usage, like we do for CRC. More differences exist, like 9P stack, shared network stack, etc

Note: Qemu on Windows, even with WHPX, performs slower.

Note 2: we have used haxm too in tests, but this conflicts with other hypervisors (like having WSL2 enabled). In the end it did not feel much better

arixmkii commented 2 years ago

> We would have to maintain our own builds of Qemu

Oh. Didn't know that. I only used homebrew versions. Now I realized that it is formulae and they are doing rebuilds. Didn't check the official installer package.

I believe Google/Android studio is doing something like this with their Qemu based emulator.

> HyperV and WSL2 work quite differently.

No doubt here. Added this comment for the completeness of the context in our discussion.

> Qemu on Windows, even with WHPX, performs slower

Haven't compared them side by side, but sounds logical. I don't consider this as production load runner, but development assistant workflow mostly. Also, Qemu allows to run and build arm64 containers on amd64 (not for every workload of course).

arixmkii commented 2 years ago

Another nice-to-have patch set for qemu: 9pfs/virtfs on Windows hosts https://lists.gnu.org/archive/html/qemu-devel/2022-04/msg04075.html

arixmkii commented 2 years ago

Offtopic a bit, but would appreciate any pointers, where I can find more details for netavark and how to troubleshoot it. 😅

arixmkii commented 2 years ago

Collected my changes into single commit https://github.com/arixmkii/podman/commit/cc5c8d990c4088a92e5db79308d7bffd2ffeb2f6

Will try to upstream at least some parts of it, because some of the changes are not bound to this issue.

mheon commented 2 years ago

@arixmkii What kind of details on netavark? Documentation is a little minimal, but that's something we'd like to fix.

arixmkii commented 2 years ago

@mheon I don't know yet 😓 I have these for now Last debug message from netavark

[DEBUG netavark::network::core] Container veth mac:

And then teardown procedure of podman run command resulting in this output

Error: netavark: : EOF

EOF I guess is connection interruption on app crash.

I will compare it with the output on MacOS (where it works just fine) and then will look at the sources and workflow of netavark for the potential culprit.

For now I'm just very satisfied that CNI worked for my experiment 🙂

Update: Discovered that trace level gives out the full command, this is something to play with.

arixmkii commented 2 years ago

Found in journald logs that netavark failed with crash dump due to missing CPU extensions in VM. Switched Qemu CPU to -cpu Skylake-Server-v5 and it resolved the issue with netavark.

Need to figure out why -cpu host is not available on windows. Tried -cpu max, but VM crashed - probably due to my machine specifics.

So, I would say that now I have working Qemu machine on Windows. Plan to create v2 branch, when currently created PRs are merged and hopefully some more registered issues resolved (to make changes even more minimal).

Update: -cpu Skylake-Client-v4 also works. Probably this one is a safer option.

n1hility commented 2 years ago

> Qemu on Windows, even with WHPX, performs slower
>
> Haven't compared them side by side, but sounds logical. I don't consider this as production load runner, but development assistant workflow mostly. Also, Qemu allows to run and build arm64 containers on amd64 (not for every workload of course).

In this case, are you referring to qemu-user-static to run mixed arch containers? If so we can support that on any virt backend technology. IIRC it's already shipped in latest FCOS.

Or do you mean using a full system non-virt emulation (e.g. TCG) with qemu? IMO former is better than the latter.

While QEMU support on Win is interesting, I think the ideal long-term solution for a pure virt alternative on Windows is likely Hyper-V. Primarily for the reasons that @gbraad mentioned: it's included out of the box, and it is already well integrated with the host OS.

arixmkii commented 2 years ago

> In this case, are you referring to qemu-user-static to run mixed arch containers? If so we can support that on any virt backend technology. IIRC it's already shipped in latest FCOS.

That's great. So any hypervisor running full FCOS will do the trick? Is this correct?

> full system non-virt emulation (e.g. TCG)

Having this option (with instructions on how to acquire the needed qcow image, adjust the command line in the JSON file, etc.) might have occasional uses, at least with buildah (sometimes a foreign architecture just doesn't work for specific apps, and with buildah performance can sometimes be traded away). But this is an "advanced" topic, not something that should be aimed for (at least for now).

> While QEMU support on Win is interesting

I got the experience on Win I expected. And the changeset is not that big, so, I will try to upstream as many parts as possible and potentially setup a separate repo, where I can do rebuilds for Qemu with patches, podman with Qemu support on windows, etc.

> While QEMU support on Win is interesting

I thought that it makes sense, because there is a possibility to get it working with only minimal changes. Also, it looks like Qemu will still be used for podman machine on Linux and would be kept around at least for as long as Intel Macs are supported (I don't know if there are plans to remove the Darwin support after the vfkit version is ready for Apple Silicon Macs).

> I think the ideal long-term solution for a pure virt alternative on Windows is likely Hyper-V.

Sounds very reasonable, but this is a significantly different effort.

arixmkii commented 2 years ago

Commit representing V2 version: https://github.com/arixmkii/podman/commit/27b93243c019ac437a7e414b574407d99070c6d2

Update 1: And V3 is here, which enables switching provider with EnvVar: https://github.com/arixmkii/podman/commit/cbdc0be2ba89fda033a9382e0ac7937a01c54b49

Update 2: V4 minimization https://github.com/arixmkii/podman/commit/80bccf2861b3bc7a230746f57a42262b1f8862f4
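Selecting a machine provider through an environment variable, as V3 does, could look roughly like the sketch below. It is not taken from the linked commits: the variable name `EXAMPLE_MACHINE_PROVIDER`, the provider names and the `resolveProvider` helper are all hypothetical and only show the shape of the lookup (default when unset, error on unknown values).

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// providerEnvVar is a hypothetical variable name used only in this sketch.
const providerEnvVar = "EXAMPLE_MACHINE_PROVIDER"

// resolveProvider returns the requested provider, or def when value is
// empty. Values outside the supported list are reported as an error
// instead of silently falling back.
func resolveProvider(value, def string, supported []string) (string, error) {
	v := strings.ToLower(strings.TrimSpace(value))
	if v == "" {
		return def, nil
	}
	for _, s := range supported {
		if v == s {
			return v, nil
		}
	}
	return "", fmt.Errorf("unsupported machine provider %q (supported: %s)",
		value, strings.Join(supported, ", "))
}

func main() {
	// Assumed provider names for a Windows host; illustrative only.
	supported := []string{"wsl", "qemu"}

	p, err := resolveProvider(os.Getenv(providerEnvVar), "wsl", supported)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("using provider:", p)
}
```

Keeping the lookup in one small function makes it easy to swap the environment variable for a CLI flag or a config file entry later, which is the open design question raised in this thread.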

arixmkii commented 2 years ago

Commit with V5: https://github.com/arixmkii/podman/commit/5a68fa54caba6b0563d1f35235d2c0ee1e6dda57

Also created this project QCW (Qemu Containers for Windows) https://github.com/arixmkii/qcw Where I set up workflow to build patched version of all required projects and pack them as a ready to test bundle. Now it includes

Missing features:

arixmkii commented 2 years ago

> -cpu Skylake-Client also works. Probably this one is a safer option.

Managed to start QEMU with the Hyper-V accelerator and -cpu max,vmx=off, but then it fails with a kernel panic when init is launched. I will need to study how to investigate kernel panics and which flags need to be adjusted. This is on an 11th Gen mobile Intel CPU.

As to -cpu host - it is only supported for KVM and HVF, not with other hypervisors.
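The CPU model findings in this thread can be summarised as a small selection rule. The Go sketch below is an illustration under assumptions, not podman code: `cpuModelFor` and `qemuArgs` are hypothetical helpers, and the fallback model is simply the one reported as working on one machine above, so it may not suit other hosts.

```go
package main

import (
	"fmt"
	"strings"
)

// cpuModelFor picks a value for QEMU's -cpu option based on the accelerator.
// "host" is used only for KVM and HVF, following the comment above.
// The fallback is the model reported as working earlier in this thread;
// treating it as a general default is an assumption.
func cpuModelFor(accel string) string {
	switch strings.ToLower(accel) {
	case "kvm", "hvf":
		return "host"
	default:
		return "Skylake-Client-v4"
	}
}

// qemuArgs builds only the accelerator and CPU part of a QEMU command line.
func qemuArgs(accel string) []string {
	return []string{"-accel", accel, "-cpu", cpuModelFor(accel)}
}

func main() {
	for _, accel := range []string{"kvm", "hvf", "whpx"} {
		fmt.Println(strings.Join(qemuArgs(accel), " "))
	}
}
```

A named model keeps the guest's feature set predictable across hosts, at the cost of hiding CPU extensions the host may actually have.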

arixmkii commented 2 years ago

Managed to use it with Podman Desktop by adding a gocat (modified version) relay from an npipe (gocat listening on it) to the unix socket (gocat sending to it).

n1hility commented 2 years ago

@arixmkii FYI gvproxy has support for windows pipes and should accept npipe:// URIs in the forwarding params. If it doesn't it should be a simple fix to the parameter parsing, as the backend ssh forwarder supports it, and is what is used by win-sshproxy.exe.

arixmkii commented 2 years ago

Thank you, @n1hility !

Will check it out. I still want to keep the unix socket on Windows, because it allows some additional functionality, like using curl to communicate with the API. If gvproxy is capable of forwarding the API to both a unix socket and an npipe (multiple forwardings), or if gvproxy or winsshproxy is capable of a simple relay, this would work and eliminate the need for an additional app. For now I just wanted a POC that Podman Desktop works.

A bit offtopic on unix sockets in QEMU on Windows: it seems that this patch series is going to be merged soon https://lists.gnu.org/archive/html/qemu-devel/2022-09/msg00041.html

n1hility commented 2 years ago

> Thank you, @n1hility !

you’re welcome!

> Will check it out. I still want to keep the unix socket on Windows, because it allows some additional functionality, like using curl to communicate with the API. If gvproxy is capable of forwarding the API to both a unix socket and an npipe (multiple forwardings), or if gvproxy or winsshproxy is capable of a simple relay, this would work and eliminate the need for an additional app.

Yes it does support multiple forwards and the unix socket, you just repeat the arguments for each connection you want.

> For now I just wanted POC that Podman Desktop works.

Sure makes sense.

> A bit offtopic on unix sockets in QEMU on Windows: it seems that this patch series is going to be merged soon https://lists.gnu.org/archive/html/qemu-devel/2022-09/msg00041.html

That’s great!
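For the forwarding targets discussed above, telling a unix socket target from a Windows named pipe target by URI scheme can be sketched with the standard `net/url` package. This is not gvproxy's actual parameter parsing; the `parseEndpoint` helper, the example paths and the exact `npipe:////./pipe/...` spelling are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"net/url"
)

// endpoint is a forwarding target reduced to its kind and local path.
type endpoint struct {
	Kind string // "unix" or "npipe"
	Path string
}

// parseEndpoint accepts unix:// and npipe:// style URIs and rejects
// everything else. (Hypothetical helper, for illustration only.)
func parseEndpoint(raw string) (endpoint, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return endpoint{}, err
	}
	switch u.Scheme {
	case "unix", "npipe":
		if u.Path == "" {
			return endpoint{}, fmt.Errorf("missing path in %q", raw)
		}
		return endpoint{Kind: u.Scheme, Path: u.Path}, nil
	default:
		return endpoint{}, fmt.Errorf("unsupported scheme %q in %q", u.Scheme, raw)
	}
}

func main() {
	// Several targets for the same API, mirroring the idea of repeating
	// the forwarding arguments once per connection. Paths are made up.
	targets := []string{
		"unix:///run/example-podman.sock",
		"npipe:////./pipe/example-podman",
	}
	for _, t := range targets {
		ep, err := parseEndpoint(t)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> %s\n", ep.Kind, ep.Path)
	}
}
```

With both targets accepted, a curl-style client could keep using the unix socket while Podman Desktop talks to the named pipe, without a separate relay process.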

arixmkii commented 2 years ago

Updated version could be found here https://github.com/arixmkii/podman/commit/6778b034fc4c7f59bc121e85a4078d59a4c00472

Changes:

Published as zip packaged test bundle under https://github.com/arixmkii/qcw/releases/tag/v0.0.4-alpha

List of updates from QEMU side:

arixmkii commented 2 years ago

9pfs series of patches was resubmitted for latest qemu. Here is the screenshot accessing host FS from container (for now R/O mode only). Also, for now I had to modify config JSON directly to make it work.

[Screenshot, 2022-10-24: host file system accessed from a container]

Another update is that unix socket netdev support passed the review and has been queued for inclusion in QEMU 7.2.0

I will try to work on the following topics, when I have time:

Ultimate goal would be to have the first PR draft for this feature, when official QEMU 7.2.0 RC releases appear.

Updated:

Uploaded current test builds: https://github.com/arixmkii/qcw/releases/tag/v0.0.5-alpha

arixmkii commented 2 years ago

https://lists.gnu.org/archive/html/qemu-devel/2022-10/msg05215.html

> This feature won't make it into 7.2 release anyway, so patience please. ;-)

We will not see enabled 9pfs on Windows with QEMU 7.2 😞

arixmkii commented 1 year ago

> Another update is that unix socket netdev support passed the review and has been queued for inclusion in QEMU 7.2.0

Was merged to master and will be available with 7.2.0. So, the only missing part will be 9pfs on Windows for 7.2.0.

arixmkii commented 1 year ago

Successfully tested my current build against official QEMU 7.2.0-rc0 release from https://qemu.weilnetz.de/w64/2022/

Plan to return to preparing more PRs in 2 weeks. 9pfs experiments on hold for now due to lack of time.

rhatdan commented 1 year ago

@arixmkii any update?

arixmkii commented 1 year ago

@rhatdan I started making additional changes to allow building installer packages with QEMU support. This is needed before creating a PR, so that anyone interested has a straightforward way to check how it works. I hope to come up with a draft PR this weekend, where I can discuss with the podman team the bits of changed code that do not fit perfectly with the current codebase.

arixmkii commented 1 year ago

@rhatdan I created a draft PR for this https://github.com/containers/podman/pull/16872 Marked as draft, because I expect that there will be some amount of comments and questions from development team. Especially about utilizing environment variables to select the machine provider.

arixmkii commented 1 year ago

The very first user friendly release (packaged into installers and more compact) https://github.com/arixmkii/qcw/releases/tag/v0.0.7

It should not be considered production ready, but can be used for testing or evaluation purposes.

P.S. QEMU installer (built with 9pfs on Windows) is now being built in the CI and will be added later.

P.P.S. Installers are unsigned of course, so, they might not work on some machines, where policies for only signed software are enforced.

arixmkii commented 1 year ago

Worked with the developers of 9p Windows support for QEMU. We tracked down 3 bugs in the current implementation. There is a hotfix ready for testing. Will try to publish a build with FS mount support soon.

arixmkii commented 1 year ago

Example of a QEMU Podman machine with 9p on Windows

Creating the machine

podman machine init --image-path testing --username core --cpus 4 --memory 8192 -v C:\Temp\PodmanStorage;/home/core/ps;rw -v C:\Temp\PodmanReadonly;/home/core/pr;ro

Here I modified the command line argument parsers to use os.PathListSeparator (';' on the Windows platform) to simplify usage for demo purposes. I will create a separate issue to work on a parser that would be acceptable for mainlining. Two directories are mounted: one RW to /home/core/ps and another RO to /home/core/pr (now I understand that this naming is bad for the demo, but let it stay).
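Why a different separator helps on Windows is easiest to see in code: a drive letter such as `C:` already contains a colon, so splitting a volume spec on ':' would cut the source path in two. The Go sketch below is not the modified podman parser; `parseVolume`, the `mount` type and the read-write default are assumptions for illustration.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// mount is one volume specification broken into its parts.
type mount struct {
	Source   string
	Target   string
	ReadOnly bool
}

// parseVolume splits "source<sep>target[<sep>ro|rw]".
// A missing mode is treated as read-write (an assumption of this sketch).
func parseVolume(spec, sep string) (mount, error) {
	parts := strings.Split(spec, sep)
	if len(parts) < 2 || len(parts) > 3 || parts[0] == "" || parts[1] == "" {
		return mount{}, fmt.Errorf("invalid volume spec %q", spec)
	}
	m := mount{Source: parts[0], Target: parts[1]}
	if len(parts) == 3 {
		switch parts[2] {
		case "ro":
			m.ReadOnly = true
		case "rw":
			// read-write is already the default
		default:
			return mount{}, fmt.Errorf("invalid mode %q in %q", parts[2], spec)
		}
	}
	return m, nil
}

func main() {
	// os.PathListSeparator is ';' on Windows and ':' on Unix-like hosts.
	// The specs below pass ";" explicitly so the demo behaves the same
	// on any host.
	fmt.Printf("host list separator: %q\n", os.PathListSeparator)

	for _, spec := range []string{
		`C:\Temp\PodmanStorage;/home/core/ps;rw`,
		`C:\Temp\PodmanReadonly;/home/core/pr;ro`,
	} {
		m, err := parseVolume(spec, ";")
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%+v\n", m)
	}
}
```

A parser meant for mainlining would likely have to keep ':' working for existing users, which is presumably why this is left for a separate issue.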

Then start machine as normal (I did with debug logs just in case, but this should not matter)

Then launching interactive terminal inside container with both mounts

podman run -it --rm -v /home/core/ps:/mnt/write -v /home/core/pr:/mnt/read busybox

Work with readonly FS

### Using RO FS

~ # cd /mnt/read/
/mnt/read # cat ReadMe.txt
RO text
/mnt/read # echo "added" > ReadMe.txt
sh: can't create ReadMe.txt: Read-only file system
/mnt/read # echo "added" > Other.txt
sh: can't create Other.txt: Read-only file system

And then with the read-write FS

### Using RW FS

/mnt/read # cd /mnt/write/
/mnt/write # cat ModifyMe.txt
sample
/mnt/write # echo "new sample" >> ModifyMe.txt
/mnt/write # cat ModifyMe.txt
sample
new sample
### This is actually a bug
/mnt/write # echo "only sample" > ModifyMe.txt
sh: can't create ModifyMe.txt: Invalid argument
/mnt/write # echo "other sample" > Other.txt
/mnt/write # cat Other.txt
other sample
/mnt/write # rm Other.txt
/mnt/write # cat Other.txt
cat: can't open 'Other.txt': No such file or directory
/mnt/write # mkdir inner
/mnt/write # ls -l
total 0
-rw-rw-rw-    1 nobody   nobody          18 Jan 12 20:55 ModifyMe.txt
drwxrwxrwx    1 nobody   nobody           0 Jan 12 20:59 inner
/mnt/write # ls -l inner
total 0

The discovered bug with the permissions will be reported to the people working on the QEMU side. There are more known issues with the current implementation:

In summary, it generally works almost like on macOS. Admittedly, one can't really use -v "$HOME:$HOME" for the machine and then the magical -v "$PWD:$PWD" for the container when the working directory is under the user's home, but that was always more of a way of hiding the actual complexity than a real feature (though it looks like one).

Will prepare and publish demo builds (both QEMU and Podman) in the coming days.

arixmkii commented 1 year ago

Finally managed to publish pre-built packages https://github.com/arixmkii/qcw/releases/tag/v0.0.8 and updated README file to reflect recent changes.

arixmkii commented 1 year ago

I would say that now we have all the bricks ready and just need to put them together.

More in the pipeline: