Elektrobit / flake-pilot

Registration/Control utility for applications launched through a runtime-engine, e.g. containers
MIT License

VSock communication on firecracker side in sci #106

Status: Closed (m-kat closed this 1 year ago)

m-kat commented 1 year ago

Implements guest communication for #87

m-kat commented 1 year ago

@schaefi This implements sci communication. It took me some time, but I found this VSOCK communication library, which simplifies the whole thing a lot. On the pilot side we need to agree on one thing: when the command is running in resume mode, are we daemonising the firecracker instance or running it in the background in the current shell?

schaefi commented 1 year ago

> @schaefi This implements sci communication, it took me some time but I found this VSOCK communication library which simplifies the whole thing a lot.

Yes, looks great, thanks. Something to double check: does the static binary still work? I'm asking because sometimes new dependencies cause the static build to compile, but at runtime an opcode or symbol is missing.

> When it comes to the pilot side we need to agree on one when the command is running in resume mode, are we daemon-ising the firecracker instance or running in the background in the current shell?

I would daemonise it.

m-kat commented 1 year ago

The static build still works: the binary not only links, it also runs.

schaefi commented 1 year ago

@m-kat so I wanted to give this change a try and did the following setup:

flake-ctl firecracker pull --name leap --kis-image https://download.opensuse.org/repositories/home:/marcus.schaefer:/delta_containers/images/firecracker-basesystem.x86_64.tar.xz
flake-ctl firecracker register --vm leap --app /home/ms/mybash --target /bin/bash --overlay-size 20GiB

Next I compiled the new sci binary with your changes and put it into the rootfs:

cd /var/lib/firecracker/images/leap 
mount rootfs /mnt
cp sci /mnt/usr/sbin/sci
umount /mnt

Next I edited the flake: vi /usr/share/flakes/mybash.yaml and added:

boot_args:
        ...
        - sci_resume=1

Next I called /home/ms/mybash with export PILOT_DEBUG=1 set and got the following output (irrelevant data stripped):

....
{"vcpu_count":2,"mem_size_mib":4096},"vsock":{"guest_cid":3,"uds_path":"/run/sci_cmd_25371.sock"}}

...
[2023-06-12T21:03:30Z DEBUG sci::defaults] Binding guest CID 2 on port 52
...

Please note that I added the "Binding guest..." message to your code myself, to get some information before any connection is made. The system is now in the loop, waiting at:

match VsockListener::bind_with_cid_port(vsock::VMADDR_CID_HOST, defaults::VM_PORT) {
    Ok(listener) => {
        // main loop
        loop {
            match listener.accept() {
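
A note on the CID in the debug line above: in the Linux vsock address family, CID 2 is the well-known host address (VMADDR_CID_HOST), while the Firecracker configuration in this run assigns the guest CID 3. The thread does not confirm that this mismatch is the cause of the problem, so the following is only a minimal sketch of a sanity check, assuming a guest listener is expected to bind either to VMADDR_CID_ANY or to its own configured CID. `describe_cid` and `check_bind_cid` are hypothetical helpers and are not part of sci or of the vsock library.

```rust
// Well-known vsock context IDs as defined in <linux/vm_sockets.h>.
const VMADDR_CID_ANY: u32 = u32::MAX;
const VMADDR_CID_HYPERVISOR: u32 = 0;
const VMADDR_CID_LOCAL: u32 = 1;
const VMADDR_CID_HOST: u32 = 2;

/// Hypothetical helper: human-readable name for a CID, useful in debug logs.
fn describe_cid(cid: u32) -> String {
    match cid {
        VMADDR_CID_ANY => "any".to_string(),
        VMADDR_CID_HYPERVISOR => "hypervisor".to_string(),
        VMADDR_CID_LOCAL => "local".to_string(),
        VMADDR_CID_HOST => "host".to_string(),
        other => format!("guest {}", other),
    }
}

/// Hypothetical check (assumption: a guest listener should bind to
/// VMADDR_CID_ANY or to the CID the VMM assigned to this guest).
fn check_bind_cid(bind_cid: u32, guest_cid: u32) -> Result<(), String> {
    if bind_cid == VMADDR_CID_ANY || bind_cid == guest_cid {
        Ok(())
    } else {
        Err(format!(
            "binding to CID {} ({}) but the guest was configured with CID {}",
            bind_cid,
            describe_cid(bind_cid),
            guest_cid
        ))
    }
}
```

With the values from the output above, `check_bind_cid(2, 3)` returns an error that names CID 2 as the host address, whereas `check_bind_cid(VMADDR_CID_ANY, 3)` and `check_bind_cid(3, 3)` pass.
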

So far so good. Next I wanted to test the connection, so on my host I called:

sudo socat - UNIX-CONNECT:/run/sci_cmd_25371.sock
CONNECT 52

But nothing happens. I assume you got this working somehow, and I'm running out of ideas about what could be wrong.
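
For reference, the socat test above follows the host-initiated flow described in the Firecracker vsock documentation: the host connects to the Unix socket, sends `CONNECT <port>` followed by a newline, and Firecracker answers with `OK <host_port>` once the guest has accepted the connection. The same probe can be sketched in Rust with only the standard library. The helper names `connect_request`, `parse_ack` and `handshake` are hypothetical, and this is a test aid under those assumptions, not code from the pilot.

```rust
use std::io::{self, Read, Write};

/// Build the request line Firecracker expects on the vsock Unix socket.
fn connect_request(port: u32) -> String {
    format!("CONNECT {}\n", port)
}

/// Parse an acknowledgement of the form "OK <host_port>".
/// Returns None for anything else, including an empty reply.
fn parse_ack(line: &str) -> Option<u32> {
    line.trim_end().strip_prefix("OK ")?.parse().ok()
}

/// Send the CONNECT request and wait for the acknowledgement line.
/// Reads one byte at a time so no payload data after the newline is consumed.
fn handshake<S: Read + Write>(stream: &mut S, port: u32) -> io::Result<u32> {
    stream.write_all(connect_request(port).as_bytes())?;

    let mut reply = Vec::new();
    let mut byte = [0u8; 1];
    while stream.read(&mut byte)? == 1 {
        if byte[0] == b'\n' {
            break;
        }
        reply.push(byte[0]);
    }

    let line = String::from_utf8_lossy(&reply).into_owned();
    parse_ack(&line).ok_or_else(|| {
        io::Error::new(
            io::ErrorKind::InvalidData,
            format!("no acknowledgement from guest, got {:?}", line),
        )
    })
}
```

To repeat the test from this comment, one would open the socket with `std::os::unix::net::UnixStream::connect("/run/sci_cmd_25371.sock")`, ideally set a read timeout on it, and call `handshake(&mut stream, 52)`. An error or an empty reply would correspond to the "nothing happens" result seen with socat.
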

Thanks much

m-kat commented 1 year ago

@schaefi I'm currently looking into it. It seems that the vhost kernel module is not always loaded. I will let you know when I have solved it.

m-kat commented 1 year ago

@schaefi I have fixed some issues in the last commit and opened an additional PR for the CID fix. Once both PRs are merged, your test will work.