crc-org / crc

CRC is a tool to help you run containers. It manages a local OpenShift 4.x cluster, MicroShift, or a Podman VM optimized for testing and development purposes.
https://crc.dev
Apache License 2.0

[Spike] Can crc use krunkit? #4233

Closed · cfergeau closed this 2 months ago

cfergeau commented 5 months ago

krunkit is a drop-in replacement for vfkit from a command-line argument point of view. podman-machine can make use of it; see https://docs.google.com/document/d/1IZCWAY5zMHqd0YlbnpGtCe7HNeWKQNHi8RuhujAJmg0/edit for some details.

Since it has additional features compared to vfkit, it would be interesting to know if crc can make use of it.

In order to test krunkit + crc, a few steps come to mind:

cfergeau commented 5 months ago

One initial issue is https://github.com/containers/krunkit/issues/8 - krunkit is currently only available on Apple Silicon machines; it's not available for Intel-based Macs.

vyasgun commented 3 months ago

krunkit does not accept certain arguments, such as --kernel and --kernel-cmdline, which crc currently uses to start a vfkit machine. These arguments can be removed if the boot mode is changed to UEFI (the issue: https://github.com/crc-org/crc/issues/4180). Addressing this first.
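
For illustration, the difference between the two boot setups looks roughly like this; the paths are placeholders rather than the exact ones crc uses, and the EFI form is the one that ends up being used later in this thread:

# linux boot mode: the vfkit flags that krunkit rejects
vfkit --cpus 2 --memory 4096 \
    --kernel /path/to/vmlinuz \
    --initrd /path/to/initrd \
    --kernel-cmdline "..." \
    ...

# UEFI boot mode: accepted by both vfkit and krunkit
krunkit --cpus 2 --memory 4096 \
    --bootloader efi,variable-store=/path/to/efistore.nvram,create \
    ...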

praveenkumar commented 3 months ago

@vyasgun but has it at least been tried without those options?

vyasgun commented 3 months ago

@praveenkumar I appreciate the initiative to create the PR for using UEFI with vfkit. Running the VM without those options could only be tried with those code changes in place. Another flag that needs to be removed for a krunkit VM is --timesync.

I have tried using the new options with krunkit and there has been progress: the VM process is running, but there is some issue with the virtio-net device.

If I am correct, according to the code, the device is only added to vfkit when system-mode networking is used:

Can you confirm this, and its relevance to the networking modes? (Please note, I have added this change in my personal fork of the codebase here: https://github.com/vyasgun/crc/tree/spike/uefi, but I still have some questions that need clarification.)

Apologies if the question is too naive, but there's not much documentation to follow :)

praveenkumar commented 3 months ago

If I am correct, according to the code, the device is only added to vfkit when system-mode networking is used:

You can remove the virtio-net option because we do not allow system-mode networking on macOS, and it is not even tested.

Another flag that needs to be removed for a krunkit VM is --timesync.

This needs some more digging to provide a better answer, but for the time being (for the PoC), if something works without it, that counts as progress (also check how podman-machine handles time sync).

With all those changes, are you able to run the VM with krunkit and provision a cluster (MicroShift/OpenShift)? If yes, does it have an advantage over vfkit in terms of performance?

gbraad commented 3 months ago

--timesync was added due to a problem with the sleep/idle state of the VM. It might need some more investigation in general to determine whether this time skewing still happens. In conclusion: leave this out for now; it will need a new issue.

vyasgun commented 3 months ago

This needs some more digging to provide a better answer, but for the time being (for the PoC), if something works without it, that counts as progress (also check how podman-machine handles time sync).

podman-machine is not using --timesync with either vfkit or krunkit. A little more digging is required. However, podman-machine is using virtio-net, and a slightly more detailed explanation of its relevance to our use case would be helpful for me.

Yes, I can get the krunkit process running. The command being used:

podmanqe@dev-platform-mac4 ~ % /opt/homebrew/bin/krunkit --cpus 2 --memory 4096 \
    --bootloader efi,variable-store=/Users/podmanqe/.crc/machines/crc/efistore.nvram,create \
    --device virtio-fs,sharedDir=/Users/podmanqe,mountTag=dir0 \
    --device virtio-rng \
    --device virtio-blk,path=/Users/podmanqe/.crc/machines/crc/crc.img \
    --device virtio-vsock,port=1024,socketURL=/Users/podmanqe/.crc/tap.sock,listen \
    --restful-uri tcp://localhost:8080

podmanqe@dev-platform-mac4 ~ % curl 127.0.0.1:8080  --output -
{"state": "VirtualMachineStateRunning"}%

You can remove the virtio-net option because we do not allow system-mode networking on macOS, and it is not even tested.

krunkit boots to the api login prompt, where the password has to be entered manually. I just want to be sure whether not using virtio-net might be affecting this.

[Screenshot: VM console at the api login prompt]

praveenkumar commented 3 months ago

krunkit boots to the api login prompt, where the password has to be entered manually. I just want to be sure whether not using virtio-net might be affecting this.

This is when you are trying to run it directly using the CLI command. Does it work when you change the crc code base and use the krunkit binary instead of vfkit? I think with the CLI this is expected, since no SSH key is passed.

vyasgun commented 3 months ago

This is when you are trying to run it directly using the CLI command. Does it work when you change the crc code base and use the krunkit binary instead of vfkit? I think with the CLI this is expected, since no SSH key is passed.

No, it doesn't run seamlessly through the CRC code base as of now, which is why I am trying to figure out the required options. Apart from that, the machine is in a running state, as mentioned in my previous comment; the remaining problem is likely either the SSH settings or the Ignition config. podman-machine logs in directly (it uses podman-machine-default-ignition.sock) because a certain set of commands is executed during its startup.

Can you still point me to the use of virtio-net and explain why it is only used for system-mode networking? It would be helpful for me. Thanks :)

praveenkumar commented 3 months ago

Can you still point me to the use of virtio-net and explain why it is only used for system-mode networking?

Before migrating to vfkit, we used hyperkit (https://github.com/moby/hyperkit) as the driver, and it used virtio-net, but that didn't give us a way to effectively handle VPN connections. So we went with https://github.com/containers/gvisor-tap-vsock (user-mode networking). We supported both for a while, but slowly made user-mode networking the default networking solution by obsoleting virtio-net, and we are not even testing it any more.

More info around virtio-net : https://www.redhat.com/en/blog/introduction-virtio-networking-and-vhost-net
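
Concretely, in the krunkit command posted above, the guest's network traffic already reaches gvisor-tap-vsock over the vsock device, so no virtio-net device is involved. A sketch of the two attachments (the mac address is elided):

# user-mode networking (gvisor-tap-vsock), as in the command above
--device virtio-vsock,port=1024,socketURL=/Users/podmanqe/.crc/tap.sock,listen

# system-mode networking (obsolete in crc) would instead attach something like
--device virtio-net,nat,mac=...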

praveenkumar commented 3 months ago

No, it doesn't run seamlessly through the CRC code base as of now, which is why I am trying to figure out the required options.

To me, this machine is booted and the sshd service should be running. What I am more interested in now: if you just rename krunkit to vfkit and try crc start --log-level debug, what error do you get?
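
Something along these lines, for a quick test (illustrative; substitute whatever path your crc setup actually resolves the vfkit binary from):

# stand in krunkit for vfkit, then watch the debug output for errors
ln -sf /opt/homebrew/bin/krunkit /path/to/vfkit
crc start --log-level debug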

vyasgun commented 3 months ago

I was able to bring up the crc VM using the following changes: https://github.com/vyasgun/crc/commit/6eafcf67f21507c8c45395f74c94c6d00b8f7491 (please note it's just a PoC with some hard-coded values, for testing purposes only).

Verifying it's using krunkit:

podmanqe@dev-platform-mac4 ~ % crc config view
- consent-telemetry                     : no
- cpus                                  : 4
- memory                                : 16384
- preset                                : microshift
- skip-check-vfkit-installed            : true

podmanqe@dev-platform-mac4 crc % crcssh
Warning: Permanently added '[127.0.0.1]:2222' (ED25519) to the list of known hosts.
Script '01_update_platforms_check.sh' FAILURE (exit code '1'). Continuing...
Boot Status is GREEN - Health Check SUCCESS
[core@api ~]$ ls /dev/dri
by-path  card0  renderD128

I also ran an InstructLab pod on CRC with the following spec and had it run some prompts through an interactive terminal (kubectl exec -ti mistral-pod -- bash). The prompts are working, but the responses are very slow compared to podman-machine using krunkit.

podmanqe@dev-platform-mac4 gunjan % cat mistral-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: mistral-pod
spec:
  containers:
  - image: quay.io/slopezpa/fedora-vgpu-llama
    command: [ "/bin/bash", "-c", "--" ]
    args: [ "while true; do sleep 300; done;" ]
    name: mistral-pod
    volumeMounts:
    - mountPath: /dev/dri
      name: dev-dri
    - mountPath: /models
      name: downloads
  dnsPolicy: ClusterFirst
  restartPolicy: Always
  volumes:
  - name: dev-dri
    hostPath:
      path: /dev/dri
  - name: downloads
    hostPath:
      path: /Users/podmanqe/Downloads
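
For completeness, a spec like this is created the standard way (the exact invocation isn't shown in the thread):

kubectl apply -f mistral-pod.yaml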

However, crc status doesn't show the VM as running even though it can be SSH'd into, so the changes needed for everything to work in sync still have to be looked into.

podmanqe@dev-platform-mac4 ~ % crc status
CRC VM:                  Stopped
MicroShift:              Stopped (v4.16.4)
RAM Usage:               0B of 0B
Disk Usage:              0B of 0B (Inside the CRC VM)
Persistent Volume Usage: 0B of 0B (Allocated)
Cache Usage:             67.12GB
Cache Directory:         /Users/podmanqe/.crc/cache

Conclusion:

According to the spike, CRC can use krunkit. The next steps depend on whether we want to simply replace vfkit with krunkit in our code or support it alongside vfkit. The code changes seem straightforward.

cfergeau commented 2 months ago

podman-machine is not using --timesync with either vfkit or krunkit. A little more digging is required.

They are using https://chrony-project.org/doc/4.5/chrony.conf.html#makestep instead: https://github.com/containers/podman-machine-os/blob/main/podman-image-daily/50-podman-makestep.conf
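
makestep itself is a one-line chrony directive; the drop-in is along these lines (check the linked file for the exact values):

# step the clock when the offset exceeds 1s, with no limit on how often
makestep 1 -1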

cfergeau commented 2 months ago

I also ran an InstructLab pod on CRC with the following spec

Did you use the same yaml with podman-machine for comparison? For a start, you could ssh into the crc krunkit VM and run an AI workload directly with podman ...

vyasgun commented 2 months ago

@cfergeau Yes, it's the same yaml. I tried running the llama.cpp code in the following ways; here are the results. For reference, https://github.com/ggerganov/llama.cpp/discussions/1323#discussioncomment-5916462 describes the parameters as follows:

  • load time: time taken to load the model file.
  • sample time: time spent choosing the next likely token from the candidates.
  • prompt eval time: how long it took LLaMa to process the prompt/file before generating new text.
  • eval time: how long it took to generate the output (until [end of text] or the user-set limit).
  • total: all of the above together.

Running a podman pod directly on the system:

llama_print_timings:        load time =    4430.66 ms
llama_print_timings:      sample time =      16.25 ms /   259 runs   (    0.06 ms per token, 15937.48 tokens per second)
llama_print_timings: prompt eval time =    1631.53 ms /     5 tokens (  326.31 ms per token,     3.06 tokens per second)
llama_print_timings:        eval time =   12403.26 ms /   258 runs   (   48.07 ms per token,    20.80 tokens per second)
llama_print_timings:       total time =   14076.18 ms /   263 tokens

Running a podman pod after ssh-ing into crc VM:

llama_print_timings:        load time =    3422.64 ms
llama_print_timings:      sample time =      50.78 ms /   649 runs   (    0.08 ms per token, 12781.38 tokens per second)
llama_print_timings: prompt eval time =    1780.76 ms /     5 tokens (  356.15 ms per token,     2.81 tokens per second)
llama_print_timings:        eval time =   38451.63 ms /   648 runs   (   59.34 ms per token,    16.85 tokens per second)
llama_print_timings:       total time =   40348.54 ms /   653 tokens

Running a kubernetes pod on crc (takes much longer):

llama_print_timings:        load time =   45553.22 ms
llama_print_timings:      sample time =      43.01 ms /   563 runs   (    0.08 ms per token, 13089.37 tokens per second)
llama_print_timings: prompt eval time =   44973.51 ms /     9 tokens ( 4997.06 ms per token,     0.20 tokens per second)
llama_print_timings:        eval time = 4552762.02 ms /   562 runs   ( 8101.00 ms per token,     0.12 tokens per second)
llama_print_timings:       total time = 4602622.83 ms /   571 tokens

cfergeau commented 2 months ago

Running a kubernetes pod on crc (takes much longer):

Could it be picking up an amd64 image instead of an arm64 one? This would explain the problems. You could try to get a shell inside the pod to understand what's happening, or compare command lines in the VM to see if there are obvious differences.

vyasgun commented 2 months ago

@cfergeau The image is arm64 (I checked inside the VM):

[core@api ~]$ sudo crictl inspecti quay.io/slopezpa/fedora-vgpu-llama | jq -r '.info.imageSpec.architecture'
arm64

Also, inside the mistral-pod, the binary being run is built for arm64:

gvyas@Gunjans-MacBook-Pro specs % kubectl logs -f mistral-pod
Log start
main: build = 2238 (56d03d92)
main: built with cc (GCC) 13.2.1 20231205 (Red Hat 13.2.1-6) for aarch64-redhat-linux

Update

Running the pod as privileged was required for accessing the GPU. Now it takes roughly the same amount of time.
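
Concretely, this is the standard Kubernetes securityContext addition to the container in the spec above (the cluster must also allow privileged pods):

  containers:
  - image: quay.io/slopezpa/fedora-vgpu-llama
    name: mistral-pod
    securityContext:        # added: gives the container access to host devices like /dev/dri
      privileged: true
    # (command, args, volumeMounts as in the spec above)

The timings with the privileged pod: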

llama_print_timings:        load time =    4772.51 ms
llama_print_timings:      sample time =      58.51 ms /   669 runs   (    0.09 ms per token, 11433.36 tokens per second)
llama_print_timings: prompt eval time =    1780.24 ms /     5 tokens (  356.05 ms per token,     2.81 tokens per second)
llama_print_timings:        eval time =   40126.60 ms /   668 runs   (   60.07 ms per token,    16.65 tokens per second)
llama_print_timings:       total time =   42043.88 ms /   673 tokens

vyasgun commented 2 months ago

The next steps will be documented in: https://github.com/crc-org/crc/issues/4341