bitcoin-dev-project / warnet

Monitor and analyze the emergent behaviors of Bitcoin networks
https://warnet.dev
MIT License

make `just start` run warnet-rpc:latest #345

Closed: mplsgrant closed 3 weeks ago

mplsgrant commented 4 weeks ago

Issue

When I run `just start` on my VM, it fails.

$ kubectl describe pod rpc-0

```
Name:             rpc-0
Namespace:        warnet
Priority:         0
Service Account:  default
Node:             minikube/192.168.49.2
Start Time:       Fri, 19 Apr 2024 18:58:23 +0000
Labels:           apps.kubernetes.io/pod-index=0
                  controller-revision-hash=rpc-7bd75bbd77
                  io.kompose.service=rpc
                  statefulset.kubernetes.io/pod-name=rpc-0
Annotations:
Status:           Running
IP:               10.244.0.2
IPs:
  IP:           10.244.0.2
Controlled By:  StatefulSet/rpc
Containers:
  warnet-rpc:
    Container ID:   docker://d71a80f8b79a0da95e108f6242a8552f200bbcfad5fee08f7ad9faf612139378
    Image:          bitcoindevproject/warnet-rpc:dev
    Image ID:       docker-pullable://bitcoindevproject/warnet-rpc@sha256:a94f9b872c52414b0dffcc255c8b02386c037476b0ef66267d1c36c83066b381
    Port:           9276/TCP
    Host Port:      0/TCP
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    127
      Started:      Fri, 19 Apr 2024 19:02:56 +0000
      Finished:     Fri, 19 Apr 2024 19:03:26 +0000
    Ready:          False
    Restart Count:  5
    Liveness:       exec [/bin/bash -c /root/warnet/src/templates/rpc/livenessProbe.sh] delay=20s timeout=1s period=5s #success=1 #failure=3
    Readiness:      http-get http://:9276/-/healthy delay=1s timeout=2s period=2s #success=1 #failure=2
    Environment:
    Mounts:
      /root/warnet from source-code (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-k7gmc (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  source-code:
    Type:          HostPath (bare host directory volume)
    Path:          /mnt/src
    HostPathType:  Directory
  kube-api-access-k7gmc:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age                    From               Message
  ----     ------     ----                   ----               -------
  Normal   Scheduled  6m16s                  default-scheduler  Successfully assigned warnet/rpc-0 to minikube
  Normal   Pulled     6m11s                  kubelet            Successfully pulled image "bitcoindevproject/warnet-rpc:dev" in 4.858s (4.858s including waiting)
  Normal   Created    6m10s                  kubelet            Created container warnet-rpc
  Normal   Started    6m9s                   kubelet            Started container warnet-rpc
  Warning  Unhealthy  5m41s (x2 over 5m46s)  kubelet            Liveness probe failed: /bin/bash: line 1: /root/warnet/src/templates/rpc/livenessProbe.sh: No such file or directory
  Normal   Pulling    5m39s (x2 over 6m15s)  kubelet            Pulling image "bitcoindevproject/warnet-rpc:dev"
  Normal   Pulled     5m38s                  kubelet            Successfully pulled image "bitcoindevproject/warnet-rpc:dev" in 696ms (696ms including waiting)
  Warning  Unhealthy  74s (x103 over 6m8s)   kubelet            Readiness probe failed: Get "http://10.244.0.2:9276/-/healthy": dial tcp 10.244.0.2:9276: connect: connection refused
```

Cause

Currently, just start (and also just startd) runs the warnet-rpc:dev container and the accompanying liveness probe and volume configurations.

Solution

Point `just start` to warnet-rpc-statefulset.yaml, which pulls in warnet-rpc:latest.

Results

I tested running warnet with my tweak, and it works great.

pinheadmz commented 4 weeks ago

`just start` and `just startd` are intended for local warnet development, and that's why they apply the -dev statefulset. The -dev image copies the local warnet rpc code into the container so developers can test code changes in Kubernetes. The non-dev image (latest) is built from the top commit on the main branch and does not require the local filesystem at all; it's meant for production.

So instead of changing the existing commands, let's just add new ones for your use case? Or try to figure out the deeper meaning behind this:

Warning Unhealthy 5m41s (x2 over 5m46s) kubelet Liveness probe failed: /bin/bash: line 1: /root/warnet/src/templates/rpc/livenessProbe.sh: No such file or directory

mplsgrant commented 3 weeks ago

After a bit of mucking around, I discovered that when I run minikube on Debian, the 9p filesystem is not enabled. When I run it on Ubuntu, it is. The current GitHub Actions tests use Ubuntu, which explains why the automated tests pass.
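For anyone who wants to repeat this check, here is a minimal sketch, assuming lsmod-style output is piped in on stdin. The function name `has_9p_module` is hypothetical and not part of warnet. Note that lsmod only lists loaded modules, so a kernel with 9p compiled in (rather than built as a module) would not show up this way.

```shell
#!/bin/sh
# Hypothetical helper (not part of warnet): succeed if lsmod-style text on
# stdin lists a module named exactly "9p". Related modules such as "9pnet"
# do not count on their own.
has_9p_module() {
    awk '$1 == "9p" { found = 1 } END { exit !found }'
}

# Sample rows copied from the Ubuntu lsmod output posted in this thread.
ubuntu_sample='9pnet_fd               20480  1
9p                     77824  1
9pnet                 106496  2 9p,9pnet_fd'

if printf '%s\n' "$ubuntu_sample" | has_9p_module; then
    echo "9p module loaded"
else
    echo "9p module not loaded"
fi
```

Against a live cluster, the same function could presumably be fed from something like `minikube ssh "lsmod"`, though that has not been verified here.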

pinheadmz commented 3 weeks ago

This is similar to a common issue during the attackathon as well. Here's an example from someone using Arch Linux ("or it might have been a nix shell")

Warning  FailedMount  35s (x10 over 4m45s)  kubelet
           MountVolume.SetUp failed for volume "source-code" : hostPath type check failed: /mnt/src is not a directory

mplsgrant commented 3 weeks ago

Interesting. I would like to see their YAML file so I could rule out them going in and messing around with the config. I got a similar issue when I did that.

pinheadmz commented 3 weeks ago

It's the statefulset-dev.yaml in templates, which pulls the dev rpc image, which in turn copies in files from the host.

mplsgrant commented 3 weeks ago

I believe the Debian cloud image does not (and will not) include 9p:

Virtio-fs appears to be the future for this type of use-case, and we do enable this for bullseye cloud kernels. I'd much rather encourage its use instead of the crufty old abuse of a network filesystem that is 9p.

Here's the view from Ubuntu and from Debian, which shows the different kernels in use.

grant@local:~/warnet$ cat /etc/os-release 
PRETTY_NAME="Ubuntu 23.10"
<snip>
grant@local:~/warnet$ minikube ssh
docker@minikube:~$ lsmod | grep 9p
9pnet_fd               20480  1
9p                     77824  1
9pnet                 106496  2 9p,9pnet_fd
fscache               389120  1 9p
netfs                  61440  2 9p,fscache
docker@minikube:~$ uname -a
Linux minikube 6.5.0-1016-gcp #16-Ubuntu SMP Fri Mar  8 20:37:20 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

grant@local:~/warnet$ cat /etc/os-release 
PRETTY_NAME="Debian GNU/Linux 12 (bookworm)"
<snip>
grant@local:~/warnet$ minikube ssh
docker@minikube:~$ lsmod | grep 9p
docker@minikube:~$ 
docker@minikube:~$ uname -a
Linux minikube 6.1.0-20-cloud-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.85-1 (2024-04-11) x86_64 x86_64 x86_64 GNU/Linux

mplsgrant commented 3 weeks ago

I am beginning to think we need to use something like rsync instead of `minikube mount` to get files from the host to the VM for the dev environment.
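A rough sketch of what that could look like, not a tested replacement for the mount. It assumes the minikube node is reachable over SSH as user `docker`, that rsync exists on both ends, and that `sudo rsync` works on the node. None of that is confirmed in this thread. The function name `sync_src`, the `DRY_RUN` switch, and the key path in the demo are all hypothetical. The destination `/mnt/src` and the node IP come from the pod description above.

```shell
#!/bin/sh
# Hypothetical helper (not part of warnet): copy a local source tree into the
# minikube node with rsync instead of relying on a 9p mount.
# Usage: sync_src SRC_DIR NODE_IP SSH_KEY [DEST_DIR]
sync_src() {
    src=$1 ip=$2 key=$3 dest=${4:-/mnt/src}
    # Build the command as an argument list so paths with spaces survive.
    set -- rsync -az --delete --rsync-path="sudo rsync" \
        -e "ssh -i $key -o StrictHostKeyChecking=no" \
        "$src/" "docker@$ip:$dest/"
    if [ -n "${DRY_RUN:-}" ]; then
        # Only print what would run; useful on machines without a cluster.
        echo "$@"
    else
        "$@"
    fi
}

# Dry-run demo. The IP is the node address from the pod description;
# the key path is a placeholder.
DRY_RUN=1 sync_src . 192.168.49.2 /path/to/minikube/id_rsa
```

In a real run, the IP and key would presumably come from `minikube ip` and `minikube ssh-key`. Unlike a mount, this is a one-shot copy, so it would need to be re-run (or wrapped in a file watcher) after each code change.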

mplsgrant commented 3 weeks ago

I'm going to close this request because my solution is no longer relevant due to what we now know about 9p.

I opened an issue at the minikube repo, and it appears that there is nothing that minikube can do about the 9p issue.