Issue with benchmark deployment #881

Open ymc101 opened 4 months ago

ymc101 commented 4 months ago

Hi, I am trying to test the deployment and invocation of the fibonacci benchmark on a Knative vHive cluster by following the [readme](https://github.com/vhive-serverless/vSwarm/tree/main/benchmarks/fibonacci). When starting the function, the deployment timed out and reported that there is no ready Revision. Is there a way to solve this issue?

Below are some terminal logs from the master node:

yapm0011@node-0:~/vswarm/benchmarks/fibonacci$ kn service apply -f ./yamls/knative/kn-fibonacci-python.yaml
Warning: Kubernetes default value is insecure, Knative may default this to secure in a future release: spec.template.spec.containers[0].securityContext.allowPrivilegeEscalation, spec.template.spec.containers[0].securityContext.capabilities, spec.template.spec.containers[0].securityContext.runAsNonRoot, spec.template.spec.containers[0].securityContext.seccompProfile, spec.template.spec.containers[1].securityContext.allowPrivilegeEscalation, spec.template.spec.containers[1].securityContext.capabilities, spec.template.spec.containers[1].securityContext.runAsNonRoot, spec.template.spec.containers[1].securityContext.seccompProfile
Creating service 'fibonacci-python' in namespace 'default':

  0.045s The Route is still working to reflect the latest desired specification.
  0.074s ...
  0.115s Configuration "fibonacci-python" is waiting for a Revision to become ready.
Error: timeout: service 'fibonacci-python' not ready after 600 seconds
yapm0011@node-0:~/vswarm/benchmarks/fibonacci$ kn service list --all-namespaces
NAMESPACE   NAME               URL                                                      LATEST                AGE   CONDITIONS   READY   REASON
default     fibonacci-python   http://fibonacci-python.default.                         13m   0 OK / 3     False   RevisionMissing : Configuration "fibonacci-python" does not have any ready Revision.
default     helloworld-0       http://helloworld-0.default.       helloworld-0-00001    48m   3 OK / 3     True    
default     pyaes-0            http://pyaes-0.default.            pyaes-0-00001         48m   3 OK / 3     True    
default     pyaes-1            http://pyaes-1.default.            pyaes-1-00001         48m   3 OK / 3     True    
default     rnn-serving-0      http://rnn-serving-0.default.      rnn-serving-0-00001   48m   3 OK / 3     True    
default     rnn-serving-1      http://rnn-serving-1.default.      rnn-serving-1-00001   48m   3 OK / 3     True    
default     rnn-serving-2      http://rnn-serving-2.default.      rnn-serving-2-00001   48m   3 OK / 3     True
yapm0011@node-0:~/vswarm$ kubectl get pods -A
NAMESPACE          NAME                                                                            READY   STATUS      RESTARTS   AGE
istio-system       cluster-local-gateway-6f45748884-zvmb5                                          1/1     Running     0          76m
istio-system       istio-ingressgateway-7975cbbc47-795kn                                           1/1     Running     0          76m
istio-system       istiod-77d9cd6b46-cgppt                                                         1/1     Running     0          76m
knative-eventing   eventing-controller-779445884c-m6qkr                                            1/1     Running     0          75m
knative-eventing   eventing-webhook-6f499f8479-jdxpc                                               1/1     Running     0          75m
knative-eventing   imc-controller-7985996c59-t4rl5                                                 1/1     Running     0          75m
knative-eventing   imc-dispatcher-7cfb975895-xcpnv                                                 1/1     Running     0          75m
knative-eventing   mt-broker-controller-dbd664566-66w8v                                            1/1     Running     0          75m
knative-eventing   mt-broker-filter-6bfdb744fc-crp9c                                               1/1     Running     0          75m
knative-eventing   mt-broker-ingress-596fcd6964-k8x7r                                              1/1     Running     0          75m
knative-serving    activator-58db57894b-fglvf                                                      1/1     Running     0          75m
knative-serving    autoscaler-76f95fff78-ns8p2                                                     1/1     Running     0          75m
knative-serving    controller-7dd875844b-tpx6s                                                     1/1     Running     0          75m
knative-serving    default-domain-v5p5w                                                            0/1     Completed   0          75m
knative-serving    net-istio-controller-57486f879-l55k4                                            1/1     Running     0          75m
knative-serving    net-istio-webhook-7ccdbcb557-nrvsv                                              1/1     Running     0          75m
knative-serving    webhook-d8674645d-bk2mc                                                         1/1     Running     0          75m
kube-system        calico-kube-controllers-68cdf756d9-9pjdw                                        1/1     Running     0          76m
kube-system        calico-node-jcmd4                                                               1/1     Running     0          76m
kube-system        calico-node-sgfhf                                                               1/1     Running     0          76m
kube-system        coredns-76f75df574-44sn4                                                        1/1     Running     0          77m
kube-system        coredns-76f75df574-knmzt                                                        1/1     Running     0          77m
kube-system        etcd-node-0.yapm0011-195487.ntuvhive1-pg0.utah.cloudlab.us                      1/1     Running     0          77m
kube-system        kube-apiserver-node-0.yapm0011-195487.ntuvhive1-pg0.utah.cloudlab.us            1/1     Running     0          77m
kube-system        kube-controller-manager-node-0.yapm0011-195487.ntuvhive1-pg0.utah.cloudlab.us   1/1     Running     0          77m
kube-system        kube-proxy-tn4rx                                                                1/1     Running     0          77m
kube-system        kube-proxy-z7kmm                                                                1/1     Running     0          77m
kube-system        kube-scheduler-node-0.yapm0011-195487.ntuvhive1-pg0.utah.cloudlab.us            1/1     Running     0          77m
metallb-system     controller-5f56cd6f78-2fwcc                                                     1/1     Running     0          76m
metallb-system     speaker-85lnb                                                                   1/1     Running     0          76m
metallb-system     speaker-hw5jk                                                                   1/1     Running     0          76m
registry           docker-registry-pod-6ck9z                                                       1/1     Running     0          75m
registry           registry-etc-hosts-update-pvbwp                                                 1/1     Running     0          75m

ymc101 commented 4 months ago

Hi @leokondrashov, do you have any idea what is causing this issue?

leokondrashov commented 4 months ago

I am not sure why this happens. Please provide the description of the failing revision (kubectl describe revision fibonacci-python) and of its pod while the pod still exists. It is best to describe the deployment and the pod while the service is still timing out, so we can see what is hindering the deployment.
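A minimal sketch of collecting that information in one go, to be run while the service is still deploying. The function name collect_ksvc_debug is hypothetical. The revision is selected by the serving.knative.dev/configuration label that appears in the revision description in this thread, and the pod by the serving.knative.dev/service label used elsewhere in this thread; that the Deployment carries the same service label is an assumption.

```shell
# Sketch only: gather revision, deployment, pod, and event details for a Knative Service.
# Assumes kubectl is already configured for the cluster.
collect_ksvc_debug() {
  local svc="${1:?usage: collect_ksvc_debug <service> [namespace]}"
  local ns="${2:-default}"

  # The revision is labelled with the name of its Configuration,
  # which matches the Service name.
  kubectl describe revision -n "$ns" -l "serving.knative.dev/configuration=${svc}"

  # Assumption: the Deployment carries the same service label as the pods.
  kubectl describe deployment -n "$ns" -l "serving.knative.dev/service=${svc}"
  kubectl describe pod -n "$ns" -l "serving.knative.dev/service=${svc}"

  # Namespace events, oldest first, to catch scheduling or image pull errors.
  kubectl get events -n "$ns" --sort-by=.lastTimestamp
}
```

For the service in this issue, the call would be collect_ksvc_debug fibonacci-python, with the output redirected to a file and attached here.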

ymc101 commented 4 months ago

This is the description:

yapm0011@node-0:~/vswarm/benchmarks/fibonacci$ kubectl describe revision fibonacci-python
Name:         fibonacci-python-00001
Namespace:    default
Labels:       serving.knative.dev/configuration=fibonacci-python
Annotations:  serving.knative.dev/creator: kubernetes-admin
              serving.knative.dev/routes: fibonacci-python
              serving.knative.dev/routingStateModified: 2024-03-09T08:33:05Z
API Version:  serving.knative.dev/v1
Kind:         Revision
  Creation Timestamp:  2024-03-09T08:33:05Z
  Generation:          1
  Owner References:
    API Version:           serving.knative.dev/v1
    Block Owner Deletion:  true
    Controller:            true
    Kind:                  Configuration
    Name:                  fibonacci-python
    UID:                   9b15e7f2-4791-4015-a8b7-fbc3a84b0ab5
  Resource Version:        15516
  UID:                     6148fb13-695d-44f4-be8f-31c45bdd3350
  Container Concurrency:  0
    Image:  docker.io/vhiveease/relay:latest
    Name:   user-container-0
      Container Port:  50000
      Name:            h2c
      Protocol:        TCP
    Readiness Probe:
      Success Threshold:  1
      Tcp Socket:
        Port:  0
    Image:  docker.io/vhiveease/fibonacci-python:latest
    Name:   user-container-1
  Enable Service Links:  false
  Timeout Seconds:       300
  Actual Replicas:  0
    Last Transition Time:  2024-03-09T08:43:36Z
    Message:               The target is not receiving traffic.
    Reason:                NoTraffic
    Severity:              Info
    Status:                False
    Type:                  Active
    Last Transition Time:  2024-03-09T08:33:06Z
    Reason:                Deploying
    Status:                Unknown
    Type:                  ContainerHealthy
    Last Transition Time:  2024-03-09T08:43:36Z
    Message:               Initial scale was never achieved
    Reason:                ProgressDeadlineExceeded
    Status:                False
    Type:                  Ready
    Last Transition Time:  2024-03-09T08:43:36Z
    Message:               Initial scale was never achieved
    Reason:                ProgressDeadlineExceeded
    Status:                False
    Type:                  ResourcesAvailable
  Container Statuses:
    Image Digest:       index.docker.io/vhiveease/relay@sha256:605a16487dbff86f2e04875df6f338913ea02342aa664f4d14d3d8d71b8c697b
    Name:               user-container-0
    Image Digest:       index.docker.io/vhiveease/fibonacci-python@sha256:938172f0d1de67a4cb159b4a0b8606be03bc5b1a2c954dc5f5c31b66327965c9
    Name:               user-container-1
  Observed Generation:  1
  Type     Reason         Age                From                 Message
  ----     ------         ----               ----                 -------
  Warning  InternalError  10m (x2 over 20m)  revision-controller  failed to update deployment "fibonacci-python-00001-deployment": Operation cannot be fulfilled on deployments.apps "fibonacci-python-00001-deployment": the object has been modified; please apply your changes to the latest version and try again

leokondrashov commented 3 months ago

What are the steps to reproduce? How exactly are you trying to deploy the Fibonacci benchmark?

ymc101 commented 3 months ago

After setting up the vHive nodes, I ran the deployer client (source /etc/profile && pushd ./tools/deployer && go build && popd && ./tools/deployer/deployer -funcPath ~/vhive/configs/knative_workloads) and the invoker client (pushd ./tools/invoker && go mod tidy && go build && popd && ./tools/invoker/invoker) once. Then, as far as I recall, I ran make pull in the Fibonacci directory, followed by kn service apply -f ./yamls/knative/kn-fibonacci-python.yaml. Am I missing any steps in between?

leokondrashov commented 3 months ago

@dhschall Hi, can you help here? The steps seem to follow the docs. This error doesn't seem to be related to the missing stub container that Firecracker needs in vHive.

dhschall commented 3 months ago

Hi, sorry for the delayed response. This is the latest run from the continuous integration. The function seems ok.

Can you try to follow these steps to see if it works with a normal k8s cluster:

# Deploy the function
kubectl apply -f benchmarks/fibonacci/yamls/knative/kn-fibonacci-python.yaml

# Get the logs from the Fibonacci container
kubectl logs -n default -c user-container-0 -l serving.knative.dev/service=fibonacci-python

# If that works try to invoke with the test client
go build ./test-client.go

./test-client --addr fibonacci-python.default. --name 'Example text for CI'

According to your message, the second step fails. In that case, please send the logs.
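The first two steps could be combined roughly as follows: apply the manifest, wait for the Knative Service to report Ready, and print the container logs only if it does not. This is a sketch, not part of the vSwarm tooling. The function name deploy_and_wait, the 120s default timeout, and the --tail=50 limit are assumptions chosen for illustration.

```shell
# Sketch only: deploy a Knative Service and surface logs if it never becomes Ready.
# Assumes kubectl is configured and the service lives in the default namespace.
deploy_and_wait() {
  local manifest="${1:?usage: deploy_and_wait <manifest> <service> [timeout]}"
  local svc="${2:?usage: deploy_and_wait <manifest> <service> [timeout]}"
  local timeout="${3:-120s}"

  kubectl apply -f "$manifest" || return 1

  # ksvc is the short name for services.serving.knative.dev.
  if kubectl wait "ksvc/${svc}" -n default --for=condition=Ready --timeout="$timeout"; then
    echo "service ${svc} is Ready"
  else
    echo "service ${svc} did not become Ready within ${timeout}; container logs follow" >&2
    # Print logs from every container in the pod, not only user-container-0.
    kubectl logs -n default -l "serving.knative.dev/service=${svc}" --all-containers --tail=50
    return 1
  fi
}
```

From the repository root this would be called as deploy_and_wait benchmarks/fibonacci/yamls/knative/kn-fibonacci-python.yaml fibonacci-python. If no pod was ever created, the logs command prints nothing, which is itself a useful hint that the problem is at the scheduling or deployment level.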


ymc101 commented 2 months ago

@dhschall Hi, I tried the steps listed and got an error. Below are the terminal logs:

yapm0011@node-0:~/vswarm$ kubectl apply -f benchmarks/fibonacci/yamls/knative//kn-fibonacci-python.yaml
Warning: Kubernetes default value is insecure, Knative may default this to secure in a future release: spec.template.spec.containers[0].securityContext.allowPrivilegeEscalation, spec.template.spec.containers[0].securityContext.capabilities, spec.template.spec.containers[0].securityContext.runAsNonRoot, spec.template.spec.containers[0].securityContext.seccompProfile, spec.template.spec.containers[1].securityContext.allowPrivilegeEscalation, spec.template.spec.containers[1].securityContext.capabilities, spec.template.spec.containers[1].securityContext.runAsNonRoot, spec.template.spec.containers[1].securityContext.seccompProfile
service.serving.knative.dev/fibonacci-python created
yapm0011@node-0:~/vswarm$ kubectl logs -n default -c user-container-0 -l serving.knative.dev/service=fibonacci-python
time="2024-04-16T08:03:38Z" level=info msg="Connect to"
time="2024-04-16T08:03:48Z" level=info msg="Started relay server at\n"
time="2024-04-16T08:03:48Z" level=info msg="Downstream function: fibonacci-python at addr\n"
time="2024-04-16T08:03:48Z" level=info msg="Input generator: linear, bound: [1:10]\n"
yapm0011@node-0:~/vswarm$ go build ./test-client.go
go: cannot find main module, but found .git/config in /users/yapm0011/vswarm
        to create a module there, run:
        go mod init

dhschall commented 2 months ago

Oh sorry, my bad. You have to build the test client in its own directory:

cd tools/test-client/
go build ./test-client.go
./test-client --addr fibonacci-python.default. --name 'Example text for CI'
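Rather than typing the address by hand, the host can be read from the service status. The sketch below assumes the Knative Service exposes its address in .status.url, which is where the URL column of kn service list comes from. The function name ksvc_host is hypothetical.

```shell
# Sketch only: print the host part of a Knative Service URL (scheme stripped).
# Assumes kubectl is configured for the cluster.
ksvc_host() {
  local svc="${1:?usage: ksvc_host <service> [namespace]}"
  local ns="${2:-default}"
  local url

  url="$(kubectl get ksvc "$svc" -n "$ns" -o jsonpath='{.status.url}')" || return 1

  # An empty URL means the Route has not been assigned an address yet.
  if [ -z "$url" ]; then
    echo "service ${svc} has no URL yet" >&2
    return 1
  fi

  # Drop the leading scheme, e.g. http://
  printf '%s\n' "${url#*://}"
}
```

The client could then be invoked as ./test-client --addr "$(ksvc_host fibonacci-python)" --name 'Example text for CI', assuming it accepts the host without a scheme as in the commands above.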