redhat-developer / odo

odo - Developer-focused CLI for fast & iterative container-based application development on Podman and Kubernetes. Implementation of the open Devfile standard.
https://odo.dev
Apache License 2.0

Unexpected EOF during watch stream event decoding, watch channel was closed. #3905

Closed sarveshtamba closed 10 months ago

sarveshtamba commented 4 years ago

I am facing the following issue on ppc64le while running the odo test suites:

[odo] I0906 18:07:32.091660  203070 streamwatcher.go:114] Unexpected EOF during watch stream event decoding: unexpected EOF
[odo] I0906 18:07:32.091772  203070 streamwatcher.go:114] Unexpected EOF during watch stream event decoding: unexpected EOF
[odo] Watch channel was closed
[odo] Waiting for component to start [3m] [WARNING x10: BackOff]
[odo] watch channel was closed

Attached is the error log - EOF_watchstream_ppc64le.txt

amitkrout commented 4 years ago

@sarveshtamba Have you tried the same scenario manually on that platform?

sarveshtamba commented 4 years ago

No, however this works fine on another OCP 4.5 cluster. Want to confirm if this is a flaky scenario or if this is observed on any other platforms?

amitkrout commented 4 years ago

> No, however this works fine on another OCP 4.5 cluster.

I did not get it. Can you please explain what you mean by another OCP 4.5 cluster? I guess it works against a 4.5 cluster but fails on the other cluster versions.

> Want to confirm if this is a flaky scenario or if this is observed on any other platforms?

I have never seen it on amd64 images, neither in our CI environment nor locally.

sarveshtamba commented 4 years ago

> > No, however this works fine on another OCP 4.5 cluster.
>
> I did not get it. Can you please explain what you mean by another OCP 4.5 cluster? I guess it works against a 4.5 cluster but fails on the other cluster versions.

I have tried running the same test suites on two different OCP 4.5 clusters. On one cluster I don't see these EOF errors, while on the other I do. Hence I want to confirm whether this is a flaky issue or a local setup issue.

> > Want to confirm if this is a flaky scenario or if this is observed on any other platforms?
>
> I have never seen it on amd64 images, neither in our CI environment nor locally.

sarveshtamba commented 4 years ago

https://access.redhat.com/solutions/2092671

sarveshtamba commented 4 years ago

This appears to be an OCP cluster-specific issue; I am not encountering it on another OCP cluster. Closing this for now; I will reopen it if required.

sarveshtamba commented 4 years ago

Hitting this issue on another cluster too while running odo test suites. Reopening this one.

sarveshtamba commented 4 years ago

cc: @amitkrout @scottkurz ^^

kadel commented 4 years ago

Checked your logs. The important information is this:

[odo] I0906 18:06:32.090832  203070 occlient.go:1843] Warning Event: Count: 10, Reason: BackOff, Message: Back-off restarting failed container

The container was killed by the cluster because it did not start within 3 minutes; this is the reason for "Unexpected EOF".

I can't see why it didn't start, but my guess would be that the cluster couldn't download the image or took too long to download it (registry.redhat.io/rhscl/nodejs-12-rhel7@sha256:f02a15704dad16bfe1a478ddd02fa425201a9873e53aa84498b41cae3412ecff).
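
For anyone scanning a long verbose log for the same kind of line, a small filter can help. The following is only a sketch and not part of odo: it assumes the line layout seen in the single excerpt quoted above (a klog-style header followed by "Warning Event: Count: N, Reason: R, Message: M"), and the names WarningEvent and parse_warning_events are made up for illustration.

```python
import re
from typing import Iterable, Iterator, NamedTuple


class WarningEvent(NamedTuple):
    timestamp: str  # klog time of day, e.g. "18:06:32.090832"
    count: int      # how many times the cluster reported the event
    reason: str
    message: str


# Assumed layout, inferred from the one log line quoted in this thread.
_EVENT_RE = re.compile(
    r"I\d{4} (?P<ts>\d{2}:\d{2}:\d{2}\.\d+)\s+\d+ \S+\] "
    r"Warning Event: Count: (?P<count>\d+), "
    r"Reason: (?P<reason>[^,]+), Message: (?P<message>.*)$"
)


def parse_warning_events(lines: Iterable[str]) -> Iterator[WarningEvent]:
    """Yield one WarningEvent per log line that matches the assumed layout."""
    for line in lines:
        match = _EVENT_RE.search(line)
        if match:
            yield WarningEvent(
                match["ts"],
                int(match["count"]),
                match["reason"],
                match["message"].rstrip(),
            )


sample = (
    "[odo] I0906 18:06:32.090832  203070 occlient.go:1843] Warning Event: "
    "Count: 10, Reason: BackOff, Message: Back-off restarting failed container"
)
events = list(parse_warning_events([sample]))
for event in events:
    print(event)
```

Feeding it the attached log file line by line (for example via `open(path)`) would list every warning event in order, which makes it easier to see what the cluster reported before the watch closed.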

sarveshtamba commented 3 years ago

@kadel I am seeing this on an OCP 4.6 cluster as well. Any suggestions? Is there anyone who can help here?

kadel commented 3 years ago

One way to get more info would be to verify whether the problem is in odo or in the cluster. For example, you could create 10 Pods that use the image from my previous comment and check whether some of them fail.
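
One possible way to set up that check is to generate the Pod manifests with a short script and pipe them to oc. This is a sketch under assumptions, not a procedure from the odo project: the pod name prefix, the label, and the `sleep` command are arbitrary choices, and it assumes the image contains a `sleep` binary and that the cluster is allowed to pull from registry.redhat.io.

```python
import json

# Image digest copied from the earlier comment in this thread.
IMAGE = (
    "registry.redhat.io/rhscl/nodejs-12-rhel7"
    "@sha256:f02a15704dad16bfe1a478ddd02fa425201a9873e53aa84498b41cae3412ecff"
)


def build_pods(image: str, count: int = 10, prefix: str = "pull-test") -> dict:
    """Return a v1 List holding `count` bare Pods that all run `image`."""
    pods = []
    for i in range(count):
        pods.append({
            "apiVersion": "v1",
            "kind": "Pod",
            "metadata": {"name": f"{prefix}-{i}", "labels": {"app": prefix}},
            "spec": {
                "restartPolicy": "Never",
                "containers": [{
                    "name": "nodejs",
                    "image": image,
                    # Hypothetical command: keeps the container alive so a
                    # successful pull and start shows up as Running.
                    "command": ["sleep", "600"],
                }],
            },
        })
    return {"apiVersion": "v1", "kind": "List", "items": pods}


manifest = build_pods(IMAGE)
print(json.dumps(manifest, indent=2))
```

Saved as, say, make_pods.py (a hypothetical file name), the output can be applied with `python make_pods.py | oc apply -f -` and watched with `oc get pods -l app=pull-test -w`. If some of these Pods get stuck pulling the image or never reach Running, the problem is more likely in the cluster than in odo.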

sarveshtamba commented 3 years ago

@kadel I manually ran the steps used in one of the failing tests. Below are the oc describe and oc logs results:

[root@rhodopowerci-inf 603110024]# mkdir /tmp/825666372
[root@rhodopowerci-inf 603110024]# cd /tmp/825666372
[root@rhodopowerci-inf 825666372]# odo project create iyisbvflno -w -v4
I1123 01:39:58.274073    2714 util.go:730] HTTPGetRequest: https://raw.githubusercontent.com/openshift/odo/master/build/VERSION
I1123 01:39:58.274320    2714 util.go:751] Response will be cached in /tmp/odohttpcache for 1h0m0s
 •  Waiting for project to come up  ...
I1123 01:39:58.601781    2714 util.go:764] Cached response used.
I1123 01:39:59.348170    2714 occlient.go:531] Status of creation of project iyisbvflno is Active
I1123 01:39:59.348214    2714 occlient.go:536] Project iyisbvflno now exists
I1123 01:39:59.353013    2714 namespace.go:181] Status of creation of service account &ServiceAccount{ObjectMeta:{default  iyisbvflno /api/v1/namespaces/iyisbvflno/serviceaccounts/default d4f183e4-2828-4eca-aa66-c3b4e3a6e1d2 31773856 0 2020-11-23 01:39:59 -0800 PST <nil> <nil> map[] map[] [] []  []},Secrets:[]ObjectReference{ObjectReference{Kind:,Namespace:,Name:default-dockercfg-ms6df,UID:,APIVersion:,ResourceVersion:,FieldPath:,},ObjectReference{Kind:,Namespace:,Name:default-token-xsgbk,UID:,APIVersion:,ResourceVersion:,FieldPath:,},},ImagePullSecrets:[]LocalObjectReference{LocalObjectReference{Name:default-dockercfg-ms6df,},},AutomountServiceAccountToken:nil,} is ready
 ✓  Waiting for project to come up [1s]
 ✓  Project 'iyisbvflno' is ready for use
 ✓  New project created and now using project: iyisbvflno

---
A newer version of odo (v2.0.1) is available,
visit https://github.com/openshift/odo/releases to update.
If you wish to disable this notification, run:
odo preference set UpdateNotification false
---
[root@rhodopowerci-inf 825666372]# odo component create --s2i nodejs mynodejs --project iyisbvflno --context /tmp/825666372 --app app --s2i
Validation
 ✓  Validating component [17ms]

Please use `odo push` command to create the component with source deployed

---
A newer version of odo (v2.0.1) is available,
visit https://github.com/openshift/odo/releases to update.
If you wish to disable this notification, run:
odo preference set UpdateNotification false
---
[root@rhodopowerci-inf 825666372]# odo url create url1 --port 8080 --context /tmp/825666372
 ✓  URL url1 created for component: mynodejs

To apply the URL configuration changes, please use `odo push`

---
A newer version of odo (v2.0.1) is available,
visit https://github.com/openshift/odo/releases to update.
If you wish to disable this notification, run:
odo preference set UpdateNotification false
---
[root@rhodopowerci-inf 825666372]# odo storage create storage1 --path /data1 --size 1Gi --context /tmp/825666372
 ✓  Added storage storage1 to mynodejs

Please use `odo push` command to make the storage accessible to the component

---
A newer version of odo (v2.0.1) is available,
visit https://github.com/openshift/odo/releases to update.
If you wish to disable this notification, run:
odo preference set UpdateNotification false
---

[root@rhodopowerci-inf 825666372]# odo push --context /tmp/825666372
Validation
 ✓  Checking component [42ms]

Configuration changes
 ✓  Added storage storage1 to mynodejs
 ✓  Initializing component
 ✓  Creating component [232ms]

Applying URL changes
 ✓  URL url1: http://url1-app-iyisbvflno.apps.rhodopowerci.cp.fyre.ibm.com/ created

Pushing to component mynodejs of type local
 ✓  Checking files for pushing [616471ns]
 ⚠  Watch channel was closed start [⚠ WARNING x9: BackOff]
 ✗  Waiting for component to start [3m] [WARNING x9: BackOff]
 ✗  watch channel was closed

[root@rhodopowerci-inf 825666372]# oc get all
NAME                        READY   STATUS   RESTARTS   AGE
pod/mynodejs-app-1-deploy   0/1     Error    0          14m

NAME                                   DESIRED   CURRENT   READY   AGE
replicationcontroller/mynodejs-app-1   0         0         0       14m

NAME                   TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
service/mynodejs-app   ClusterIP   172.30.107.47   <none>        8080/TCP   14m

NAME                                              REVISION   DESIRED   CURRENT   TRIGGERED BY
deploymentconfig.apps.openshift.io/mynodejs-app   1          1         0         config,image(nodejs:latest)

NAME                                          IMAGE REPOSITORY                                                                                   TAGS   UPDATED
imagestream.image.openshift.io/mynodejs-app   default-route-openshift-image-registry.apps.rhodopowerci.cp.fyre.ibm.com/iyisbvflno/mynodejs-app

NAME                                HOST/PORT                                               PATH   SERVICES       PORT   TERMINATION   WILDCARD
route.route.openshift.io/url1-app   url1-app-iyisbvflno.apps.rhodopowerci.cp.fyre.ibm.com   /      mynodejs-app   8080                 None
[root@rhodopowerci-inf 825666372]# oc describe pod/mynodejs-app-1-deploy
Name:         mynodejs-app-1-deploy
Namespace:    iyisbvflno
Priority:     0
Node:         worker0.rhodopowerci.cp.fyre.ibm.com/10.17.73.98
Start Time:   Mon, 23 Nov 2020 01:41:04 -0800
Labels:       openshift.io/deployer-pod-for.name=mynodejs-app-1
Annotations:  k8s.v1.cni.cncf.io/network-status:
                [{
                    "name": "openshift-sdn",
                    "interface": "eth0",
                    "ips": [
                        "10.254.12.15"
                    ],
                    "default": true,
                    "dns": {}
                }]
              k8s.v1.cni.cncf.io/networks-status:
                [{
                    "name": "openshift-sdn",
                    "interface": "eth0",
                    "ips": [
                        "10.254.12.15"
                    ],
                    "default": true,
                    "dns": {}
                }]
              openshift.io/deployment-config.name: mynodejs-app
              openshift.io/deployment.name: mynodejs-app-1
              openshift.io/scc: restricted
Status:       Failed
IP:           10.254.12.15
IPs:
  IP:  10.254.12.15
Containers:
  deployment:
    Container ID:   cri-o://fabbb5d524158ed68bd9a977169ba5eb5b2bd5f789d371839dde88aaf041f3af
    Image:          quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:a80a1fcfc1b6bdfb224716fb0bfabba78460481bd08a6161aef308993c9e6314
    Image ID:       quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:a80a1fcfc1b6bdfb224716fb0bfabba78460481bd08a6161aef308993c9e6314
    Port:           <none>
    Host Port:      <none>
    State:          Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Mon, 23 Nov 2020 01:41:07 -0800
      Finished:     Mon, 23 Nov 2020 01:51:09 -0800
    Ready:          False
    Restart Count:  0
    Environment:
      OPENSHIFT_DEPLOYMENT_NAME:       mynodejs-app-1
      OPENSHIFT_DEPLOYMENT_NAMESPACE:  iyisbvflno
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from deployer-token-rcx2w (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  deployer-token-rcx2w:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  deployer-token-rcx2w
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type    Reason          Age   From                                           Message
  ----    ------          ----  ----                                           -------
  Normal  Scheduled       14m   default-scheduler                              Successfully assigned iyisbvflno/mynodejs-app-1-deploy to worker0.rhodopowerci.cp.fyre.ibm.com
  Normal  AddedInterface  14m   multus                                         Add eth0 [10.254.12.15/22]
  Normal  Pulled          14m   kubelet, worker0.rhodopowerci.cp.fyre.ibm.com  Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:a80a1fcfc1b6bdfb224716fb0bfabba78460481bd08a6161aef308993c9e6314" already present on machine
  Normal  Created         14m   kubelet, worker0.rhodopowerci.cp.fyre.ibm.com  Created container deployment
  Normal  Started         14m   kubelet, worker0.rhodopowerci.cp.fyre.ibm.com  Started container deployment
[root@rhodopowerci-inf 825666372]#
[root@rhodopowerci-inf 825666372]# oc logs pod/mynodejs-app-1-deploy
--> Scaling mynodejs-app-1 to 1
error: update acceptor rejected mynodejs-app-1: pods for rc 'iyisbvflno/mynodejs-app-1' took longer than 600 seconds to become available
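
The Started and Finished times of the deployer container in the oc describe output are consistent with the 600-second limit named in this error. A quick check of the arithmetic, using only the two timestamps shown above (the variable names are illustrative):

```python
from email.utils import parsedate_to_datetime

# Timestamps copied from the "State: Terminated" section of oc describe.
started = parsedate_to_datetime("Mon, 23 Nov 2020 01:41:07 -0800")
finished = parsedate_to_datetime("Mon, 23 Nov 2020 01:51:09 -0800")

# Elapsed wall-clock time of the deployer container, in seconds.
elapsed = (finished - started).total_seconds()
print(elapsed)  # 602.0
```

The deployer ran for about 602 seconds, so it appears to have given up right after the 600-second wait for the component's pods expired rather than failing on its own.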

kadel commented 3 years ago

Hmm, I don't see any indicators of what could cause the failure :-( Can you please try one more time, but this time include the -v 4 argument in the odo push command?

Also, while odo push is running, can you run oc describe pod?

sarveshtamba commented 3 years ago

@kadel oc describe shows similar output as before. The odo push -v 4 output is below:

[root@rhodopowerci-inf 825666372]# odo project set iyisbvflno
Already on project : iyisbvflno
[root@rhodopowerci-inf 825666372]# odo push --context /tmp/825666372 -v 4
I1123 02:38:52.087808    2279 util.go:425] path /tmp/825666372/devfile.yaml doesn't exist, skipping it
I1123 02:38:52.087813    2279 util.go:730] HTTPGetRequest: https://raw.githubusercontent.com/openshift/odo/master/build/VERSION
I1123 02:38:52.088096    2279 util.go:751] Response will be cached in /tmp/odohttpcache for 1h0m0s
I1123 02:38:52.117136    2279 common_push.go:165] SourceLocation: ./
I1123 02:38:52.117167    2279 common_push.go:173] Source Path: /tmp/825666372
I1123 02:38:52.139143    2279 util.go:425] path /tmp/825666372/devfile.yaml doesn't exist, skipping it
Validation
 •  Checking component  ...
I1123 02:38:52.139262    2279 occlient.go:2781] Getting DeploymentConfig: mynodejs-app
I1123 02:38:52.215887    2279 component.go:532] Checking source location: ./
I1123 02:38:52.215933    2279 component.go:561] Validating configured memory values
I1123 02:38:52.215949    2279 component.go:561] Validating configured cpu values
 ✓  Checking component [76ms]
I1123 02:38:52.216016    2279 util.go:425] path /tmp/825666372/devfile.yaml doesn't exist, skipping it

Configuration changes
 •  Retrieving component data  ...
I1123 02:38:52.260697    2279 occlient.go:2781] Getting DeploymentConfig: mynodejs-app
I1123 02:38:52.285906    2279 pushed_component.go:198] Source for component mynodejs is  (local)
I1123 02:38:52.285958    2279 occlient.go:2781] Getting DeploymentConfig: mynodejs-app
I1123 02:38:52.312500    2279 component.go:1226] Updating component mynodejs, from local to ./ (local).
 ✓  Retrieving component data [96ms]
 •  Applying configuration  ...
I1123 02:38:52.339356    2279 occlient.go:764] Found exact image tag match for nodejs:latest
I1123 02:38:52.398184    2279 util.go:764] Cached response used.
 ✓  Applying configuration [121ms]

Applying URL changes
I1123 02:38:52.440321    2279 url.go:323] Listing routes with label selector: app.kubernetes.io/part-of=app,app.kubernetes.io/instance=mynodejs
I1123 02:38:52.440343    2279 occlient.go:2640] Listing routes with label selector: app.kubernetes.io/part-of=app,app.kubernetes.io/instance=mynodejs
 ✓  URLs are synced with the cluster, no changes are required.

Pushing to component mynodejs of type local
I1123 02:38:52.456734    2279 file_indexer.go:214] file added: /tmp/825666372/.gitignore
I1123 02:38:52.456819    2279 file_indexer.go:202] .odo or .git directory detected, skipping it
 ✓  Checking file changes for pushing [339731ns]
I1123 02:38:52.456904    2279 common_push.go:245] List of files to be deleted: +[]
I1123 02:38:52.456930    2279 common_push.go:268] Copying directory /tmp/825666372 to pod
I1123 02:38:52.456950    2279 component.go:650] PushLocal: componentName: mynodejs, applicationName: app, path: /tmp/825666372, files: [/tmp/825666372/.gitignore], delFiles: [], isForcePush: false
I1123 02:38:52.468862    2279 preference.go:209] The path for preference file is /root/.odo/preference.yaml
I1123 02:38:52.468912    2279 occlient.go:1871] Waiting for deploymentconfig=mynodejs-app pod
 •  Waiting for component to start  ...
I1123 02:38:52.484990    2279 occlient.go:1842] Warning Event: Count: 22, Reason: BackOff, Message: Back-off restarting failed container
I1123 02:39:52.488570    2279 streamwatcher.go:114] Unexpected EOF during watch stream event decoding: unexpected EOF
 ⚠  Watch channel was closed
I1123 02:39:52.488724    2279 streamwatcher.go:114] Unexpected EOF during watch stream event decoding: unexpected EOF
 ✗  Waiting for component to start [1m] [WARNING x22: BackOff]
 ✗  watch channel was closed

sarveshtamba commented 3 years ago

@kadel ^^

openshift-bot commented 3 years ago

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close. Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

openshift-bot commented 3 years ago

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity. Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

mohammedzee1000 commented 3 years ago

/lifecycle frozen

kadel commented 1 year ago

@sarveshtamba Do you still see the same errors with odo v3?

github-actions[bot] commented 11 months ago

A friendly reminder that this issue had no activity for 90 days. Stale issues will be closed after an additional 30 days of inactivity.

github-actions[bot] commented 10 months ago

This issue was closed because it has been inactive for 30 days since being marked as stale.