kubevirt / hyperconverged-cluster-operator

Operator pattern for managing multi-operator products
Apache License 2.0

kubevirt-web-ui: Failed to pull image "quay.io/kubevirt/kubevirt-web-ui:v0.1.10": rpc error: code = Unknown desc = Error reading manifest v0.1.10 in quay.io/kubevirt/kubevirt-web-ui: manifest unknown: manifest unknown #160

Closed dhiller closed 5 years ago

dhiller commented 5 years ago

Related to #143: after retrying the installation of the latest version of HCO from master on kubevirtci, I wanted to connect to the kubevirt console, so I fetched the route:

$ oc get route console -n kubevirt-web-ui
NAME      HOST/PORT                                PATH   SERVICES   PORT    TERMINATION          WILDCARD
console   kubevirt-web-ui.apps.test-1.tt.testing          console    https   reencrypt/Redirect   None

Browsing to this host produced the following output:

Application is not available
The application is currently not serving requests at this endpoint. It may not have been started or is still starting.
$ curl -v --insecure --HEAD https://kubevirt-web-ui.apps.test-1.tt.testing
*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to kubevirt-web-ui.apps.test-1.tt.testing (127.0.0.1) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/pki/tls/certs/ca-bundle.crt
  CApath: none
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server did not agree to a protocol
* Server certificate:
*  subject: CN=*.apps.test-1.tt.testing
*  start date: Jun 23 12:01:30 2019 GMT
*  expire date: Jun 22 12:01:31 2021 GMT
*  issuer: CN=ingress-operator@1561291288
*  SSL certificate verify result: self signed certificate in certificate chain (19), continuing anyway.
> HEAD / HTTP/1.1
> Host: kubevirt-web-ui.apps.test-1.tt.testing
> User-Agent: curl/7.64.0
> Accept: */*
> 
* HTTP 1.0, assume close after body
< HTTP/1.0 503 Service Unavailable
HTTP/1.0 503 Service Unavailable
< Pragma: no-cache
Pragma: no-cache
< Cache-Control: private, max-age=0, no-cache, no-store
Cache-Control: private, max-age=0, no-cache, no-store
< Connection: close
Connection: close
< Content-Type: text/html
Content-Type: text/html

< 
* Excess found in a non pipelined read: excess = 3131 url = / (zero-length body)
* Closing connection 0

I got the following logs for kubevirt-web-ui console:

$ kubectl get pods -n kubevirt-web-ui                         
NAME                       READY   STATUS             RESTARTS   AGE
console-64544b6686-hgvgk   0/1     ImagePullBackOff   0          18m
console-64544b6686-lcjnx   0/1     ImagePullBackOff   0          18m
$ kubectl logs -n kubevirt-web-ui console-64544b6686-hgvgk
Error from server (BadRequest): container "console" in pod "console-64544b6686-hgvgk" is waiting to start: trying and failing to pull image

Fortunately the OKD console was up, so I went to https://console-openshift-console.apps.test-1.tt.testing/k8s/ns/kubevirt-web-ui/pods/console-64544b6686-hgvgk/events and there I found which image was the problem:

Failed to pull image "quay.io/kubevirt/kubevirt-web-ui:v0.1.10": rpc error: code = Unknown desc = Error reading manifest v0.1.10 in quay.io/kubevirt/kubevirt-web-ui: manifest unknown: manifest unknown
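
The same event can also be read from the command line when the OKD console is not reachable. This is a minimal sketch, assuming `kubectl` access to the cluster. The helper names `pod_events` and `failed_image` are made up for illustration and are not part of any tool mentioned here:

```shell
# List the events recorded for one pod (hypothetical helper).
# Needs cluster access, so it is only defined here, not called.
pod_events() {
    # $1 = namespace, $2 = pod name
    kubectl get events -n "$1" --field-selector "involvedObject.name=$2"
}

# Read event text on stdin and print the image reference from any
# 'Failed to pull image "..."' message (hypothetical helper).
failed_image() {
    sed -n 's/.*Failed to pull image "\([^"]*\)".*/\1/p'
}
```

For the pod above, this could be used as `pod_events kubevirt-web-ui console-64544b6686-hgvgk | failed_image`.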

mcornea commented 5 years ago

I am seeing the same issue; it looks like a regression introduced by https://github.com/kubevirt/hyperconverged-cluster-operator/commit/9ca9f95313c4b5e28f168929a799b8affe02f2a6

tiraboschi commented 5 years ago

web-ui-operator versions come from https://github.com/kubevirt/web-ui-operator/releases where the latest is v0.1.10, while web-ui versions come from https://github.com/kubevirt/web-ui/releases where the latest is v2.0.0-14.8

So from our CSV we need the Hyperconverged Cluster Operator to install web-ui-operator v0.1.10 and ask it (via the WEB_UI_TAG env variable) to install web-ui v2.0.0-14.8. Automatically setting WEB_UI_TAG from the web-ui-operator version is definitely a bad idea: as this issue shows, it ends up as v0.1.10 instead of v2.0.0-14.8, and by design HCO depends only on web-ui-operator, not on web-ui.
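
The mismatch can be sketched in a few lines of shell. This is only an illustration: `WEB_UI_OPERATOR_TAG` and the `web_ui_image` helper are hypothetical names, while `WEB_UI_TAG`, the image repository, and the two version numbers come from this thread. The sketch does not check that the second image reference exists in the registry.

```shell
# The two components are released independently, so each needs its own tag.
WEB_UI_OPERATOR_TAG="v0.1.10"   # hypothetical variable name
WEB_UI_TAG="v2.0.0-14.8"        # handed to web-ui-operator via its environment

# Build a web-ui image reference for a given tag (hypothetical helper).
web_ui_image() {
    printf 'quay.io/kubevirt/kubevirt-web-ui:%s\n' "$1"
}

# Reusing the operator tag reproduces the reference that failed to pull;
# using WEB_UI_TAG gives the reference that was presumably intended.
web_ui_image "$WEB_UI_OPERATOR_TAG"
web_ui_image "$WEB_UI_TAG"
```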

As for the initial issue ( https://github.com/kubevirt/hyperconverged-cluster-operator/issues/143 ) that led to https://github.com/kubevirt/hyperconverged-cluster-operator/pull/152, we still have to find a way to get the web-ui version at HCO build time to "compose" a release, because using "latest" can just hide other problems.

In the oVirt project, for instance, we use manually maintained release files listing all the versions we want to include in a specific compose; see for example https://github.com/oVirt/releng-tools/blob/master/milestones/ovirt-4.3.5.conf (the full history is available under https://github.com/oVirt/releng-tools/tree/master/milestones )

Composing a release is still a manual process, but at least a single file is the source of truth for the whole compose, instead of multiple entry points in different files.
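
As a rough sketch of that idea, a compose could read every component version from one file. The `KEY=value` layout, the key names, and the `component_version` helper are assumptions made for illustration. They do not mirror the actual oVirt milestone format or anything that exists in HCO.

```shell
# Write a hypothetical single-source-of-truth file for one compose.
VERSIONS_FILE=$(mktemp)
cat > "$VERSIONS_FILE" <<'EOF'
# component versions for this compose (illustrative values from this thread)
WEB_UI_OPERATOR_TAG=v0.1.10
WEB_UI_TAG=v2.0.0-14.8
EOF

# Print the version recorded for a key; return non-zero if the key is missing.
component_version() {
    # $1 = key, $2 = versions file
    value=$(sed -n "s/^$1=//p" "$2")
    [ -n "$value" ] && printf '%s\n' "$value"
}
```

A build script could then set `WEB_UI_TAG=$(component_version WEB_UI_TAG "$VERSIONS_FILE")` and fail early when a component is not listed, rather than falling back to "latest".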