l7mp / stunner

A Kubernetes media gateway for WebRTC. Contact: info@l7mp.io
https://l7mp.io
MIT License

cds-client ERROR: failed to init CDS watcher #110

Closed vipcxj closed 8 months ago

vipcxj commented 8 months ago
cds_client.go:59: cds-client ERROR: failed to init CDS watcher (url: ws://10.16.0.9:13478/api/v1/configs/stunner/stunner-study-ai?watch=true): websocket: bad handshake

What am I doing wrong? I have restarted stunner-gateway-operator-controller-manager, but the issue did not go away.

I found that stunner was updated two days ago, and that the docker.io/l7mp/stunnerd image created by stunner-operator uses the latest tag. Perhaps that causes the bug. So I tried to update the stunner-operator, but that failed with:

ERROR   setup   problem running operator    {"error": "cannot register gatewayconfig controller: no matches for kind \"GatewayConfig\" in version \"stunner.l7mp.io/v1\""}

So I modified my stunner config. But I can't deploy it, because:

Error: UPGRADE FAILED: [resource mapping not found for name: "stunner-study-ai" namespace: "stunner" from "": no matches for kind "Gateway" in version "gateway.networking.k8s.io/v1"
ensure CRDs are installed first, resource mapping not found for name: "stunner-study-ai" namespace: "" from "": no matches for kind "GatewayClass" in version "gateway.networking.k8s.io/v1"
ensure CRDs are installed first, resource mapping not found for name: "stunner-study-ai" namespace: "stunner" from "": no matches for kind "GatewayConfig" in version "stunner.l7mp.io/v1"
ensure CRDs are installed first]

It seems like a chicken-and-egg problem. I think the image tag should not be latest.

In the end, I manually deleted all the CRDs that stunner created and then uninstalled the operator. After I reinstalled the operator, the CRDs were updated, so that error disappeared, and the first error disappeared as well. But I found that ephemeral auth still does not work. For details, please see my other issue.

Though I resolved this issue myself, I think it really is a bug: the 'latest' tag leads to incompatible versions, and reinstalling the operator does not update the CRDs, so the operator does not work.
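
A "no matches for kind ... in version ..." error generally means the CRD in the cluster does not serve the requested API version. A minimal sketch for checking which versions are served follows. The CRD names are assumptions inferred from the kinds in the errors above (not confirmed in this thread), and the DRY_RUN switch is a hypothetical convenience so the script only prints the commands by default.

```shell
#!/bin/sh
# Sketch: list which API versions each CRD currently serves.
# DRY_RUN=1 (default) only prints the commands; set DRY_RUN=0 to run them.
DRY_RUN="${DRY_RUN:-1}"

# Assumed CRD names, inferred from the kinds named in the error messages.
CRDS="gatewayconfigs.stunner.l7mp.io gateways.gateway.networking.k8s.io gatewayclasses.gateway.networking.k8s.io"

check_crd_versions() {
    for crd in $CRDS; do
        if [ "$DRY_RUN" = "1" ]; then
            echo "+ kubectl get crd $crd -o jsonpath={.spec.versions[*].name}"
        else
            printf '%s: ' "$crd"
            # Prints the version names (for example v1alpha1 or v1) defined in the CRD.
            kubectl get crd "$crd" -o jsonpath='{.spec.versions[*].name}'
            echo
        fi
    done
}

check_crd_versions
```

If v1 is missing from the output for a CRD, the cluster still has the old definition, which would be consistent with the errors quoted above.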

rg0now commented 8 months ago

Thanks for the report. This is caused by a bug in our Helm charts. As for the auth problem, let's handle that at https://github.com/l7mp/stunner/issues/102.

vipcxj commented 8 months ago

Thanks for your quick response. There is another problem. I tried to update the operator using these commands:

helm repo update stunner
helm upgrade -i stunner-gateway-operator stunner/stunner-gateway-operator --create-namespace \
    --namespace=stunner-system

But the CRDs were not updated, which triggered my second problem. Then I tried uninstalling first:

helm uninstall --namespace stunner-system stunner-gateway-operator
kubectl delete namespace stunner-system

Then I reinstalled it, but it still did not work. Finally, I created a crds.yaml, copied all the CRDs from the source into it, and ran:

kubectl delete -f crds.yaml

Then I reinstalled the operator, and it worked. Is this an issue with k8s, or with the stunner Helm chart?
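
The sequence that worked can be collected into one script. This is only a sketch of the workaround described in this comment, not an official upgrade procedure. It assumes crds.yaml is the hand-made file mentioned above, and the DRY_RUN switch and the run helper are hypothetical additions so that nothing is executed by default. Note that deleting a CRD also deletes every custom resource of that kind, so existing Gateway and GatewayConfig objects must be re-applied afterwards.

```shell
#!/bin/sh
# Sketch: remove the operator and its stale CRDs, then reinstall.
# DRY_RUN=1 (default) only prints the commands; set DRY_RUN=0 to run them.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

reinstall_operator() {
    # Remove the old operator release.
    run helm uninstall --namespace stunner-system stunner-gateway-operator
    # Delete the stale CRDs listed in the hand-made crds.yaml.
    # WARNING: this also removes all custom resources of those kinds.
    run kubectl delete -f crds.yaml --ignore-not-found
    # Fetch the current chart and install it, so the CRDs are created fresh.
    run helm repo update stunner
    run helm upgrade -i stunner-gateway-operator stunner/stunner-gateway-operator \
        --create-namespace --namespace=stunner-system
}

reinstall_operator
```

Running it as is prints the four commands for review. Running it with DRY_RUN=0 executes them against the current kubectl context.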

rg0now commented 8 months ago

Unfortunately, Helm and CRDs are known to cause problems. Hope this is resolved now.

rg0now commented 8 months ago

Fixed in https://github.com/l7mp/stunner-helm/commit/d84b26e8efc1d0d7b03590a67aab4f90d5d33331