Open · djannot opened 4 years ago
I think I know what the issue is here. We re-wrote our CI/CD release pipeline for the 0.0.24 release and it looks like the `VERSION` didn't get picked up by the build, so it defaulted to `dev`. As such, the Istio Operator image is pulling in `dev` instead of `0.0.24`. I'll cut a new release to fix the version issue.
Hello, 0.0.25 still has this problem.
Yes, this should be fixed by #159
~Still seeing this issue after this change~
False alarm, this is indeed fixed by the 0.0.26 release!
Hello, 0.0.26 still has this problem.
```
[Envoy (Epoch 0)] [2020-08-25 03:34:34.117][23][warning][config][external/envoy/source/common/config/grpc_subscription_impl.cc:87] gRPC config for type.googleapis.com/envoy.api.v2.Listener rejected: Error adding/updating listener(s) 172.22.3.210_8000: Failed to initialize WASM code from /var/local/lib/wasme-cache/3f319eec32afdfb1c053e1aea3a665504ff9d5f5ea4019146bcb455dfaea29d1
virtualInbound: Failed to initialize WASM code from /var/local/lib/wasme-cache/3f319eec32afdfb1c053e1aea3a665504ff9d5f5ea4019146bcb455dfaea29d1
```
Hi @GuangTianLi, it looks like your issue is different. The original issue here was complaining about an invalid path, which was ultimately caused by the wrong version of the operator being loaded.
In your error message the WASM code is failing to initialize, but it's not complaining about an invalid path.
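One way to tell the two failure modes apart is to check whether the module file named in the error actually exists in the cache volume mounted into the sidecar. A minimal sketch, assuming a pod in the affected namespace and the cache path from the logs above (pod and namespace names are placeholders):

```sh
# Check whether the wasm module the listener refers to is present in the
# wasme cache volume mounted into the sidecar.
kubectl exec <affected-pod> -n <namespace> -c istio-proxy -- \
  ls -l /var/local/lib/wasme-cache/
```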
I'm sorry, my 0.0.26 version still has this problem. Can someone tell me why?
```
2020-08-31T02:55:52.212398Z info Envoy command: [-c etc/istio/proxy/envoy-rev0.json --restart-epoch 0 --drain-time-s 45 --parent-shutdown-time-s 60 --service-cluster details.istio-project --service-node sidecar~10.129.5.186~details-v1-5f8447ccd5-7ggl8.istio-project~istio-project.svc.cluster.local --max-obj-name-len 189 --local-address-ip-version v4 --log-format %Y-%m-%dT%T.%fZ %l envoy %n %v -l warning --component-log-level misc:error --concurrency 2]
2020-08-31T02:55:52.615490Z warning envoy config [bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:92] StreamAggregatedResources gRPC config stream closed: 14, no healthy upstream
2020-08-31T02:55:52.615558Z warning envoy config [bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:54] Unable to establish new stream
2020-08-31T02:55:52.642499Z warning envoy main [external/envoy/source/server/server.cc:475] there is no configured limit to the number of allowed active connections. Set a limit via the runtime key overload.global_downstream_max_connections
2020-08-31T02:55:52.748311Z info sds resource:default new connection
2020-08-31T02:55:52.748396Z info sds Skipping waiting for ingress gateway secret
2020-08-31T02:55:53.554728Z warning envoy config [bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:92] StreamAggregatedResources gRPC config stream closed: 14, no healthy upstream
2020-08-31T02:55:53.554766Z warning envoy config [bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:54] Unable to establish new stream
2020-08-31T02:55:53.563283Z info cache Root cert has changed, start rotating root cert for SDS clients
2020-08-31T02:55:53.563336Z info cache GenerateSecret default
2020-08-31T02:55:53.570992Z info sds resource:default pushed key/cert pair to proxy
2020-08-31T02:55:56.997075Z info sds resource:ROOTCA new connection
2020-08-31T02:55:56.997173Z info sds Skipping waiting for ingress gateway secret
2020-08-31T02:55:56.997203Z info cache Loaded root cert from certificate ROOTCA
2020-08-31T02:55:56.997280Z info sds resource:ROOTCA pushed root cert to proxy
2020-08-31T02:55:57.425220Z warning envoy config [external/envoy/source/common/config/grpc_subscription_impl.cc:101] gRPC config for type.googleapis.com/envoy.api.v2.Listener rejected: Error adding/updating listener(s) virtualInbound: Invalid path: /var/local/lib/wasme-cache/a515a5d244b021c753f2e36c744e03a109cff6f5988e34714dbe725c904fa917
2020-08-31T02:55:59.287257Z warn Envoy proxy is NOT ready: config not received from Pilot (is Pilot running?): cds updates: 1 successful, 0 rejected; lds updates: 0 successful, 1 rejected
2020-08-31T02:56:01.233766Z warn Envoy proxy is NOT ready: config not received from Pilot (is Pilot running?): cds updates: 1 successful, 0 rejected; lds updates: 0 successful, 1 rejected
2020-08-31T02:56:03.187032Z warn Envoy proxy is NOT ready: config not received from Pilot (is Pilot running?): cds updates: 1 successful, 0 rejected; lds updates: 0 successful, 1 rejected
2020-08-31T02:56:05.176523Z warn Envoy proxy is NOT ready: config not received from Pilot (is Pilot running?): cds updates: 1 successful, 0 rejected; lds updates: 0 successful, 1 rejected
2020-08-31T02:56:07.194144Z warn Envoy proxy is NOT ready: config not received from Pilot (is Pilot running?): cds updates: 1 successful, 0 rejected; lds updates: 0 successful, 1 rejected
2020-08-31T02:56:09.192120Z warn Envoy proxy is NOT ready: config not received from Pilot (is Pilot running?): cds updates: 1 successful, 0 rejected; lds updates: 0 successful, 1 rejected
2020-08-31T02:56:11.176369Z warn Envoy proxy is NOT ready: config not received from Pilot (is Pilot running?): cds updates: 1 successful, 0 rejected; lds updates: 0 successful, 1 rejected
2020-08-31T02:56:13.176285Z warn Envoy proxy is NOT ready: config not received from Pilot (is Pilot running?): cds updates: 1 successful, 0 rejected; lds updates: 0 successful, 1 rejected
2020-08-31T02:56:15.176349Z warn Envoy proxy is NOT ready: config not received from Pilot (is Pilot running?): cds updates: 1 successful, 0 rejected; lds updates: 0 successful, 1 rejected
2020-08-31T02:56:17.176047Z warn Envoy proxy is NOT ready: config not received from Pilot (is Pilot running?): cds updates: 1 successful, 0 rejected; lds updates: 0 successful, 1 rejected
```
Hi @pantianying @djannot,
This issue was actually resolved in https://github.com/solo-io/wasme/pull/95, but we have had some CI issues that are blocking it from getting merged, and at this point the PR needs to be updated. cc @yuval-k
A temporary workaround is to restart the target pods; Envoy should eventually pick up the wasm module file.
I believe the CI issues are related to an Envoy bug that was only recently fixed, i.e. the approach in the PR might only work for the next Istio release.
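For anyone hitting this before the fix lands, a minimal sketch of that restart workaround. The deployment and namespace names below are examples taken from the logs above, not a confirmed setup; adjust for your cluster:

```sh
# Restart the pods targeted by the wasme filter so the sidecar picks the
# cached wasm module up again (deployment/namespace names are examples).
kubectl rollout restart deployment/details -n istio-project

# Or simply delete the failing pod and let its controller recreate it.
kubectl delete pod <failing-pod> -n istio-project
```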
> Hi @pantianying @djannot,
> This issue was actually resolved in #95, but we have had some CI issues that are blocking it from getting merged, and at this point the PR needs to be updated. cc @yuval-k
> A temporary workaround is to restart the target pods; Envoy should eventually pick up the wasm module file.
Restart the target pods? When I run
`wasme deploy istio webassemblyhub.io/pantianying/add-header:v0.0.3`
the new pod can't init successfully. Do you mean that restarting the pod that couldn't init successfully will solve this issue?
@pantianying Yes, for now restarting the pod should fix it. Unfortunately, as @yuval-k mentioned, we're waiting for Istio to pull in the upstream Envoy fix for the issue that ultimately causes this cache race condition.
@Sodman It's still failing for me:

```
warning envoy config gRPC config for type.googleapis.com/envoy.config.listener.v3.Listener rejected: Error adding/updating listener(s) virtualInbound: Invalid path: /var/local/lib/wasme-cache/314c75ded0da28314381281e74ab8b91196055360bd7b57f132de21c2116b9a3
```

And then my pod crashes:

```
PostStartHookError: command 'pilot-agent wait' exited with 255: Error: timeout waiting for Envoy proxy to become ready. Last error: HTTP status code 503
```

Wasme version 0.0.32, Istio version 1.8.2, Kubernetes 1.18.6
I am able to make it work on `kind` locally, but it doesn't work on our on-prem cluster, so I am not sure where to start debugging this.
Hi @harpratap, do the logs from the `wasme` pod (which manages the cache) give any more insight? If the cache didn't pull the image correctly (could be an HTTP error), it's possible it never cached it, which would explain why it didn't get loaded into the proxy. If this is the case, you could try bouncing the cache pod to force a refresh.
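A minimal sketch of that check and the refresh, assuming the cache pod lives in the `wasme` namespace (as mentioned below); the exact pod name will vary per install:

```sh
# Inspect the wasme cache pod logs for image pull / HTTP errors.
kubectl get pods -n wasme
kubectl logs -n wasme <wasme-cache-pod>

# Bounce the cache pod to force it to re-pull and re-cache the module.
kubectl delete pod -n wasme <wasme-cache-pod>
```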
I guess it can be resolved by deleting the pod in the `wasme` namespace. 😆
I've followed this guide: https://docs.solo.io/web-assembly-hub/latest/tutorial_code/deploy_tutorials/deploying_with_istio/
I was able to deploy on my cluster running Istio 1.6.7, but then I got this error on all the pods from the `istio-proxy` container. The same filter works when I deploy it on Gloo.
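For reference, a minimal sketch of how to pull that error out of the sidecar logs on an affected pod (pod and namespace names are placeholders):

```sh
# Grab the sidecar logs from one of the affected pods to see the
# rejected-listener / wasm error in full.
kubectl logs <affected-pod> -n <namespace> -c istio-proxy | grep -i wasm
```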