kmesh-net / kmesh

High Performance ServiceMesh Data Plane Based on Programmable Kernel
https://kmesh.net
Apache License 2.0

TestRemoveAddNsOrServiceWaypoint flake #704

Closed hzxuzhonghu closed 3 months ago

hzxuzhonghu commented 3 months ago

https://github.com/kmesh-net/kmesh/actions/runs/10294849725/job/28493564917?pr=679

YaoZengzeng commented 3 months ago

It seems that the Kmesh daemon crashed during the test, causing the newly deployed waypoint to fail to take effect.

hzxuzhonghu commented 3 months ago

How do you see kmesh crash from the log?

YaoZengzeng commented 3 months ago

> How do you see kmesh crash from the log?

I ran the e2e tests locally and watched the pods in the cluster.

Also, the default call in the e2e tests has a retry mechanism: the call passes as long as a retry succeeds within 30s, unless you explicitly specify NoRetry. ref: https://github.com/istio/istio/blob/master/pkg/test/framework/components/echo/common/call.go#L65

Therefore, there is no problem with starting the test immediately after adding the use-waypoint label.

I think this test case will no longer be flaky once #706 is merged. We can continue to observe.