Open cpinjani opened 3 months ago
Validation passed on the build below: EndpointAccessUpdate is applied successfully and is not reverted.
Rancher - v2.9-4814b506835d6118024dd3141bf2465bdafbb0f3-head
eks-operator - v1.9.3-rc.1
Logs:
time="2024-10-02T19:05:57Z" level=info msg="Updating public access to true and private access to true for cluster [cpinjani-eks6 (id: c-95ts8)]"
time="2024-10-02T19:05:59Z" level=info msg="Updating public access to true and private access to true for cluster [cpinjani-eks6 (id: c-95ts8)]"
time="2024-10-02T19:06:00Z" level=info msg="Cluster [cpinjani-eks6 (id: c-95ts8)] finished updating"
time="2024-10-02T19:06:00Z" level=info msg="Waiting for cluster [cpinjani-eks6 (id: c-95ts8)] to finish updating"
time="2024-10-02T19:06:00Z" level=info msg="Waiting for cluster [cpinjani-eks6 (id: c-95ts8)] to finish updating"
time="2024-10-02T19:06:30Z" level=info msg="Waiting for cluster [cpinjani-eks6 (id: c-95ts8)] to finish updating"
time="2024-10-02T19:07:00Z" level=info msg="Waiting for cluster [cpinjani-eks6 (id: c-95ts8)] to finish updating"
time="2024-10-02T19:07:31Z" level=info msg="Waiting for cluster [cpinjani-eks6 (id: c-95ts8)] to finish updating"
time="2024-10-02T19:08:01Z" level=info msg="Waiting for cluster [cpinjani-eks6 (id: c-95ts8)] to finish updating"
time="2024-10-02T19:08:31Z" level=info msg="Waiting for cluster [cpinjani-eks6 (id: c-95ts8)] to finish updating"
time="2024-10-02T19:09:01Z" level=info msg="Waiting for cluster [cpinjani-eks6 (id: c-95ts8)] to finish updating"
time="2024-10-02T19:09:32Z" level=info msg="Waiting for cluster [cpinjani-eks6 (id: c-95ts8)] to finish updating"
time="2024-10-02T19:10:02Z" level=info msg="Waiting for cluster [cpinjani-eks6 (id: c-95ts8)] to finish updating"
time="2024-10-02T19:10:32Z" level=info msg="Waiting for cluster [cpinjani-eks6 (id: c-95ts8)] to finish updating"
time="2024-10-02T19:11:03Z" level=info msg="Cluster [cpinjani-eks6 (id: c-95ts8)] finished updating"
.
.
.
time="2024-10-02T19:18:48Z" level=info msg="Updating public access to false and private access to true for cluster [cpinjani-eks6 (id: c-95ts8)]"
time="2024-10-02T19:18:49Z" level=info msg="Updating public access to false and private access to true for cluster [cpinjani-eks6 (id: c-95ts8)]"
time="2024-10-02T19:18:50Z" level=info msg="Cluster [cpinjani-eks6 (id: c-95ts8)] finished updating"
The issue still exists.
Operator version: rancher/eks-operator:v1.9.3-rc.2
Rancher: v2.9-99b2583e3370321a922e29a11ac5ff9f845baeb6-head
eks-config-operator-8dd669847-2csl5:time="2024-10-14T10:01:19Z" level=info msg="Updating public access to false and private access to true for cluster [pvala-eks-again (id: c-2nzpf)]"
eks-config-operator-8dd669847-2csl5:time="2024-10-14T10:01:28Z" level=info msg="Waiting for cluster [pvala-eks-again (id: c-2nzpf)] to finish updating"
eks-config-operator-8dd669847-2csl5:time="2024-10-14T10:01:29Z" level=info msg="Waiting for cluster [pvala-eks-again (id: c-2nzpf)] to finish updating"
eks-config-operator-8dd669847-2csl5:time="2024-10-14T10:02:00Z" level=info msg="Waiting for cluster [pvala-eks-again (id: c-2nzpf)] to finish updating"
eks-config-operator-8dd669847-2csl5:time="2024-10-14T10:02:30Z" level=info msg="Waiting for cluster [pvala-eks-again (id: c-2nzpf)] to finish updating"
eks-config-operator-8dd669847-2csl5:time="2024-10-14T10:03:01Z" level=info msg="Waiting for cluster [pvala-eks-again (id: c-2nzpf)] to finish updating"
eks-config-operator-8dd669847-2csl5:time="2024-10-14T10:03:32Z" level=info msg="Waiting for cluster [pvala-eks-again (id: c-2nzpf)] to finish updating"
eks-config-operator-8dd669847-2csl5:time="2024-10-14T10:04:03Z" level=info msg="Waiting for cluster [pvala-eks-again (id: c-2nzpf)] to finish updating"
eks-config-operator-8dd669847-2csl5:time="2024-10-14T10:04:34Z" level=info msg="Waiting for cluster [pvala-eks-again (id: c-2nzpf)] to finish updating"
eks-config-operator-8dd669847-2csl5:time="2024-10-14T10:05:04Z" level=info msg="Waiting for cluster [pvala-eks-again (id: c-2nzpf)] to finish updating"
eks-config-operator-8dd669847-2csl5:time="2024-10-14T10:05:35Z" level=info msg="Waiting for cluster [pvala-eks-again (id: c-2nzpf)] to finish updating"
eks-config-operator-8dd669847-2csl5:time="2024-10-14T10:06:06Z" level=info msg="Waiting for cluster [pvala-eks-again (id: c-2nzpf)] to finish updating"
eks-config-operator-8dd669847-2csl5:time="2024-10-14T10:06:18Z" level=info msg="Bringing up vpc"
eks-config-operator-8dd669847-2csl5:time="2024-10-14T10:06:37Z" level=info msg="Waiting for cluster [pvala-eks-again (id: c-2nzpf)] to finish updating"
eks-config-operator-8dd669847-2csl5:time="2024-10-14T10:06:51Z" level=info msg="Creating service role"
eks-config-operator-8dd669847-2csl5:time="2024-10-14T10:07:08Z" level=info msg="Waiting for cluster [pvala-eks-again (id: c-2nzpf)] to finish updating"
eks-config-operator-8dd669847-2csl5:time="2024-10-14T10:07:20Z" level=info msg="Waiting for cluster [pvala-eks-gpu (id: c-tq8nt)] to finish creating"
eks-config-operator-8dd669847-2csl5:time="2024-10-14T10:07:42Z" level=info msg="Updating public access to true and private access to false for cluster [pvala-eks-again (id: c-2nzpf)]"
eks-config-operator-8dd669847-2csl5:time="2024-10-14T10:07:46Z" level=info msg="Updating public access to true and private access to false for cluster [pvala-eks-again (id: c-2nzpf)]"
eks-config-operator-8dd669847-2csl5:time="2024-10-14T10:07:47Z" level=info msg="Cluster [pvala-eks-again (id: c-2nzpf)] finished updating"
eks-config-operator-8dd669847-2csl5:time="2024-10-14T10:07:51Z" level=info msg="Waiting for cluster [pvala-eks-gpu (id: c-tq8nt)] to finish creating"
eks-config-operator-8dd669847-2csl5:time="2024-10-14T10:07:51Z" level=info msg="Updating public access to true and private access to false for cluster [pvala-eks-again (id: c-2nzpf)]"
It updates to the desired config, but not always. The behavior is random: sometimes the update sticks, and sometimes the config is reverted. It only worked for me on the third try.
It happens with both imported and provisioned clusters.
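One way to spot these reverts in the operator logs is to collect the sequence of endpoint access settings requested per cluster. The following is a minimal Python sketch, assuming the "Updating public access to ... and private access to ..." message format seen in the logs in this thread. The function names (`access_transitions`, `was_reverted`) are hypothetical and not part of eks-operator.

```python
import re
from collections import defaultdict

# Matches the operator's endpoint access log message as it appears in this thread.
ACCESS_RE = re.compile(
    r"Updating public access to (true|false) and private access to (true|false) "
    r"for cluster \[(\S+) \(id: ([^)]+)\)\]"
)

# Three lines taken from the cpinjani-eks logs in this thread.
SAMPLE = [
    'time="2024-10-14T11:17:24Z" level=info msg="Updating public access to false and private access to true for cluster [cpinjani-eks (id: c-7bgv9)]"',
    'time="2024-10-14T11:20:22Z" level=info msg="Updating public access to true and private access to true for cluster [cpinjani-eks (id: c-7bgv9)]"',
    'time="2024-10-14T11:23:21Z" level=info msg="Updating public access to false and private access to true for cluster [cpinjani-eks (id: c-7bgv9)]"',
]


def access_transitions(log_lines):
    """Return, per cluster id, the ordered (public, private) settings requested."""
    seen = defaultdict(list)
    for line in log_lines:
        match = ACCESS_RE.search(line)
        if not match:
            continue
        public, private, _name, cluster_id = match.groups()
        setting = (public == "true", private == "true")
        # Collapse consecutive duplicates: the same update is logged more than once.
        if not seen[cluster_id] or seen[cluster_id][-1] != setting:
            seen[cluster_id].append(setting)
    return dict(seen)


def was_reverted(steps, desired):
    """True if `desired` was requested and a different setting was requested afterwards."""
    if desired not in steps:
        return False
    first = steps.index(desired)
    return any(step != desired for step in steps[first + 1:])


if __name__ == "__main__":
    for cluster_id, steps in access_transitions(SAMPLE).items():
        print(cluster_id, steps, was_reverted(steps, (False, True)))
```

A log window that spans several intentional edits will also show multiple settings, so the result only indicates a revert when the window covers a single user change.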
@valaparthvi I see the results below on eks-operator:v1.9.3-rc.2. The desired spec gets applied, then it seems to revert and get re-applied (we can file a separate issue for this). The final spec is the desired one.
Logs
time="2024-10-14T11:17:24Z" level=info msg="Updating public access to false and private access to true for cluster [cpinjani-eks (id: c-7bgv9)]"
time="2024-10-14T11:17:26Z" level=info msg="Updating public access to false and private access to true for cluster [cpinjani-eks (id: c-7bgv9)]"
time="2024-10-14T11:17:27Z" level=info msg="Cluster [cpinjani-eks (id: c-7bgv9)] finished updating"
time="2024-10-14T11:17:28Z" level=info msg="Updating public access to false and private access to true for cluster [cpinjani-eks (id: c-7bgv9)]"
time="2024-10-14T11:17:50Z" level=info msg="Waiting for cluster [cpinjani-eks (id: c-7bgv9)] to finish updating"
time="2024-10-14T11:17:50Z" level=info msg="Waiting for cluster [cpinjani-eks (id: c-7bgv9)] to finish updating"
time="2024-10-14T11:18:20Z" level=info msg="Waiting for cluster [cpinjani-eks (id: c-7bgv9)] to finish updating"
time="2024-10-14T11:18:51Z" level=info msg="Waiting for cluster [cpinjani-eks (id: c-7bgv9)] to finish updating"
time="2024-10-14T11:19:21Z" level=info msg="Waiting for cluster [cpinjani-eks (id: c-7bgv9)] to finish updating"
time="2024-10-14T11:19:51Z" level=info msg="Waiting for cluster [cpinjani-eks (id: c-7bgv9)] to finish updating"
time="2024-10-14T11:20:22Z" level=info msg="Updating public access to true and private access to true for cluster [cpinjani-eks (id: c-7bgv9)]"
time="2024-10-14T11:20:23Z" level=info msg="Updating public access to true and private access to true for cluster [cpinjani-eks (id: c-7bgv9)]"
time="2024-10-14T11:20:24Z" level=info msg="Cluster [cpinjani-eks (id: c-7bgv9)] finished updating"
time="2024-10-14T11:20:25Z" level=info msg="Updating public access to true and private access to true for cluster [cpinjani-eks (id: c-7bgv9)]"
time="2024-10-14T11:22:50Z" level=info msg="Waiting for cluster [cpinjani-eks (id: c-7bgv9)] to finish updating"
time="2024-10-14T11:22:50Z" level=info msg="Waiting for cluster [cpinjani-eks (id: c-7bgv9)] to finish updating"
time="2024-10-14T11:23:21Z" level=info msg="Updating public access to false and private access to true for cluster [cpinjani-eks (id: c-7bgv9)]"
time="2024-10-14T11:23:23Z" level=info msg="Updating public access to false and private access to true for cluster [cpinjani-eks (id: c-7bgv9)]"
time="2024-10-14T11:23:23Z" level=info msg="Cluster [cpinjani-eks (id: c-7bgv9)] finished updating"
time="2024-10-14T11:23:23Z" level=info msg="Waiting for cluster [cpinjani-eks (id: c-7bgv9)] to finish updating"
time="2024-10-14T11:23:24Z" level=info msg="Waiting for cluster [cpinjani-eks (id: c-7bgv9)] to finish updating"
time="2024-10-14T11:23:54Z" level=info msg="Waiting for cluster [cpinjani-eks (id: c-7bgv9)] to finish updating"
time="2024-10-14T11:24:24Z" level=info msg="Waiting for cluster [cpinjani-eks (id: c-7bgv9)] to finish updating"
time="2024-10-14T11:24:54Z" level=info msg="Waiting for cluster [cpinjani-eks (id: c-7bgv9)] to finish updating"
time="2024-10-14T11:25:25Z" level=info msg="Cluster [cpinjani-eks (id: c-7bgv9)] finished updating"
Interesting. I will retest this again and file another one. Thanks for testing.
I tested by updating the public access source config. The change was reverted, and the cluster did not return to the desired config: I waited almost 20 minutes, but it stayed reverted. It did work on the second try, though.
time="2024-10-15T08:01:57Z" level=info msg="Updating public access source config to [0.0.0.0/0 49.37.240.101/32] for cluster [pvala-eks-tbimported (id: c-cvrqh)]"
time="2024-10-15T08:01:58Z" level=info msg="Updating public access source config to [0.0.0.0/0 49.37.240.101/32] for cluster [pvala-eks-tbimported (id: c-cvrqh)]"
time="2024-10-15T08:01:58Z" level=info msg="Cluster [pvala-eks-tbimported (id: c-cvrqh)] finished updating"
time="2024-10-15T08:01:59Z" level=info msg="Updating public access source config to [0.0.0.0/0 49.37.240.101/32] for cluster [pvala-eks-tbimported (id: c-cvrqh)]"
time="2024-10-15T08:02:14Z" level=info msg="Waiting for cluster [pvala-eks-priv-only (id: c-v9hnw)] to finish updating"
time="2024-10-15T08:02:45Z" level=info msg="Cluster [pvala-eks-priv-only (id: c-v9hnw)] finished updating"
time="2024-10-15T08:02:47Z" level=info msg="Waiting for cluster [pvala-eks-tbimported (id: c-cvrqh)] to finish updating"
time="2024-10-15T08:02:47Z" level=info msg="Waiting for cluster [pvala-eks-tbimported (id: c-cvrqh)] to finish updating"
time="2024-10-15T08:03:18Z" level=info msg="Updating public access source config to [0.0.0.0/0] for cluster [pvala-eks-tbimported (id: c-cvrqh)]"
time="2024-10-15T08:03:19Z" level=info msg="Updating public access source config to [0.0.0.0/0] for cluster [pvala-eks-tbimported (id: c-cvrqh)]"
time="2024-10-15T08:03:20Z" level=info msg="Cluster [pvala-eks-tbimported (id: c-cvrqh)] finished updating"
time="2024-10-15T08:03:21Z" level=info msg="Waiting for cluster [pvala-eks-tbimported (id: c-cvrqh)] to finish updating"
time="2024-10-15T08:03:21Z" level=info msg="Waiting for cluster [pvala-eks-tbimported (id: c-cvrqh)] to finish updating"
time="2024-10-15T08:03:51Z" level=info msg="Cluster [pvala-eks-tbimported (id: c-cvrqh)] finished updating"
@cpinjani @valaparthvi it looks like a race condition that does not generate any errors. Changes to the EKS endpoint are usually made only when the cluster is first set up, and after that this option is changed rarely, perhaps once during the whole cluster lifetime. Usually no one changes multiple options at once, but even then a valid workaround is to reapply the changes, which fixes the problem. I think we are spending too many resources on this issue, which is not important.
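The "reapply the changes" workaround can be sketched as a small retry loop. This is only an illustration in Python under stated assumptions: `apply` and `observe` are hypothetical callables standing in for however the config is submitted and read back (Rancher UI, API, or the AWS console), and nothing here reflects how eks-operator itself handles updates.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def reapply_until_stable(
    apply: Callable[[T], None],   # hypothetical: submits the desired config
    observe: Callable[[], T],     # hypothetical: reads the config currently in effect
    desired: T,
    attempts: int = 3,
    settle_seconds: float = 0.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Apply `desired`, wait for it to settle, and reapply if it was reverted.

    Returns the number of apply calls made. Raises RuntimeError if the
    observed config still differs from `desired` after `attempts` tries.
    """
    for attempt in range(1, attempts + 1):
        apply(desired)
        sleep(settle_seconds)
        if observe() == desired:
            return attempt
    raise RuntimeError(f"config still not {desired!r} after {attempts} attempts")
```

The settle period matters because, in the logs in this thread, a revert shows up a few minutes after the first update finishes, so checking immediately could report success too early.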
I don't disagree with you, Michal. But since this is still an issue, would it be okay to keep this issue around to be fixed at a later stage?
yup, it will be reworked once again later on, back to backlog
Rancher version:
Cluster Type: Downstream EKS cluster
Describe the bug: EndpointAccessUpdate and LoggingUpdate are sporadically reverted, probably due to a race condition.
For example:
c756d5c6-eca7-3d88-99e8-3e3087078822 - Initial update
5a3cc622-d06d-3fa8-890b-3920ca1c8064 - Reverted back
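Assuming the IDs above are EKS update IDs, the history of endpoint access updates can be inspected through the EKS `ListUpdates` and `DescribeUpdate` API calls. The sketch below takes a client object with boto3-style `list_updates` and `describe_update` methods. The helper name `endpoint_access_history` is hypothetical, and pagination is left out for brevity.

```python
def endpoint_access_history(eks_client, cluster_name):
    """Return EndpointAccessUpdate records for a cluster, oldest first.

    `eks_client` is expected to behave like boto3.client("eks").
    Only the first page of update ids is read (no nextToken handling).
    """
    history = []
    response = eks_client.list_updates(name=cluster_name)
    for update_id in response.get("updateIds", []):
        update = eks_client.describe_update(
            name=cluster_name, updateId=update_id
        )["update"]
        if update.get("type") != "EndpointAccessUpdate":
            continue
        # Flatten the params list into a dict, e.g. {"EndpointPublicAccess": "false"}.
        params = {p["type"]: p["value"] for p in update.get("params", [])}
        history.append(
            {
                "id": update["id"],
                "status": update.get("status"),
                "createdAt": update.get("createdAt"),
                "params": params,
            }
        )
    history.sort(key=lambda record: record["createdAt"])
    return history


# Usage (requires AWS credentials and the boto3 package; not run here):
#   import boto3
#   for record in endpoint_access_history(boto3.client("eks"), "<cluster-name>"):
#       print(record)
```

Two consecutive records with opposite `EndpointPublicAccess` or `EndpointPrivateAccess` values, created minutes apart, would match the initial update and revert pattern described above.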
Steps
Logs:
PR's: