solo-io / gloo

The Feature-rich, Kubernetes-native, Next-Generation API Gateway Built on Envoy
https://docs.solo.io/
Apache License 2.0

~20min to update the resource status(Accepted) #8655

Open totallyGreg opened 1 year ago

totallyGreg commented 1 year ago

Gloo Edge Product

Enterprise

Gloo Edge Version

v1.13.11

Kubernetes Version

Not sure what version.

Describe the bug

The customer deployed roughly 4000 resources and hit the configured CPU and memory limits.
The status update to Accepted took ~20 minutes.

2023-08-30: Status is blank and takes a while to update. Increasing the limits to 3 GB memory and 5 CPU stopped the OOM kills. Removing the CPU limits reduced the time for resources to show as Accepted to ~5 minutes.

disableKubernetesDestinations: true is set

Load is significantly lower (90%) than before, when limits were set. The customer removed the 4000 staged resources, but the time for a new resource's status to update to Accepted is still ~5 minutes.

Expected Behavior

A large set of applied resources should be processed quickly.

Removing this large set of resources should return processing speed to normal.

Steps to reproduce the bug

  1. Generate a large number of VirtualService (VS), Upstream (US), and RouteTable (RT) resources
  2. kubectl apply -f resources
  3. Notice that the time to update status increases significantly for the resources added.

Additional Environment Detail

The customer is using a custom operator to translate Gateway API resources into Gloo Upstreams, RouteTables, and VirtualServices.

Additional Context

No response

soloio-bot commented 1 year ago

Zendesk ticket #2595 has been linked to this issue.

totallyGreg commented 1 year ago

The ticket includes a large number of the submitted resources.

Customer updated ticket today:

We observed another issue: the Gloo route table state is sometimes returned as the numeric value 1 rather than the string "Accepted", causing our operator to fail to unmarshal the Gloo route table. After all resources are accepted by Gloo, the status is returned as "Accepted". Attached is the log message RT-state_1.log.

soloio-bot commented 11 months ago

Zendesk ticket #2595 has been linked to this issue.

github-actions[bot] commented 3 months ago

This issue has been marked as stale because of no activity in the last 180 days. It will be closed in the next 180 days unless it is tagged "no stalebot" or other activity occurs.