rancher / dashboard

The Rancher UI
https://rancher.com
Apache License 2.0

Limit Resources Watched on Norman Websockets #7906

Open KevinJoiner opened 1 year ago

KevinJoiner commented 1 year ago

Setup

Describe the bug

Currently, the Dashboard opens a WebSocket to Norman to track resource changes using v3/subscribe. To serve this request, Rancher starts a watch against k8s for 77 different CRDs. The Dashboard may only need change events for some of these resources, so Rancher spends extra resources watching CRDs that are not used, and the Dashboard spends extra resources handling changes it does not need. The Dashboard can limit updates to only the required CRDs with the query parameter resourceTypes. For example, to watch authconfig and users, the query would be /v3/subscribe?resourceType=authconfig&resourceType=user.
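A minimal sketch of how a client could build such a filtered subscribe URL. The helper name `buildSubscribeUrl` and the host `rancher.example.com` are hypothetical and not taken from the Dashboard codebase. The description above spells the parameter both as resourceTypes and as resourceType, so the name is left configurable here and defaults to the spelling used in the example URL.

```typescript
// Hypothetical helper (not part of the Dashboard codebase): builds a
// /v3/subscribe URL that repeats one query parameter per resource type.
function buildSubscribeUrl(
  baseUrl: string,
  resourceTypes: string[],
  // Assumption: defaults to the spelling used in the example URL above.
  paramName: string = "resourceType",
): string {
  const url = new URL("/v3/subscribe", baseUrl);
  for (const type of resourceTypes) {
    // append() keeps repeated keys instead of overwriting them
    url.searchParams.append(paramName, type);
  }
  return url.toString();
}

// Example: watch only authconfig and user.
const filteredUrl = buildSubscribeUrl("wss://rancher.example.com", [
  "authconfig",
  "user",
]);
console.log(filteredUrl);
```

Passing an empty list produces the unfiltered /v3/subscribe URL, which, per the description above, watches every default CRD.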

To Reproduce

  1. Start a new Rancher Server
  2. Inspect the Norman WebSocket traffic on /v3/subscribe
  3. Navigate to the Global Settings page
  4. Change an arbitrary setting, such as password-min-length, from 12 to 11

Result: The setting change is sent on the WebSocket.

Expected Result: Only changes for specific resources are sent on the WebSocket.

Additional context

The following CRDs are watched by default for a Norman WebSocket with no query parameters.

projectCatalog, groupMember, roleTemplate, oidcConfig, clusterMonitorGraph, cisBenchmarkVersion, user, cloudCredential, node, shibbolethConfig, managementSecret, nodeDriver, clusterScan, preference, authConfig, template, clusterRegistrationToken, kontainerDriver, rkeAddon, clusterAlertGroup, clusterRoleTemplateBinding, podSecurityPolicyTemplateProjectBinding, composeConfig, cisConfig, rancherUserNotification, samlToken, clusterAlert, globalDnsProvider, freeIpaConfig, cluster, adfsConfig, project, activeDirectoryConfig, localConfig, catalogTemplateVersion, userAttribute, projectNetworkPolicy, projectAlert, podSecurityPolicyTemplate, pingConfig, projectAlertRule, clusterAlertRule, token, setting, dynamicSchema, clusterTemplateRevision, githubConfig, notifier, projectAlertGroup, projectRoleTemplateBinding, keyCloakConfig, etcdBackup, openLdapConfig, rkeK8sSystemImage, globalRoleBinding, rkeK8sServiceOption, multiClusterApp, catalog, nodeTemplate, templateContent, keyCloakOIDCConfig, clusterTemplate, group, azureADConfig, catalogTemplate, googleOauthConfig, oktaConfig, globalDns, globalRole, clusterCatalog, multiClusterAppRevision, fleetWorkspace, feature, nodePool, projectMonitorGraph, templateVersion, monitorMetric

richard-cox commented 1 year ago

There's an initial investigation on removing Norman entirely (see confluence). That lists which Norman resources we use and what their subscriptions affect. Note that the list doesn't include the auth configs for specific providers from above, which we should also include (oidcConfig, keyCloakOIDCConfig, keyCloakConfig, etc.). There may be others I've missed, so it needs a review.

This would be a tiny change with a large testing effort, but a great performance improvement. We just need to be super careful about the list of resources to subscribe to.
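One way to guard against a mistyped entry in that list is to compare the candidate subscription list with the default watched list from the issue description. The sketch below is only an illustration: `unknownTypes` is a hypothetical helper, and the case-insensitive comparison is an assumption based on the issue using both authconfig and authConfig. It cannot detect a resource that is missing from the candidate list altogether; that still needs manual review.

```typescript
// Hypothetical check: returns the requested types that do not appear in the
// known list, compared case-insensitively (an assumption, see lead-in).
function unknownTypes(requested: string[], known: string[]): string[] {
  const knownSet = new Set(known.map((t) => t.toLowerCase()));
  return requested.filter((t) => !knownSet.has(t.toLowerCase()));
}

// A small excerpt of the default watched list from the issue description.
const knownExcerpt = ["authConfig", "user", "setting", "oidcConfig"];

// "settings" is a deliberate typo to show what the check reports.
console.log(unknownTypes(["authconfig", "user", "settings"], knownExcerpt));
```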

As part of this, we should also stop sending watch/unwatch messages to the Norman socket.

gaktive commented 1 year ago

@richard-cox is this still relevant for the current work on performance improvements? Otherwise, can we push this out to Q3?

richard-cox commented 1 year ago

@gaktive This would be really good to do performance-wise (frontend, but mostly backend). There's a medium level of risk, in that we need to ensure no resource is missed. It wouldn't take long to implement.

However it's not linked to anything we've specifically committed to delivering and could be a candidate to bump.

moio commented 11 months ago

Potentially related: https://jira.suse.com/browse/SURE-6952

gaktive commented 10 months ago

We will push this back until we know the future of the Norman API; we will focus on the Steve API first.