Closed CarsonCook closed 2 years ago
Keep in mind we need to support the HA setup here.
Depends on #1359
We heard from multiple extenders and from core services that they have a use case: "I want to talk to one particular instance." We heard different motivations for this:
- Because that instance is the one they need to talk to (a specific system, a specific console, ...)
- Because there may be state that does not get distributed to other instances (a session)
Joe's Tomcat server with 100 Java threads: when a user logs in, a thread is kept in anticipation of the user coming back. One user can hold many threads across multiple instances. When the user logs off, the thread can possibly be freed. z/OSMF actually has the same issue: the TSO session is long-lived and uses the same optimization. Mainframe workloads related to development or CI/CD tend to be isolated to particular LPARs.
- Load balancing (LB/DLB)
- Rate Limiting (RL/DRL)
- User Limiting - we need more information
I would argue that Rate Limiting, while useful for obvious reasons, should not be part of our MVP. Load balancing and rate limiting each have value on their own; one does not require the other. It seems we are in agreement that Rate Limiting can be sacrificed.
Based on a token that represents the client (the API ML auth cookie), we can recognize a client and provide deterministic routing. That means we would store where the client was routed in the past and distribute this knowledge between Gateways. Upon the next request from the client, we recognize them by the token and route the request to the previously chosen server.

Positives:
- No client interaction required

Negatives:
- We have to store, resolve, and lifecycle the session
- The client must be identifiable
When any client gets routed, we would return (as a cookie or header) information about where the request was routed. On the next call, the client can provide a token (cookie or header) and request the same instance as last time. The Gateway will see this request and route as requested.
Benefits:
- Less code to break
- Does not suffer from synchronization issues across Gateways
- Works also for unauthenticated requests
- The client can choose what instance it wants

Negatives:
- The client has to take action (could be alleviated by using cookies)
- Does not carry the rate limiting capabilities
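A minimal sketch of this client-driven variant. All names here (`ClientHintRouter`, `selectInstance`, the idea of an instance "hint" carried in a cookie or header) are illustrative assumptions, not actual API ML code:

```java
import java.util.List;

// Sketch of the client-driven option above. Names are illustrative,
// not actual API ML code.
public class ClientHintRouter {

    /**
     * If the client supplied an instance hint (e.g. via a cookie or header
     * set on a previous response) and that instance is still registered,
     * honor it; otherwise fall back to plain round robin.
     */
    public static String selectInstance(List<String> instances, String hint, int roundRobinCounter) {
        if (hint != null && instances.contains(hint)) {
            return hint; // client asked for an instance it was told about earlier
        }
        return instances.get(Math.floorMod(roundRobinCounter, instances.size()));
    }
}
```

On each response the Gateway would echo the chosen instance id back to the client, which is what makes the "client has to take action" negative largely disappear when cookies are used.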
When a user authenticates against an instance, the token will dictate where the user gets routed, in a predictable fashion.
Positives:
- The client does not need to take action

Negatives:
- How do we manage changing services?
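The predictable, token-driven routing described above can be sketched as a stateless hash of the user's identity (e.g. the subject of the API ML auth token) over the instance list. This is an illustrative approach, not the actual implementation:

```java
import java.util.List;

// Illustrative sketch only: deterministic routing derived from the user's
// identity, with no stored session state at all.
public class TokenHashRouter {

    /** Map a user id onto an instance by hashing; the same user always lands
     *  on the same instance as long as the instance list does not change. */
    public static String selectInstance(List<String> instances, String userId) {
        int index = Math.floorMod(userId.hashCode(), instances.size());
        return instances.get(index);
    }
}
```

This makes the "changing services" negative concrete: when an instance is added or removed, the modulo mapping shifts and most users are silently re-routed; a consistent-hashing ring would reduce that churn.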
Identifying the user/session:
- Token
- IP
Transferring the session:
- Cookie
- Header
Configuration:
- Default (off)
- A service can say what it wants
- How deep do we want to load balance? (serviceId <-> path, composite APIs like zosmf)
Transferability of solution to SC Gateway
Model rejection strategy:
- Reject
Security of headers:
- Header spoofing
@jandadav the extender has confirmed the client-based solution will work for them.
Proposal for follow-up stories to finish the load balancing implementation
As a Zowe conformant application developer
I can configure the load balancer for my service with predefined load balancing schemas
So that I can achieve the load balancing scheme that is desirable for my application
This will mean implementing:
- A PredicateFactory that is aware of the service's registration metadata
- Enhancing the context's Environment with the metadata
- Constructing the load balancing beans conditionally
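The conditional construction step could look something like the following sketch. The metadata key `apiml.lb.type` and the scheme names are assumptions for illustration, not a confirmed part of the registration metadata contract:

```java
import java.util.Map;

// Hypothetical sketch: decide which load-balancing scheme to wire up for a
// service based on its registration metadata. Key and values are assumed.
public class BalancerSelection {

    /** Default to plain round robin when the service declares no preference. */
    public static String chooseScheme(Map<String, String> metadata) {
        String type = metadata.getOrDefault("apiml.lb.type", "roundRobin");
        switch (type) {
            case "authentication":
                return "sticky-by-user";  // deterministic per-user routing
            case "headerRequest":
                return "client-hint";     // client requests a specific instance
            default:
                return "round-robin";
        }
    }
}
```

In a real PredicateFactory this decision would gate which load balancing beans get constructed for the service's route.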
As a Zowe conformant application developer
I can call my application's API with Zowe authentication through a single instance of API Gateway and always get to the same instance of my service for a given period of time
So that I can protect against additional user-related address spaces spawned by my application without changing its code
This will mean implementing a balancing bean that:
- Recognizes requests by Zowe authentication (the user). A user can have multiple JWTs, so we have to understand who is calling.
- Unauthenticated requests? Not sure if it's universal; Carson will check with the extender (pervasive or restrictive).
- If there is no preference, routes the request round robin and stores the preference.
- If there is a preference, routes the request to the same instanceId as the preference.
- Lifecycle: the preference expires after a configurable time period has elapsed since the last request.
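A self-contained sketch of such a balancing bean, with illustrative names (`StickyBalancer`, `choose`) and an in-memory preference map standing in for the shared store:

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch, not the actual implementation: an in-memory map of
// user -> preferred instance with a configurable expiry, falling back to
// round robin when no valid preference exists.
public class StickyBalancer {

    private record Preference(String instanceId, long lastUsed) {}

    private final Map<String, Preference> preferences = new ConcurrentHashMap<>();
    private final AtomicInteger counter = new AtomicInteger();
    private final long expiryMillis;

    public StickyBalancer(long expiryMillis) {
        this.expiryMillis = expiryMillis;
    }

    /** Reuse the stored preference if it is fresh and the instance is still
     *  registered; otherwise pick round robin and remember the choice. */
    public String choose(String user, List<String> instances, long nowMillis) {
        Preference p = preferences.get(user);
        if (p != null
                && nowMillis - p.lastUsed() <= expiryMillis
                && instances.contains(p.instanceId())) {
            preferences.put(user, new Preference(p.instanceId(), nowMillis)); // refresh expiry
            return p.instanceId();
        }
        String chosen = instances.get(Math.floorMod(counter.getAndIncrement(), instances.size()));
        preferences.put(user, new Preference(chosen, nowMillis));
        return chosen;
    }
}
```

The in-memory `preferences` map is exactly the piece that would have to move into a shared store for the multi-Gateway story so that all Gateway instances agree.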
As a Zowe conformant application developer
I can call my application's API with Zowe authentication through any instance of API Gateway and consistently get to the same instance of my service for a given period of time
So that I can protect against additional user-related address spaces spawned by my application without changing its code, and I can do that against any Gateway instance and get consistent behavior
This will mean that whatever was developed for the previous story will have to be stored in the Caching Service.
The current state of implemented infrastructure looks like this:
Is your feature request related to a problem? Please describe.
Services want to handle user requests from the same instance of that service, but the Gateway only balances workloads in a round robin fashion.
Describe the solution you'd like
Deterministic routing for user requests, where each request with a user ID goes to the same service instance it was originally routed to. Additionally, a configurable limit for the number of users each instance can handle.
MVP: see the outline in the discussion. The following is outdated.
Extensions
Additional context @dmcknight