Closed james-milligan closed 1 year ago
I have begun research on the connect client and have put together a POC to demonstrate it. Whilst it is a new project I believe that it is a strong contender for a quick and effective improvement to the 'flagd for client' offering.
The connect-go library allows an http(s)/gRPC server to be opened on a single port (similar to the current usage of grpc-gateway) whilst also offering a new connect-protocol, which allows a lightweight web-based client to communicate with flagd.
This may present the opportunity to merge all 3 services into a single set of handlers, reducing code duplication and complexity (no more cmux juggling) and allowing the service to be run via a single goroutine within the Serve method. It will also remove the need for an envoy proxy, which should improve the performance of the service.
Shouldn't clientside evaluation be done by another implementation? flagD is specifically designed as a binary to run on a server somewhere.
Apologies if I have misunderstood your question. flagD would still be run as a binary on a server somewhere, but with additional configuration options that let a developer apply it to a wider range of use cases, such as allowing a low overhead alternative to http for communication directly between the client and evaluation server, or the implementation of authentication for the publicly exposed interface. Flagd would still be running server side, but a client would be able to communicate with it directly, without the need for an additional hop to an API server and without duplicating flags between a client and server implementation.
If this is clearer, let me know and I'll update my original comment.
Yep, to put it another way, we want a web OpenFeature SDK (something that doesn't exist yet, but isn't far off) to be able to register a flagd provider to evaluate flags in a web client without having all evaluation done on the server side.
From a flagd provider perspective, this means optimizations around caching and reducing http traffic. From the flagd perspective, it requires some means of exposing flagd publicly.
Such as allowing a low overhead alternative to http for communication directly between the client and evaluation server
Okay, I understand we could do some optimisations and perhaps explore the single port option you mentioned.
Implementation of authentication
Authentication comes with authorization, two opinionated and tightly coupled concepts. If we are going to start using JWTs with scopes and OAuth2 implementations, flagD is going to explode in complexity. Do we have a preference on OIDC?
We would also want to think about caching, expiration monitoring, traffic throttling and middleware components that don't currently exist.
I am not opposed to doing that work, but it feels to me that if we enabled the re-use of flagD as a library you could enable that work without forcing Kubernetes users to take those dependencies and codebase additions.
It would be useful for me to understand the motivation, use-case, live opportunity that this would enable. That might also help others speak a little on this topic.
WRT AuthN/Z, I think we can keep it very simple. OAuth2 seems like overkill. As long as the resources flagd exposes are readonly and insensitive (so far they are), I think we can avoid it.
Other vendors use a simple publicly distributed key that's sent to the browser, and it's really used to simply scope flags to a particular account to prevent collisions in flag names. Flagd isn't multi-tenant, so even that concern doesn't apply to us. Flagd also doesn't push its configuration down to clients for them to evaluate, so even hypothetical sensitive data in rules would never be sent over the wire.
Simply annotating some flags as being NOT publicly render-able may be sufficient, I think.
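To make the annotation idea concrete, here is a minimal sketch in Go. Everything in it is an assumption for illustration: the flag struct, the Public field and the resolvePublic helper are hypothetical and not part of flagd. A non-public flag is reported exactly like a missing one, so the public interface does not reveal that it exists.

```go
package main

import (
	"errors"
	"fmt"
)

// flag is a hypothetical in-memory flag definition.
// Public is the assumed "publicly render-able" annotation.
type flag struct {
	Value  interface{}
	Public bool
}

// errNotFound is returned for both missing and non-public flags.
var errNotFound = errors.New("flag not found")

// resolvePublic returns a flag's value only if the flag exists and is
// annotated as public.
func resolvePublic(flags map[string]flag, key string) (interface{}, error) {
	f, ok := flags[key]
	if !ok || !f.Public {
		return nil, errNotFound
	}
	return f.Value, nil
}

func main() {
	flags := map[string]flag{
		"new-layout":    {Value: true, Public: true},
		"backend-limit": {Value: 500, Public: false},
	}
	for _, key := range []string{"new-layout", "backend-limit", "unknown"} {
		v, err := resolvePublic(flags, key)
		fmt.Println(key, v, err)
	}
}
```

Flags without the annotation would stay server-only by default in this sketch, which seems the safer default for an interface that is exposed publicly.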
It would be useful for me to understand the motivation, use-case, live opportunity that this would enable. That might also help others speak a little on this topic.
Vendors have reported that about 30% of all their flag evaluations happen on their JS web clients. Obviously there's a large class of flags that impact backend behavior, but the aforementioned 30% are often about UX, layout, etc. These are frequently implemented entirely in the web client when a new feature flag impacting just the web app is added.
For this reason, some means of exposing flagd directly to the browser is valuable.
Having multiple configurable auth strategies makes sense. OAuth sounds like a complex initial target, whereas you could provide support for bearer tokens quickly through the configuration file or an environment variable.
The CIA triad is a good model for asking basic questions about the security requirements of a system. Let's do that in terms of flagd (and more generally the evaluation portion of most feature flag systems):
(C)onfidentiality: flag data is not confidential. The values returned aren't secrets. Rule definitions are not returned, just flag values. User data sent in the context to the flag system could contain PII, and generally should be protected. This is accomplished by encrypting communication from the client.
(I)ntegrity: flagd's evaluation API is read only. Integrity cannot be compromised because the system can only be mutated via internal configuration, not the flag evaluation API.
(A)vailability: Availability is important; availability of flagd must be maintained under high load or attacks designed to strain the system. Authentication in flagd doesn't help us much here. Parsing a bearer token or a JWT or an auth header in flagd would still leave us open to the most significant types of attack, like syn-floods, http floods, etc. Valid credentials aren't needed to perform these: an invalid auth header is still HTTP traffic that must be processed. These attacks are necessarily mitigated with infrastructure before they even get to the application.
I think all of this is why flag vendors generally don't do authentication or authorization in their client evaluation APIs. They do identification: they pass a token that identifies the tenant to the web client with read-only privileges. The token is not secret (https://docs.launchdarkly.com/sdk/concepts/client-side-server-side#client-side-id, https://help.split.io/hc/en-us/articles/360019916211). I'm not even sure we need to do this, since flagd is not multi-tenant.
TLDR: I think doing any form of AuthN/Z is not necessary with our current iteration of flagd. Steps need to be taken to secure flagd communication, but AuthN/Z is probably not one of them. To reassure you, here are the front-end flags being used by Netlify (they use LaunchDarkly) lifted right out of their UI code: https://clientstream.launchdarkly.com/eval/5e42e94d6bd2ec08061adf53/eyJrZXkiOiJhbm9ueW1vdXMtcXVva2thIiwiZW1haWwiOiIifQ
Thanks for the robust conversation so far. I think it's worth restating at this point what the goals implied in this issue are. If we are not looking to implement an authentication layer, is the real work here to add some bells onto the HTTP sync service (possibly tidy it up with the grpc service) and modify it to support confidential flags, @james-milligan? A list of the three key development features in the debate would be really useful.
Based upon @toddbaert's comment, it sounds a little early in the day for AuthN/Z. However, I believe it will still be worthwhile improving the availability / performance of the web client => flagd interface, so the key development features here would be:
I have already built a POC using the connect protocol, and I believe it meets all the requirements for web => flagd communication. However, if we wished to switch to this library to also serve the gRPC / HTTP interface, the routes used by flagd would change. These endpoints are not consumer facing, so this shouldn't be an issue, but it would require an update to the existing providers.
http://host:port/flags/myObjectFlag/resolve/object
becomes
http://host:port/schema.v1.Service/ResolveObject
with all fields being passed in the JSON request body
Sounds like points 1/2 are nice feature enhancements. As for high availability and caching, I think as you said, these are not in the purview of flagD and should be taken care of elsewhere ( especially HA ).
FlagD shouldn't need to be changed to support client-side feature flagging.
Discussions are required as to the approach we wish to take tailoring Flagd to better enable flag evaluation for client side projects.
At present the development has been targeted towards server side with a large focus on Kubernetes deployments. Additional functionality will be required to make the project more appropriate for use in web and mobile environments. An example of this is the possible introduction of an auth token system.
Another consideration is the clients used to communicate with Flagd. An example issue here would be the use of grpc-web, which is an expensive client to use. An alternative could be to use a connect client/server setup as an additional option for the service flag.
There will certainly be further possible enhancements to improve the Flagd for web / mobile offering; opening this issue to get the ball rolling!