launchdarkly / ld-relay

LaunchDarkly Relay Proxy

Stop invalid environment keys from preventing relay start #73

Closed nagibyro closed 5 years ago

nagibyro commented 5 years ago

Is your feature request related to a problem? Please describe.

It'd be nice if there were a configuration option to allow the relay to continue startup even when some of the environment SDK keys are invalid.

The situation we ran into at my company was that we were using the relay for multiple projects, about 10 right now, but we expect that to grow, maybe even to 100 or more. One of the teams either removed an environment or reset their keys, which on the next restart of our relay caused:

ERROR: Received HTTP error 401 (invalid SDK key) for streaming connection - giving up permanently
ERROR: 2019/08/23 10:24:50 relay.go:555: Error initializing LaunchDarkly client for **********: LaunchDarkly client initialization failed

which then broke the relay for all other clients. This was in a non-prod environment, but it got us worried about deploys to prod. We're looking at adding a check in CI to make sure all the keys are valid, but that would only run during our deployment of the relay. It wouldn't help if the service was cycled for any other reason on the instance.

It'd be great if there were a CLI option that we could enable to have the relay start even with invalid keys, so that teams that need to reset or change SDK keys, or create or delete environments, can't break other apps that rely on the relay.

Describe the solution you'd like

Add a CLI option such as --allow-invalid-keys that would emit warnings when failing to connect to LaunchDarkly but would not stop the other environments in the config whose keys do work. Exposing these warnings as metrics that can be sent out to Datadog, etc., would be good for monitoring too.

eli-darkly commented 5 years ago

Hmm. It was my understanding that the default behavior of Relay is already what you're suggesting; it's not supposed to cancel startup when an individual environment fails, unless you set the ExitOnError property.

Could you be more specific about the behavior you saw? You said it "broke the relay for all other clients": do you mean that it exited, or it kept running but did not accept connections for the other environments, or it did accept connections but behaved incorrectly in some other way?

nagibyro commented 5 years ago

Ah, I didn't see the exitOnError option. It is set to true for us. My mistake! Thanks for your help, @eli-darkly.
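
For anyone who hits the same symptom, here is a minimal sketch of the relevant setting, assuming the INI-style ld-relay.conf file format. The environment names and SDK key placeholders are hypothetical; only exitOnError is the option discussed in this thread. With it set to false (or left unset), a failed environment should be logged without stopping the relay for the others, as described above.

```ini
[main]
; When true, the relay exits if any environment fails to initialize,
; e.g. on a 401 for an invalid SDK key. Leave false to keep serving
; the environments whose keys are still valid.
exitOnError = false

; Hypothetical environments, for illustration only.
[environment "Team A Production"]
sdkKey = "TEAM_A_SDK_KEY_PLACEHOLDER"

[environment "Team B Production"]
sdkKey = "TEAM_B_SDK_KEY_PLACEHOLDER"
```

If the relay is configured through environment variables instead of a file, check the ld-relay configuration docs for the equivalent setting rather than relying on this sketch.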