launchdarkly / ruby-server-sdk

LaunchDarkly Server-side SDK for Ruby
https://docs.launchdarkly.com/sdk/server-side/ruby

Flag Evaluation Issues in Rails Application #297

Closed sunnybogawat closed 3 weeks ago

sunnybogawat commented 2 months ago

Is this a support request? We are encountering an issue with LaunchDarkly feature flag evaluations in our Ruby on Rails application. The problem specifically occurs when the feature flag value is changed. The new value is not recognized by the Rails server until we restart the server.

Detailed Information:

Environment: Reproduced in every non-development environment. In development it works as expected.

Problem: Feature flags are not evaluated correctly after their value is changed. This discrepancy persists until the Rails server is restarted.

Observation: The issue is isolated to the Rails application context. When evaluating the same feature flags in the Rails console (using rails c), the evaluations work as expected and reflect the correct, updated flag values immediately.

Describe the bug

When calling LaunchDarkly.client.all_flags_state(user_obj) followed by flag_value(feature_name) in the Rails app, the flag value does not reflect the current state (e.g., the feature is turned ON but the call returns an incorrect value). However, after restarting the Rails server, the correct flag value is returned. Running the same logic in rails c (Rails console) works as expected, without needing to restart anything. The issue is only observed in the Rails app during regular requests, and does not occur in the Rails console.

To reproduce

1. Launch the Rails server (with no prior restart).
2. Toggle a feature flag (turn ON/OFF).
3. Call LaunchDarkly.client.all_flags_state(user_obj) followed by flag_value(feature_name) in the Rails app and observe the incorrect flag value.
4. Restart the Rails server and call the same function again; observe that the correct flag value is now returned.
5. Run the same logic in the Rails console (rails c) and observe the correct behavior without restarting the server.

Notes: We suspect this might be related to caching, threading, or client initialization after the Rails 6.1 upgrade, but we haven't identified the root cause. No caching or threading issues are apparent, and the user object passed to all_flags_state(user_obj) seems consistent across requests. We would appreciate any insights into why this behavior might be happening, or guidance on how to further debug this issue.


SDK version LaunchDarkly Ruby Client Version: [5.8.0]

Language version, developer tools Ruby on Rails Version: 6.1


keelerm84 commented 2 months ago

I'm sorry to hear you are experiencing this issue.

The SDK works by creating multiple threads upon initialization -- one for receiving flag updates, and another for sending event data. Typical Rails deployments use a forking model where the Rails app starts up, then workers are spawned from there. Those workers receive a copy of all the memory the parent process has, but they do NOT get a copy of any spawned threads.

The effect of this is that the client will seemingly never receive updates. Please refer to the Initialize the client while using a Rails application section of our docs for common solutions.
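The fork behavior described above can be reproduced with plain Ruby, without the SDK. The following is a minimal sketch, assuming a platform where `fork` is available (not Windows); the `updater` thread is a hypothetical stand-in for the SDK's update thread, not SDK code.

```ruby
# Stand-in for the SDK's background update thread (hypothetical; not SDK code).
updater = Thread.new { loop { sleep 0.05 } }

reader, writer = IO.pipe

pid = fork do
  # Child process: memory was copied, but only the forking thread is running.
  reader.close
  writer.puts(updater.alive?)
  writer.close
  exit!(0) # skip at_exit hooks inherited from the parent
end

writer.close
child_sees_alive = reader.read.strip == "true"
reader.close
Process.wait(pid)

parent_sees_alive = updater.alive?
updater.kill

puts "parent: updater alive? #{parent_sees_alive}"
puts "child:  updater alive? #{child_sees_alive}"
```

The parent should still see a live thread, while the child sees the same Thread object reported as dead. A client created before the fork ends up in that state inside each worker: the object exists, but nothing is feeding it updates.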

LaunchDarkly Ruby Client Version: [5.8.0]

This version of the SDK reached EOL in January of 2022. We recommend updating to the latest supported version (v8.7.0) to ensure you are receiving all security fixes and new feature releases.

sunnybogawat commented 2 months ago

I'll attempt to improve the situation by following your recommendations and updating the version.

keelerm84 commented 2 months ago

Please let me know if this resolves your issue, or if there is additional assistance we can provide. Thank you!

sunnybogawat commented 2 months ago

Our current implementation using version 5.8.0 of the launchdarkly-server-sdk has been analyzed.

Below is my LD client initialisation code

module LaunchDarkly
  def self.client
    config = if Rails.env.test?
               file_source = Rails.root.join('spec', 'fixtures', 'launch-darkly-test-flags.json')
               {
                 data_source: LaunchDarkly::FileDataSource.factory(paths: [file_source], auto_update: true),
                 send_events: false
               }
             else
               {
                 connect_timeout: 10,
                 read_timeout: 10,
                 stream: true
               }
             end

    @client ||= LaunchDarkly::LDClient.new(Settings.launch_darkly[:sdk_key], LaunchDarkly::Config.new(config))
  end

  def self.close_client
    @client&.close
  end
end

The provided code is a wrapper that checks the value of a feature flag. When the flag is turned on, the function feature_permission_enabled? continues to return false until the server is restarted. Upgrading to Rails 6.1 and using Ruby 2.6.6 has led to the observed behavior. Is there anything that needs to be changed or updated to address this problem?

module LaunchDarklyWrapper
  include LaunchDarkly
  def self.feature_permission_enabled?(subdomain, uuid, feature_name)
    feature_value(subdomain, uuid, feature_name).present?
  end

  def self.feature_value(subdomain, uuid, feature_name)
    user_obj = { key: "#{subdomain}:#{uuid}", custom: { selectedEntityId: subdomain } }
    all_states = LaunchDarkly.client.all_flags_state(user_obj)
    return false unless all_states.valid?
    all_states.flag_value(feature_name)
  end
end

keelerm84 commented 2 months ago

That initialization code is instantiating the client once and then re-using that connection, which is good. However, I can't tell solely from the information you've provided whether that instantiation could be happening before the Rails server finishes initializing.

It is definitely possible that you are calling your LaunchDarkly.client method early enough in the process that this is still occurring.

Instead of instantiating the client through this means, can you instantiate it as part of the Rails initialization framework (as suggested in the docs I linked)? Then you can check if the values are changing as a result of that bootstrapping process. If they are, then we know the problem is with your module method being called too early in the process.

This next part is unrelated to the problem at hand, but I wanted to offer some advice based on your code snippet:

  def self.feature_value(subdomain, uuid, feature_name)
    user_obj = { key: "#{subdomain}:#{uuid}", custom: { selectedEntityId: subdomain } }
    all_states = LaunchDarkly.client.all_flags_state(user_obj)
    return false unless all_states.valid?
    all_states.flag_value(feature_name)
  end

The point of this method is to determine the results of a single flag evaluation. But you are doing this by calling the all_flags_state method, which evaluates EVERY flag in your environment for that context.

Instead, you should be able to just call our underlying variation method for that one single flag.

  def self.feature_value(subdomain, uuid, feature_name)
    user_obj = { key: "#{subdomain}:#{uuid}", custom: { selectedEntityId: subdomain } }
    # Evaluate only the requested flag; nil is returned if it cannot be evaluated.
    return LaunchDarkly.client.variation(feature_name, user_obj, nil)
  end
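
To illustrate the cost difference between the two approaches, here is a toy sketch. `CountingClient` is a hypothetical stand-in that only counts evaluations; it is not the SDK client, and its `all_flags` method is an invented name that merely mimics the "evaluate everything" behavior described above. The flag keys and context are made up.

```ruby
# Toy stand-in for an SDK client; it only counts how many flags get evaluated.
class CountingClient
  attr_reader :evaluations

  def initialize(flags)
    @flags = flags
    @evaluations = 0
  end

  # Evaluates one flag, falling back to the default when the flag is unknown.
  def variation(key, _context, default)
    @evaluations += 1
    @flags.fetch(key, default)
  end

  # Evaluates every flag, the way an "all flags" snapshot has to.
  def all_flags(context)
    @flags.keys.to_h { |key| [key, variation(key, context, nil)] }
  end
end

flags = { "new-checkout" => true, "beta-search" => false, "dark-mode" => true }
context = { key: "acme:1234" }

# Snapshot approach: every flag is evaluated to read a single value.
snapshot_client = CountingClient.new(flags)
snapshot_value = snapshot_client.all_flags(context)["new-checkout"]

# Single-flag approach: one evaluation for one value.
single_client = CountingClient.new(flags)
single_value = single_client.variation("new-checkout", context, false)

puts "snapshot evaluations: #{snapshot_client.evaluations}"
puts "single evaluations:   #{single_client.evaluations}"
```

Both approaches return the same value for the flag, but the snapshot does work proportional to the number of flags in the environment on every call.
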
sunnybogawat commented 2 months ago

In our previous code, the LD client was defined in the initializer file, but it was only instantiated when the has_feature? method was first called. This was working as expected. However, after upgrading to Rails 6.1, we encountered an issue where the flag was not being evaluated correctly, so we updated the code to initialize the LD client from the Puma boot process. Even after this change, the issue is not resolved.

on_worker_boot do
  ActiveSupport.on_load(:active_record) do
    # Initialize LaunchDarkly client
    LaunchDarkly.client
  end
end

With this change, flag evaluation works as expected in the local development environment. When a flag changes, I can see the stream receiving updated flags. However, the issue still occurs in the integration and production environments.

Do you have any recommendations for upgrading to a version of LaunchDarkly that’s compatible with Rails 6.1 and Ruby 2.6.6?

I appreciate your advice on the code snippet and will keep it in mind!

sunnybogawat commented 2 months ago

Any update on the above observations?

nerdrew commented 1 month ago

What http server are you using (e.g. unicorn, puma, etc)? Does it fork?

sunnybogawat commented 1 month ago

We are using puma server.

keelerm84 commented 1 month ago

@sunnybogawat Sorry for the delay on this.

I tried testing this on a new rails project and it seems to work without issue for me.

In config/puma.rb, I added the following config

require 'ldclient-rb'

workers ENV.fetch("WEB_CONCURRENCY") { 10 }

preload_app!

on_worker_boot do
  Rails.configuration.client = LaunchDarkly::LDClient.new(ENV['LAUNCHDARKLY_SDK_KEY'])
end

If you aren't running puma in clustered mode, then you shouldn't need to wrap the client instantiation in the on_worker_boot callback.

To make the instance also available in the Rails console, I added a `config/initializers/launchdarkly.rb` file that contains

require 'ldclient-rb'
Rails.application.console do
  Rails.configuration.client = LaunchDarkly::LDClient.new(ENV['LAUNCHDARKLY_SDK_KEY'])
end

One thing to note is that I did do this testing with the latest version of the SDK. I don't think this should matter, but it is worth mentioning since you stated you are on v5.8.0. I did want to point out that the 5.x release series has been EOL since 2022-01-26 so we strongly encourage you work on updating.

sunnybogawat commented 1 month ago

Thanks for the response. We are doing the same thing, but it has not solved my issue.

In my puma.rb file I have:

before_fork do
  # Close the LaunchDarkly client to prevent resource leakage
  LaunchDarkly.close_client
end

on_worker_boot do
  # Initialize LaunchDarkly client
  LaunchDarkly.client
end

config/initializers/launch_darkly.rb

module LaunchDarkly
  def self.client
    config = if Rails.env.test?
               file_source = Rails.root.join('spec', 'fixtures', 'launch-darkly-test-flags.json')
               {
                 data_source: LaunchDarkly::FileDataSource.factory(paths: [file_source], auto_update: true),
                 send_events: false
               }
             else
               {
                 connect_timeout: 10,
                 read_timeout: 10,
                 stream: true
               }
             end

    # Set sdk_key to LaunchDarkly SDK key before running
    # It's important to make LDClient a singleton. The client instance maintains
    # an internal state that allows us to serve feature flags without making any remote requests.
    # Be sure not to instantiate a new client with every request.
    @client ||= LaunchDarkly::LDClient.new(Settings.launch_darkly[:sdk_key], LaunchDarkly::Config.new(config))
  end

  def self.close_client
    # Here we ensure that the SDK shuts down cleanly and has a chance to deliver analytics
    # events to LaunchDarkly before the program exits. If analytics events are not delivered,
    # the user properties and flag usage statistics will not appear on your dashboard. In a
    # normal long-running application, the SDK would continue running and events would be
    # delivered automatically in the background.
    @client&.close
  end
end

keelerm84 commented 4 weeks ago

Calling @client&.close will shut down the thread that receives updates from the LaunchDarkly API. So once you call your close_client function, the @client instance is essentially frozen in time.

In on_worker_boot, you try to re-initialize the client by calling your client method. But when you hit this line:

    @client ||= LaunchDarkly::LDClient.new(Settings.launch_darkly[:sdk_key], LaunchDarkly::Config.new(config))

you aren't creating a new instance of the client since that variable isn't nil.

If you change your close_client method to set @client = nil after your null-safe close call, I think that might resolve your issue.
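The memoization problem and the suggested fix can be shown in isolation. This is a minimal sketch, not the SDK itself: `FakeClient`, `BuggyHolder`, and `FixedHolder` are hypothetical names, and `FakeClient` only models the fact that a closed client stays closed.

```ruby
# Stand-in for LDClient: once closed, it stops receiving updates.
class FakeClient
  def initialize
    @closed = false
  end

  def close
    @closed = true
  end

  def closed?
    @closed
  end
end

module BuggyHolder
  def self.client
    @client ||= FakeClient.new
  end

  def self.close_client
    @client&.close # @client still points at the closed instance
  end
end

module FixedHolder
  def self.client
    @client ||= FakeClient.new
  end

  def self.close_client
    @client&.close
    @client = nil # let the next call to client build a fresh instance
  end
end

# Simulate puma: before_fork closes the client, on_worker_boot asks for it again.
BuggyHolder.client
BuggyHolder.close_client
buggy_after_boot = BuggyHolder.client

FixedHolder.client
FixedHolder.close_client
fixed_after_boot = FixedHolder.client

puts "buggy holder returns closed client? #{buggy_after_boot.closed?}"
puts "fixed holder returns closed client? #{fixed_after_boot.closed?}"
```

With the buggy holder, the worker gets back the same closed object because `||=` only assigns when the variable is nil or false. Resetting the variable in `close_client` is what lets `on_worker_boot` create a fresh client in each worker.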

sunnybogawat commented 3 weeks ago

I appreciate the information you provided. I have tested the solution and it was successful.

sunnybogawat commented 3 weeks ago

I appreciate all the details and discussion we've had about this matter. I am now concluding our conversation.