amplitude / experiment-ruby-server

Amplitude Experiment Ruby Server SDK
MIT License
2 stars 4 forks source link

LocalEvaluation + Puma with Worker Processes Requires Reconnect #40

Open therabidbanana opened 1 year ago

therabidbanana commented 1 year ago

I set up AmplitudeExperiment with LocalEvaluation in a Puma environment, calling #start on the client during Rails initialization.

Expected Behavior

Each worker process is able to see updates to flag configuration or error is logged alerting you to dead polling thread and potentially stale results.

Current Behavior

No errors are raised, but the flag configurations are not updated after processes fork and so the flag configurations become stale even though logs show polling behavior happening.

It appears to be that the polling thread is dead when logging from the worker, but the thread is still active in the main process and logging updates (with the logger fix from #38) , which is misleading.

Captured below - we see a fetch of flagVersion 17 but then a few seconds later when inspecting the experiment client we see it is stuck at 16 and poller_thread shows status of dead in the inspect:


08:26:11 web.1    |
08:26:29 web.1    | [Experiment] Fetch flag configs: [....,"flagKey":"initial_test","flagVersion":17,......]
08:26:33 web.1    | Started GET "/health" for ::1 at 2023-05-22 08:26:33 -0700
08:26:33 web.1    | Processing by HealthController#index as HTML
08:26:33 web.1    |   ↳ app/controllers/health_controller.rb:5:in `index'
08:26:34 web.1    | #<AmplitudeExperiment::LocalEvaluationClient:0x000000010fc25ed8 @api_key="server-,...", @config=#<AmplitudeExperiment::LocalEvaluationConfig:0x000000010fc26018 @debug=true, @server_url="https://api.lab.amplitude.com", @bootstrap={}, @flag_config_polling_interval_millis=30000>, 
@flags="[{.... \"flagKey\":\"initial_test\",\"flagVersion\":16,...", 
@flags_mutex=#<Thread::Mutex:0x000000010fc0cac8>, 
@logger=#<Logger:0x000000010fc0ca50 @level=0, @progname=nil, @default_formatter=#<Logger::Formatter:0x000000010fc0c988 @datetime_format=nil>, @formatter=nil, @logdev=#<Logger::LogDevice:0x000000010fc0c898 @shift_period_suffix=nil, @shift_size=nil, @shift_age=nil, @filename=nil, @dev=#<IO:<STDOUT>>, @binmode=false, @mon_data=#<Monitor:0x000000010fc0c820>, @mon_data_owner_object_id=64200>>, @fetcher=#<AmplitudeExperiment::LocalEvaluationFetcher:0x000000010fc0c640 @api_key="server-Xz4ocdyihcxoSRvSeaJZiwdQEBWdUFri", @uri="https://api.lab.amplitude.com/sdk/rules?eval_mode=local", @debug=true, @http=#<Net::HTTP api.lab.amplitude.com:443 open=true>, @logger=#<ActiveSupport::Logger:0x000000010f3d8c80 @level=0, @progname=nil, @default_formatter=#<Logger::Formatter:0x000000010f3e17e0 @datetime_format=nil>, @formatter=#<ActiveSupport::Logger::SimpleFormatter:0x000000010f3d8ac8 @datetime_format=nil, @thread_key="activesupport_tagged_logging_tags:64240">, @logdev=#<Logger::LogDevice:0x000000010f3e16a0 @shift_period_suffix="%Y%m%d", @shift_size=1048576, @shift_age=0, @filename="/Users/david/pathstream/intake/log/development.log", @dev=#<File:/Users/david/pathstream/intake/log/development.log>, @binmode=false, @mon_data=#<Monitor:0x000000010f3e1628>, @mon_data_owner_object_id=59680>>>, 
@is_running=true, 
@poller_thread=#<Thread:0x000000010fc64de0 /Users/david/.rvm/gems/ruby-3.1.3/gems/amplitude-experiment-1.1.0/lib/experiment/local/client.rb:83 dead>>

Possible Solution

Either documentation to describe how to achieve expected behavior (you must call #start after process fork), or better handling of this case to automatically achieve desired behavior (example from Redis client watches a cached @pid variable to detect a fork: https://github.com/redis-rb/redis-client/commit/bec4931c952b6196c76977356238dbaf4f2e1293 )

Environment

bgiori commented 1 year ago

Hi @therabidbanana sorry for missing this issue.

Could you validate that my warning in PR #42 is accurate?

therabidbanana commented 1 year ago

That sounds right - added a comment with a bit more detail.

bgiori commented 1 year ago

Thanks! I'll keep this issue open to track that this still needs to be updated.