statsig-io / ruby-sdk

Statsig server SDK for Ruby
ISC License
4 stars 5 forks source link

Forking instructions unnecessary for multi-process environments? #16

Open hibachrach opened 1 year ago

hibachrach commented 1 year ago

After reading the code, it's unclear to me why the Statsig docs call for initializing the lib in a fork callback for Unicorn/Puma setups. Is the library using a shared system resource like a file descriptor or shared socket that isn't properly isolated via Ruby's copy-on-write?

If it is necessary, it's probably worth calling out for Sidekiq when in multi-process/"swarm" mode.

vijaye-statsig commented 1 year ago

@daniel-statsig

daniel-statsig commented 1 year ago

Hey @hibachrach, happy to help here. The SDK uses a singleton with multiple background threads to fetch values and flush logs. If the SDK is not initialized in these specific spots, it may end up in a invalid state and not work as intended.

Are you running into issues with initialization in Sidekiq that we need to fix, either with code or better documentation?

hibachrach commented 1 year ago

No issues! Just confused on the guidance. When fork is called in Ruby, those child processes are memory-isolated by the runtime (at least in MRI) and trigger copy-on-write when changes occur. So Unicorn's forking, for example, shouldn't be able to invalidate/corrupt any state being fiddled by the various threads that Statsig is creating/switching between. The only exception would be OS level resources like sockets or file descriptors (and the file system, of course).

For a multithreaded environment, what you're describing makes total sense--there, memory is shared and care must be taken to avoid invalidation/corruption. But in a multi-process environment, it's not clear why this is necessary.

daniel-statsig commented 1 year ago

after_fork is just a convenient place put the Statsig initialization as it would happen on Worker startup. This will ensure that each Worker has an initialized Statsig instance to interact with.

If you have a different, centralized location you would like to put it, it should be fine. The main thing to be wary of is that you must initialize before calling any other Statsig methods, otherwise an UninitializedError will be raised.

I believe the highlighted locations are valid call sites for initialize but I may be misunderstanding your meaning.