socketry / falcon

A high-performance web server for Ruby, supporting HTTP/1, HTTP/2 and TLS.
https://socketry.github.io/falcon/
MIT License
2.6k stars 79 forks source link

Fork Events #53

Closed bryanp closed 5 years ago

bryanp commented 5 years ago

I'd like to register events handlers to be called before / after a process fork, so that things like database connections can be disconnected / reconnected properly. In Puma, I would use:

Not seeing a way to do the same thing with Falcon. Do you have any input on how to support?

ioquatix commented 5 years ago

There is no guarantee that your process would be forked or threaded.

Falcon will pre-load your config.ru before forking/threading.

https://github.com/socketry/falcon/blob/master/lib/falcon/command/serve.rb#L104

So, everything is loaded into the falcon process by default.

There is no reason why we couldn't add some hooks, but to me this is very app-server specific. I'd prefer if there was a general solution.

Can you show me some example hooks so I can see exactly what kind of things are needed?

One option would be to move load_app into the container instance. That isn't a stupid idea actually.

Then you could use warmup in your config.ru which would be executed per-thread or per-process.

warmup do |app|
  ActiveRecord::Base.establish_connection(...)
end

It might still make sense to pre-load some dependencies, but as this doesn't affect memory usage, maybe it can be done by adding a file, e.g. config/preload.rb which is loaded into the pre-fork falcon environment. It wouldn't have any logic, just requires which minimise the memory overheads when forking.

ioquatix commented 5 years ago

Keep in mind that rolling restarts are going to form an important part of the falcon process model. I'm not sure the best way we achieve this but I suspect we will have some top level falcon process which spawns per-site instances of falcon which can be recycled. These themselves can either be forked or threaded. Anything we do to solve the issue here should be compatible with that design.

bryanp commented 5 years ago

Sure. In context of the pakyow web framework, the application server is configured to call either Pakyow.fork, Pakyow.forking, or Pakyow.forked (link):

def fork
  forking
  yield
  forked
end

def forking
  call_hooks :before, :fork

  @apps.select { |app|
    app.respond_to?(:forking)
  }.each(&:forking)
end

def forked
  call_hooks :after, :fork

  @apps.select { |app|
    app.respond_to?(:forked)
  }.each(&:forked)

  booted
end

When using puma, hooks are registered that call the appropriate method (link):

setting :before_worker_fork do
  [
    lambda { |_| Pakyow.forking }
  ]
end

setting :after_worker_fork do
  []
end

setting :before_worker_boot do
  [
    lambda { |_| Pakyow.forked }
  ]
end

Finally, hooks are registered within the framework that perform certain tasks before/after fork (link):

require "pakyow/support/extension"

module Pakyow
  module Environment
    module Data
      module Forking
        extend Support::Extension

        apply_extension do
          before :fork do
            connections_to_reconnect.each(&:disconnect)
          end

          after :fork do
            connections_to_reconnect.each(&:connect)
          end
        end

        class_methods do
          def connections_to_reconnect
            @data_connections.values.flat_map(&:values).reject { |connection|
              connection.name == :memory
            }
          end
        end
      end
    end
  end
end

Does that help? Happy to talk through this more.

ioquatix commented 5 years ago

Do you think this design makes sense in the context of threads/forks? i.e. forking doesn't exist on TruffleRuby so how should it behave?

bryanp commented 5 years ago

If there isn't a process fork I don't think there's anything to do. It's only needed in the case above because the fork breaks the connection in the parent process (more discussion).

ioquatix commented 5 years ago

If the parent process doesn't have any application state, does it still make sense?

bryanp commented 5 years ago

Nope. We implement a process model similar to what you describe above for development. A master process is started that loads dependencies, then forks to spin up the application server. Requests are passed through the master process to the child. When code changes, we replace the child process. In this case there's no need to call forked because application state doesn't exist in the parent.

ioquatix commented 5 years ago

I think the model you just described makes the most sense.

Pre-fork code loading shouldn't affect API because it's a leaky abstraction (doesn't make sense for threads, forces all libraries to support some kind of model for fork if they have any kind of connection/state).

In my mind, I think for falcon, it would make the most sense to use bundler to pre-load dependencies by default, i.e. an optional pre-fork could just do bundler/setup and Bundler.require :prefork.

This should work for 99% of situations where the point of pre-fork is to minimise memory and startup time (due to loading/parsing dependencies).. The downside with this model is prefork is only really useful in production because you can't just restart by killing the child process (state is captured by pre-fork process that might need to be reloaded).

bryanp commented 5 years ago

Yep, makes sense. To handle that last point, when the bundle changes we stop the child process and use exec to replace the parent process. It works well enough for a development server.

ioquatix commented 5 years ago

Interesting, so you reload the master process in place?

bryanp commented 5 years ago

Yep! It isn't quite as smooth of a user experience, but it works great. When replacing a child, the parent process holds any requests that come in and waits for the child to become available. You don't get that behavior and so requests will fail, but it prevents the user from having to manually restart.

ioquatix commented 5 years ago

Falcon already does this to some extent - because the parent process binds a socket and shares it across fork/threads.

All we need is an intermediate process to do the pre-fork, and that can manage child processes. I think we can implement this, but the most logical place to do it is in async-container. It's not falcon specific - we need to extend async-container with a new container type, probably called Prefork which does the behaviour described. The prefork container would make one child process via fork, do bundler magic, and then make another container of children.

ioquatix commented 5 years ago

Then, in falcon, we'd do something like falcon serve --preforked rather than --forked.

bryanp commented 5 years ago

That sounds perfect. How can I help?

ioquatix commented 5 years ago

Sorry, I will make a new issue, rather than move this one.

ioquatix commented 5 years ago

I wrote a brief outline of what I think would work on https://github.com/socketry/async-container/issues/3

But you seem to be experienced in this, so I leave it up to you to submit a PR.

async-container is actually fine for now, but at some point I want the ability to add/remove children processes, and handle restarting crashed processes correctly. Right now, it's fixed. That's okay, but it's not very flexible. The preforked container shouldn't make too many assumptions about how that will work going forward, I suspect a lot of it would be refactored.

But if you can show the general concept works, I think it's great and we can merge into current "inflexible" design with an idea to refactor later.

ioquatix commented 5 years ago

I've been working on how to support this and it should work out of the box using falcon virtual to host applications.

For individual apps, using warmup makes the most sense. For bundler stuff, it's already loaded by default.

bryanp commented 5 years ago

👍 I had forgotten about this, but makes sense. Ended up not needing it for my use case anyway.

ayanko commented 5 years ago

@ioquatix @bryanp

warmup

Unfortunately It's invoked after application is build and it is executed in master process that forks containers....

Some gems still have connection issues on fork (see https://github.com/thiagopradi/octopus/issues/489) so I need to add feature for forked container hooks.

What should be the best way to do it?

Currently I did it via custom serve command but I would have less falcon extensions.

bin/falcon-serve

#!/usr/bin/env ruby

require 'falcon'

require_relative '../lib/falcon_serve/configuration'
require_relative '../lib/falcon_serve/dsl'
require_relative '../lib/falcon_serve/forked_container'
require_relative '../lib/falcon_serve/command'

FalconServe::Configuration.load 'config/falcon.rb'

begin
  FalconServe::Command.call
rescue Interrupt
end

lib/falcon_serve/configuration.rb

module FalconServe
  module Configuration
    class << self
      def load(path)
        @options = {}
        DSL.new(@options).instance_eval(File.read(path), path, 1)
      end

      def [](name)
        @options[name]
      end

      def []=(name, block)
        @options[name] = block
      end
    end
  end
end

lib/falcon_serve/dsl.rb

module FalconServe
  class DSL
    def initialize(options)
      @options = options
    end

    %i[
      before_fork
      after_instance_fork
    ].each do |name|
      define_method name do |&block|
        @options[name] = block
      end
    end
  end
end

lib/falcon_serve/forked_container.rb

module FalconServe
  class ForkedContainer < Async::Container::Forked
    def run(*)
      FalconServe::Configuration[:before_fork].call
      super
    end

    def async(**options, &block)
      spawn(**options) do |instance|
        FalconServe::Configuration[:after_instance_fork].call

        begin
          Async::Reactor.run(instance, &block)
        rescue Interrupt
          # Graceful exit.
        end
      end
    end
  end
end

I wanted to avoid copy of implementation of async method but it's not possible on current codebase...

lib/falcon_serve/command.rb

module FalconServe
  class Command < Falcon::Command::Serve
    def container_class
      if @options[:container] == :forked
        FalconServe::ForkedContainer
      else
        super
      end
    end

    def call
      container = run
      container.wait
    end
  end
end
ioquatix commented 5 years ago

Async::Container isn't guaranteed to run in forked process, it may run using threads.

This is why this situation is tricky.

Are you concerned about compatibility with JRuby & TruffleRuby? It cannot use fork, only threads. But each thread should be isolated.

The best solution might be something like --preload which invokes the current behaviour, and changing the current behaviour to load the app only in the child container. This way, it should work in threads and processes more easily.

What do you think?

ayanko commented 5 years ago

Async::Container isn't guaranteed to run in forked process, it may run using threads.

I am ok with it because nobody expects that before_fork and after_instance_fork will be executed on thread environment. It's forked container specific options. If we need thread specific stuff we can add smth like after_thread_spawn. You never want to have the SAME configuration block that should be executed either in fork or thread container (I believe).

Or maybe we can better naming:

Or nested configuration structure

forking do
  before {}
  on_instance {}
end

threading do
  before {}
  on_instance {}
end
ayanko commented 5 years ago

The best solution might be something like --preload

How it will work? Does it allow to setup "disconnect connections stuff" on master process and "connect connection stuff" on forked child?

ioquatix commented 5 years ago

I understand your proposal and it seems reasonable, but first I want to figure out, is this the path we want to take, because after introducing such hooks it will cause a lot of pain to remove it if it turns out to be a bad idea.

What I want to understand is if there is some inescapable semantic model which requires these hooks, or is it just working around some issues in upstream code. Because introducing such a specific interface is a big deal, I need to have a clear vision of how this fits into the big picture.

Regarding how this works, the key point is: why do we load the app before forking?

Well, if your app cannot support going over fork boundaries, because we decided above it's due to a deficiency of the design of the app, then one option is to load the app in the child process.

Can you give me your feedback?

ioquatix commented 5 years ago

One more point to consider, if the simplest model is just to load the app in the child process/thread, can we still save a lot of memory by doing Bundle.load or something similar in the parent process?

ayanko commented 5 years ago

What I want to understand is if there is some inescapable semantic model which requires these hooks, or is it just working around some issues in upstream code.

Almost all existing ruby servers that are designed with performance (e.g puma, unicorn, rainbows) have fork hooks. I don't know if it was really introduced as workaround for fork issues with application frameworks. If framework/gem currently has that issue it means I can't use falcon until they fix it or I have to patch either framework/gem or falcon.... But what if really application that can be forked requires that some stuff should be configured before and after forking? Why not?

Regarding how this works, the key point is: why do we load the app before forking?

The answer is obvious. We load (preload) app in master process in order to speed up booting process and spare memory. All modern servers do it in production mode at least. Correct?

then one option is to load the app in the child process

It's not efficient or?

ayanko commented 5 years ago

can we still save a lot of memory by doing Bundle.load or something similar in the parent process?

It's related to copy-on-write ruby feature. I am not expert in this are. But probably we can measure amount of memory is used in both cases. Production apps are usually fat so it might play big role.

ioquatix commented 5 years ago

Until we have empirical evidence I am unlikely to just implement it because other server does it. I want to know why it’s required and what the impact is of the various options. So we should measure it if possible.

Because other server has much simple execution model, we need to be very careful about such feature and the burden it creates both on the server implementation and the expectations of how to build client applications. Bearing in mind that rack model is only one supported mode of operation.

ioquatix commented 5 years ago

Do you have an application we can use to measure?

ioquatix commented 5 years ago

Do you think such hooks should be added to rack spec?

ayanko commented 5 years ago

want to know why it’s required and what the impact is of the various options.

I already provided arguments. Not all apps/frameworks are ready out of box for forking. It maybe they fault but maybe not. Thus other servers have hooks. It means there were a lot cases (and will be in the future i believe) when you should have to tune setup in fork environment. Otherwise server is useless for production case!

If falcon does not plan to support this general feature its bad of course. And I have to extend it in way I suggested before. Which is not good for other people as well because they need to do the same...

And loading app in child process is step back in performance term...

ayanko commented 5 years ago

Do you have an application we can use to measure?

Yes I have really big production application. Actually I want to switch from EventMachine based server to Async one. Because EM is outdated and Async has better api which should allow me give more opportunities with parallelism for inbound/outbound connections.

ayanko commented 5 years ago

Do you think such hooks should be added to rack spec?

We could add it to config.ru file instead If they don't broke other servers of course. Does rackup specification officially supports custom options?

ioquatix commented 5 years ago

I know it must be frustrating because you think you have a solution to the problem and feel that existing servers show it to be the right solution, but the reality is this problem is not so simple as adding a hook. So, try to understand my position to move things forward. If I just accept status quo, Async would not even exist.

You state that loading the app should be faster. But that shouldn't actually matter much, and pre-fork model introduces some complexity too.

For example, rolling restart cannot use such a pre-fork model.

Memory usage may be good once child process is forked, but over time, after GC, etc, it becomes same as if not loading app in parent process. The real saving is shared code.

You say it should be faster, but you don't state how. Well, here are the two scenarios:

To me, the only difference is CPU usage, but the latency is identical.

So, the biggest win might be to preload bundler, and load app in individual thread or process.

It avoids the need for any additional hooks, it's simpler model for user code and overall there is no difference in latency - maybe slight difference in initial load.

Then, there is rolling restarts to consider. It's not possible to do such a thing when app is loaded in parent process unless somehow you restart parent process or use much more complex process for fork e.g. parent -> child (load app) -> baby (serve traffic).

So, please accept that this is non-trivial issue and try to help me with the information I need to make a good decision rather than just stating "I've provided arguments." What I need is data, e.g. latency, load average, graphs, comparisons, etc.

Otherwise, I cannot make a good decision.

ayanko commented 5 years ago

I cannot make a good decision.

Then I can make. :) Basically right now I have only one option - use extended forked container. Otherwise I cannot use falcon at all.

ioquatix commented 5 years ago

Please try preload branch. Please report startup time, latency, load average, memory usage (both parent process and children processes).

Ensure you add all appropriate gems to :preload group.

Please compare to current released master.

ayanko commented 5 years ago

I can give only numbers for comparison in development environment. Nobody allows me to play with production 😀