phusion / passenger

A fast and robust web server and application server for Ruby, Python and Node.js
https://www.phusionpassenger.com/
MIT License
4.99k stars 548 forks source link

standalone setup not resolving every 8th request #2117

Open dsusco opened 6 years ago

dsusco commented 6 years ago

I'm fairly confident this is a passenger issue, as rolling back to 5.1.12 fixed it for me. It may be configuration related, so here's a brief rundown of how things are setup here.

I'm running a Rails 5.2.1 app that's dropping every 8th request. It doesn't matter the HTTP method, nor the controller I'm going to, I get something like this every 8th one:

I, [2018-09-12T09:31:17.317366 #7410]  INFO -- : [8c8358df-f722-4cfd-ad13-32aa11e98517] Started GET "/testy/" for 140.232.0.192 at 2018-09-12 09:31:17 -0400
F, [2018-09-12T09:31:17.319266 #7410] FATAL -- : [8c8358df-f722-4cfd-ad13-32aa11e98517]   
F, [2018-09-12T09:31:17.319387 #7410] FATAL -- : [8c8358df-f722-4cfd-ad13-32aa11e98517] ActionController::RoutingError (No route matches [GET] "/testy"):
F, [2018-09-12T09:31:17.319429 #7410] FATAL -- : [8c8358df-f722-4cfd-ad13-32aa11e98517]   
F, [2018-09-12T09:31:17.319483 #7410] FATAL -- : [8c8358df-f722-4cfd-ad13-32aa11e98517] actionpack (5.2.1) lib/action_dispatch/middleware/debug_exceptions.rb:65:in `call'
[8c8358df-f722-4cfd-ad13-32aa11e98517] actionpack (5.2.1) lib/action_dispatch/middleware/show_exceptions.rb:33:in `call'
[8c8358df-f722-4cfd-ad13-32aa11e98517] railties (5.2.1) lib/rails/rack/logger.rb:38:in `call_app'
[8c8358df-f722-4cfd-ad13-32aa11e98517] railties (5.2.1) lib/rails/rack/logger.rb:26:in `block in call'
[8c8358df-f722-4cfd-ad13-32aa11e98517] activesupport (5.2.1) lib/active_support/tagged_logging.rb:71:in `block in tagged'
[8c8358df-f722-4cfd-ad13-32aa11e98517] activesupport (5.2.1) lib/active_support/tagged_logging.rb:28:in `tagged'
[8c8358df-f722-4cfd-ad13-32aa11e98517] activesupport (5.2.1) lib/active_support/tagged_logging.rb:71:in `tagged'
[8c8358df-f722-4cfd-ad13-32aa11e98517] railties (5.2.1) lib/rails/rack/logger.rb:26:in `call'
[8c8358df-f722-4cfd-ad13-32aa11e98517] actionpack (5.2.1) lib/action_dispatch/middleware/remote_ip.rb:81:in `call'
[8c8358df-f722-4cfd-ad13-32aa11e98517] actionpack (5.2.1) lib/action_dispatch/middleware/request_id.rb:27:in `call'
[8c8358df-f722-4cfd-ad13-32aa11e98517] rack (2.0.5) lib/rack/method_override.rb:22:in `call'
[8c8358df-f722-4cfd-ad13-32aa11e98517] rack (2.0.5) lib/rack/runtime.rb:22:in `call'
[8c8358df-f722-4cfd-ad13-32aa11e98517] activesupport (5.2.1) lib/active_support/cache/strategy/local_cache_middleware.rb:29:in `call'
[8c8358df-f722-4cfd-ad13-32aa11e98517] actionpack (5.2.1) lib/action_dispatch/middleware/executor.rb:14:in `call'
[8c8358df-f722-4cfd-ad13-32aa11e98517] rack (2.0.5) lib/rack/sendfile.rb:111:in `call'
[8c8358df-f722-4cfd-ad13-32aa11e98517] railties (5.2.1) lib/rails/engine.rb:524:in `call'
[8c8358df-f722-4cfd-ad13-32aa11e98517] passenger (5.2.3) src/ruby_supportlib/phusion_passenger/rack/thread_handler_extension.rb:97:in `process_request'
[8c8358df-f722-4cfd-ad13-32aa11e98517] passenger (5.2.3) src/ruby_supportlib/phusion_passenger/request_handler/thread_handler.rb:152:in `accept_and_process_next_request'
[8c8358df-f722-4cfd-ad13-32aa11e98517] passenger (5.2.3) src/ruby_supportlib/phusion_passenger/request_handler/thread_handler.rb:113:in `main_loop'
[8c8358df-f722-4cfd-ad13-32aa11e98517] passenger (5.2.3) src/ruby_supportlib/phusion_passenger/request_handler.rb:416:in `block (3 levels) in start_threads'
[8c8358df-f722-4cfd-ad13-32aa11e98517] passenger (5.2.3) src/ruby_supportlib/phusion_passenger/utils.rb:113:in `block in create_thread_and_abort_on_exception'

I have passenger running with the 5.3.4 config file: resources/templates/standalone/config.erb. The only change I've made to this is to add a base uri:

...
# https://www.phusionpassenger.com/library/config/standalone/intro.html#nginx-configuration-template
passenger_base_uri '/testy';
### END your own configuration options ###
...

I start up passenger with the following command:

passenger start -d -e production -p 5000 --nginx-config-template nginx.config.erb

And have an nginx server forwarding requests to it like so:

location /testy/ {
  proxy_pass http://127.0.0.1:5000/testy/;

  proxy_buffer_size 32k;
  proxy_buffering off;
  proxy_buffers 10 16k;
  proxy_busy_buffers_size 32k;

  proxy_http_version 1.1;

  proxy_set_header Host $http_host;
  proxy_set_header Upgrade $http_upgrade;
  proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
  proxy_set_header X-Forwarded-Host $host;
  proxy_set_header X-Forwarded-Proto https;
}

Any thoughts or ideas?

CamJN commented 6 years ago

Does the interval that the problem occurs at change if you set a different max-pool-size?

dsusco commented 6 years ago

It does not, I added the flag to my startup command and doubled it from 6 to 12.

CamJN commented 5 years ago

Does this issue persist in Passenger 6?

dsusco commented 5 years ago

It does.

CamJN commented 5 years ago

Can you turn the log level up to 7 and record both a successful and unsuccessful request?

david-crowder commented 5 years ago

Not to hijack someone else's issue, but I am running into a very similar problem and was wondering if a resolution was ever found? I documented some of the initial investigation I did on stackoverflow because I wasn't sure if I was just configuring passenger inappropriately. Running in log level 7, it looks like the SCRIPT_NAME header is not getting set appropriately on the requests that fail to route. Setting the pool size to 1 doesn't resolve the issue, which is interesting, because it means the same process is handling the same request differently?

Another interesting piece of information that I can't wrap my brain around, and that may differ from the original poster, is we deploy our passenger applications via docker images and use dc/os for container orchestration. Running the docker container directly on a given host (or locally) produces a 100% success rate for a test endpoint. That same test endpoint on a container deployed through dc/os has a 50-75% success rate. I am not entirely sure why this is the case, or if it would indicate that the problem does not reside in passenger, but I don't really know how to debug further.

david-crowder commented 5 years ago

I was able to reproduce the issue with a simple app that was the result of a rails generate and I think I figured out why my testing was behaving differently between local and the container orchestrator. When I was testing locally, or by manually deploying the docker container, I would hit the sub context only. I never hit the root context. When the container was deployed via dc/os the healthcheck would hit the root context before I would hit the base_uri. It looks to cache that initial request somehow? Then every request after that could get a random 404. I verified that if I repeated these steps on 5.1 it did not behave this way.

CamJN commented 5 years ago

Looks like the config caching changes in 5.2 might be involved.

oprogfrogo commented 5 years ago

I too am experiencing the same issue. I get the 200 then a route not found on every consecutive GET request. Has anyone gotten this to work?

Update: Seems downgrading back to 5.1.12 worked.

CamJN commented 4 years ago

seems related: https://github.com/phusion/passenger/issues/2043

dsusco commented 4 years ago

Just reporting that I'm still experiencing this issue with every version of Passenger after 5.1.12.

CamJN commented 4 years ago

Is the passenger_base_uri '/testy'; in the http block or the sever block?

oprogfrogo commented 4 years ago

For me, I have it in the nginx.conf.erb file sitting in the app's root folder. It resides in the http block:

# BEGIN your own configuration options ###                                                                                                                                                                                                                                                                                                                            
# This is a good place to put your own config                                                                                                                                                                                                                                                                                                                           
# options. Note that your options must not                                                                                                                                                                                                                                                                                                                              
# conflict with the ones Passenger already sets.                                                                                                                                                                                                                                                                                                                        
# Learn more at:                                                                                                                                                                                                                                                                                                                                                        
# https://www.phusionpassenger.com/library/config/standalone/intro.html#nginx-configuration-template                                                                                                                                                                                                                                                                    
passenger_base_uri '/name_of_my_app';
### END your own configuration options ###

I too can confirm that Passenger after 5.1.12 is not working.

oprogfrogo commented 4 years ago

For me, every other reload was causing a 404. I was able to resolve my issue by just simply precompiling assets and moving to passenger 6.0.4. Also my nameserver was wrong so the server was taking a long time to resolve. Maybe my issue may not have been the same as yours but worth giving it shot on your end.

CamJN commented 4 years ago

Ok i've reproduced this issue, it looks like the when we pull a PoolOptions object from the cache instead of creating a new one, the options.baseURI member variable is not set.

epochslee commented 4 years ago

Is options.baseURI suppose to be what is set for passenger_base_uri?

CamJN commented 4 years ago

options.baseURI is supposed to receive the value set as the passenger_base_uri, yes. I'm going to trace some more requests and try and figure out why that's not happening. But we're much closer to a solution, knowing what the problem is.

CamJN commented 4 years ago

Can folks please try out the patch here: https://github.com/phusion/passenger/commit/9d32868d3b06303d2778e24bf900004b1c72bd8c.patch and report back with their findings? It works for me locally but I'd like more testing before landing the patch.

oprogfrogo commented 4 years ago

Using: Rails 3.2.22.4 ruby 2.3.8p459 Phusion Passenger 6.0.4

I've add that line in the gem's code but still experience the same:

Controller::initializePoolOptions(Client *client, Request *req, RequestAnalysis &analysis) {
        boost::shared_ptr<Options> *options;

        if (mainConfig.singleAppMode) {
                P_ASSERT_EQ(poolOptionsCache.size(), 1);
                poolOptionsCache.lookupRandom(NULL, &options);
                req->options = **options;
        } else {
                ServerKit::HeaderTable::Cell *appGroupNameCell = analysis.appGroupNameCell;
        if (appGroupNameCell != NULL && appGroupNameCell->header->val.size > 0) {
                        const LString *appGroupName = psg_lstr_make_contiguous(
                                &appGroupNameCell->header->val,
                                req->pool);
                        HashedStaticString hAppGroupName(appGroupName->start->data,
                                appGroupName->size);

                        poolOptionsCache.lookup(hAppGroupName, &options);

            if (options != NULL) {
                                req->options = **options;
                                fillPoolOption(req, req->options.baseURI, "!~SCRIPT_NAME");
            } else {
                                createNewPoolOptions(client, req, hAppGroupName);
            }
        } else {
                        disconnectWithError(&client, "the !~PASSENGER_APP_GROUP_NAME header must be set");
        }
    }

        if (!req->ended()) {
        // See comment for req->envvars to learn how it is different                                                                                                                                                                                                                                                                                                
        // from req->options.environmentVariables.                                                                                                                                                                                                                                                                                                                  
        req->envvars = req->secureHeaders.lookup(PASSENGER_ENV_VARS);
        if (req->envvars != NULL && req->envvars->size > 0) {
            req->envvars = psg_lstr_make_contiguous(req->envvars, req->pool);
            req->options.environmentVariables = StaticString(
                req->envvars->start->data,
                req->envvars->size);
        }

                // Allow certain options to be overridden on a per-request basis                                                                                                                                                                                                                                                                                            
        fillPoolOption(req, req->options.maxRequests, PASSENGER_MAX_REQUESTS);
    }
}
CamJN commented 4 years ago

If you installed via the gem, you may need to manually put the executable into place after compiling. Can you list the steps you used?

oprogfrogo commented 4 years ago
  1. gem install passenger
  2. emacs /var/lib/gems/2.3.0/gems/passenger-6.0.4/src/agent/Core/Controller/InitRequest.cpp
  3. cd to app which has an nginx.conf.erb with the passenger_base_uri
  4. passenger start
  5. navigate to page where all the links and assets are missing the passenger_base_uri in front all paths.
CamJN commented 4 years ago

So i replaced your step 2 with this process:

apt install -y ruby-dev build-essential curl libssl-dev zlib1g-dev libcurl4-openssl-dev
cd /var/lib/gems/*/gems/passenger-6.0.4/
curl https://github.com/phusion/passenger/commit/9d32868d3b06303d2778e24bf900004b1c72bd8c.patch | patch -p1
echo '127.0.0.1 oss-binaries.phusionpassenger.com github.com s3.amazonaws.com' >> /etc/hosts
bin/passenger-config install-agent --skip-cache --force

and the agent was put in place properly.

Note: you will want to remove the extra line i add to the hosts file, i just needed to force a recompile and breaking the DNS is the easiest way to do that in the gem unfortunately.

oprogfrogo commented 4 years ago

Deleting this error. Needed to kill PassengerAgents first. Trying again.

oprogfrogo commented 4 years ago

The problem continues to occur for me. The app will return an http response of 200->404->200->404 etc.

CamJN commented 4 years ago

It's interesting that your setup experiences the issue every 2nd request instead of every 8th, is there anything in your log or configuration that might shed some light on why that is?

CamJN commented 4 years ago

Also, based on this:

navigate to page where all the links and assets are missing the passenger_base_uri in front all paths.

It sounds like your app was generating incorrect urls? Might that be a config issue? As I understand it, the original issue was that Passenger was sending incorrect requests to the app, but if the url doesn't contain the base uri, then I'm not sure passenger is doing something wrong...