cfmlprojects / runwar

Other
11 stars 16 forks source link

Concurrency issues under load #75

Closed bdw429s closed 7 years ago

bdw429s commented 7 years ago

I don't know what version of Runwar introduced this, but when I run load tests against a simple page that just outputs a bit of text on either Lucee or Adobe CF, I get hundreds of exceptions that pour into the server log. It seems that there is a concurrency issue of some sort.

ErrorHandler handleRequest triggered
2017-03-21 22:15:50 ERROR RunwarLogger java.lang.NullPointerException
2017-03-21 22:15:50 ERROR RunwarLogger Location: / generated no content, maybe verify any errorPage locations? (status code: 200)

FWIW, I'll get 700 to 900 requests a second being processed. If I slow the requests down, the errors in the log reduce but are still present.

When I configured JMeter to save the responses from each request, 10% of them showed up as a blank response.

I can reproduce this easily with no real setup. Use the following commands to create a testbed:

CommandBox> mkdir mytest --cd
CommandBox> echo "hello" > index.cfm
CommandBox> server start --console

Then fire up JMeter to pound that page with about 200 concurrent threads.

bdw429s commented 7 years ago

I went back and tried this test on every version of CommandBox working backwards and the errors in the logs under load stopped in CommandBox version 3.2, which shipped with Runwar version 3.4.5. So this appears to have either been introduced in Runwar 3.4.10 or it just wasn't logged in previous versions.

denuno commented 7 years ago

Pretty sure the cause of this is the directory-index "refresh"-- hitting a URL path of / versus /index.cfm, which we added when folks noticed welcome-files were only checked for once, and that new ones added after the server started were not picked up.

In the future it would be great to attach the JMeter test, or at least the URL it's hitting, for scenario reproduction. :smile:

A quick way to test would be to pass in the full path, and also maybe disabling the index (--directory-index false).

If this is the cause, we can add another option, --directory-refresh [true|false] and default it to false, which I think would let one keep the directory index, if wanted, enabled on production-- at the cost of not picking up new 'index.cfm' or other welcome-files that are added after server start, by default.

(I'd recommend disabling directory indexing when in production, or during load testing anyway.)

It's really only a setting you'd want on in development, at least until some of the Undertow internals change so the path cache-invalidation we do isn't so hacky (and not thread-safe).

bdw429s commented 7 years ago

I can attach a Jmeter test, but it's really nothing you can't recreate in about 5 seconds. Just add a thread group with 200 users and an HTTP sampler that hits / on your site. It seems JMeter files aren't very compatible between major versions so I've had more misses than hits lately trying to share them. Plus your port and host will probably be different than mine, making 75% of the jmeter file in need of updating :)

Can we log a ticket with Runwar for them to provide a built in way to not cache the directory listing? This is the sort of thing that doesn't cache out of the box on IIS, Apache, Nginx, etc so it's a little surprising that they do it.

I'll try some more testing without directory index enabled.

denuno commented 7 years ago

Yeah, the main thing was figuring out what path to pound-- we've got fancy stuff for / that we don't pass through if the request is for /index.cfm, and when I saw "that page" I was like, "hmm, I could see something funky for /, but /index.cfm should work.".

Plus, I'm insanely lazy-- 5 seconds? Ain't nobody got time for that! :smiley:

I haven't had much trouble with JMeter across major versions-- I expect to have to change a few things, because major version. I should probably specify that the build uses the latest JMeter though, and perhaps even commit one or two of these tests, and tie them into CI, so we'll be collecting those great historical stats and know better how changes affect performance. (Mental note: create a ticket.)

Anyhow, this does appear to be what's going on, so I've added another option '--directory-refresh true|false`, and defaulted it to false. This kind of undoes the "fix" we put in there to work around that servlet path caching for welcome files (at least by default), so I think you're right, we should bring this up on the Undertow issue tracker, and see if our buddy Stuart has any ideas... it's totes possible I just missed some better way of doing it.

Personally, I like the idea of it only checking once. In general, you probably should not need to do a directory scan any time you get a request that ends with '/', in this day and age. So even if there is some way to change it, I think maybe we should add a --dev-mode option or something maybe (there are a few settings that you'd probably want one way for dev, and another for non-dev, but I guess being able to turn them on and off individually is a must too...) meh, just mulling, but I love how fast Undertow is by default, which I think is a better default than "dev mode", so to speak.

Oy, I can drag on! TL;DR: We'll want to account for this change in CommandBox via the new --directory-refresh option, which defaults to false. Too dangerous to enable by default as things are, IMHO.