Open preaction opened 8 years ago
On the Fastly website (http://fastly.com), I've raised the "First byte" timeout to 60,000 ms (60 s). This should drastically reduce the number of 503s we get. We should check back with people in a few days to see whether they've gotten any more 503s.
@preaction This is still an issue. Just got the following:
Error 503 backend read error
backend read error
Guru Mediation:
Details: cache-dfw1823-DFW 1466788517 1422049808
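To line an error like this up with the origin server's logs, the Details line can be split into its parts. This is a minimal sketch, assuming the second field is a Unix timestamp in seconds and the third is a request ID; neither is confirmed in this thread, and `parse_details` is a hypothetical helper name.

```python
from datetime import datetime, timezone


def parse_details(line):
    """Split a 'Details: <node> <epoch> <id>' line into its parts."""
    fields = line.split()
    # Drop the leading label if the whole line was pasted in.
    if fields and fields[0] == "Details:":
        fields = fields[1:]
    if len(fields) != 3:
        raise ValueError(f"expected 3 fields, got {len(fields)}")
    node, epoch, request_id = fields
    # ASSUMPTION: the second field is seconds since the Unix epoch (UTC).
    when = datetime.fromtimestamp(int(epoch), tz=timezone.utc)
    return {"node": node, "time": when.isoformat(), "request_id": request_id}
```

Calling `parse_details("Details: cache-dfw1823-DFW 1466788517 1422049808")` returns the node name plus a UTC time that can be compared against the web server's access and error logs.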
@wchristian just reported this again as well. This time it seems that view-report.cgi was being hammered (as usual). @mst has volunteered to try something interesting, but has also suggested Plack::App::WrapCGI with its execute flag to run a pre-forking CGI script, which saves us from having to load a bunch of modules every time. If we restrict it to just view-report.cgi for now, we can see how much this helps before deciding whether to roll it out to more of the CGI scripts.
During the present errors I also noticed that we were spending 40-90% of CPU time in iowait. I checked iostat and it seems like there's a lot of writing to disk going on. I probably need to track down where the writes are coming from and see if any can be reduced. We've got a lot of free memory, so I'm concerned that it isn't being used for disk I/O caching. I know we've got a lot of various log files being written, which it might be possible to reduce by using syslog. But I also enabled the data release (#9), which will certainly increase the amount we're writing to disk at various times. It might be necessary to move the website to another machine, separate from the database and backend processes...
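The iowait figure can be tracked over time without watching top by hand. A rough sketch, assuming a Linux host where the first line of /proc/stat is the aggregate "cpu" line with counters in the order user, nice, system, idle, iowait, irq, softirq, steal; the function names are hypothetical.

```python
import time


def parse_cpu_line(line):
    """Return the numeric counters from the aggregate 'cpu' line of /proc/stat."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("not an aggregate cpu line")
    return [int(x) for x in fields[1:]]


def iowait_percent(before, after):
    """Percentage of CPU time spent in iowait between two samples."""
    a, b = parse_cpu_line(before), parse_cpu_line(after)
    # Only the first 8 counters (user..steal) are summed; the guest
    # counters that follow are already included in user and nice.
    deltas = [y - x for x, y in zip(a[:8], b[:8])]
    if len(deltas) < 5:
        raise ValueError("cpu line has no iowait counter")
    total = sum(deltas)
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total  # index 4 is iowait


def sample_iowait(interval=5.0, path="/proc/stat"):
    """Read the cpu line twice, `interval` seconds apart, and compare."""
    with open(path) as fh:
        first = fh.readline()
    time.sleep(interval)
    with open(path) as fh:
        second = fh.readline()
    return iowait_percent(first, second)
```

Logging `sample_iowait()` every minute or so would show whether the spikes coincide with the 503 reports. It does not say which process is writing; that still needs a per-process tool.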
When we added Fastly to get caching and improve performance, we started getting intermittent 503 backend read error messages. We need to make sure the Fastly caching is working, figure out why the 503 errors are happening, and try to reduce them.
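One way to check whether the caching is working is to look at the cache headers on a response. A minimal sketch, assuming the responses carry an X-Cache header with HIT or MISS values, and assuming that when several values are listed the last one belongs to the cache node closest to the client; `cache_status` and `fetch` are hypothetical helpers.

```python
from urllib.error import HTTPError
from urllib.request import Request, urlopen


def cache_status(headers):
    """Classify a response as 'hit', 'miss', or 'unknown' from its headers."""
    # Header names are case-insensitive, so normalise them first.
    lower = {k.lower(): v for k, v in headers.items()}
    x_cache = lower.get("x-cache", "")
    # ASSUMPTION: with more than one cache layer the value is a
    # comma-separated list and the last entry is the node nearest the client.
    states = [s.strip().upper() for s in x_cache.split(",") if s.strip()]
    if not states:
        return "unknown"
    return "hit" if states[-1] == "HIT" else "miss"


def fetch(url):
    """HEAD the URL and return (status code, headers), including for errors."""
    try:
        with urlopen(Request(url, method="HEAD"), timeout=30) as resp:
            return resp.status, dict(resp.headers.items())
    except HTTPError as err:
        # urlopen raises on 4xx/5xx; the 503s are exactly what we want to see.
        return err.code, dict(err.headers.items())
```

Requesting the same page twice with `fetch` and passing the headers to `cache_status` should show a miss followed by a hit if the page is cacheable. Pages that always report a miss are still going to the backend on every request.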