Closed by zooko 11 years ago
Za and I are pairing on this.
We first noticed nginx logging many file permission errors for things like /home/analytics/public_html/piwik/piwik.js.
After looking at process ownership, file permissions, etc., we noticed this diff in /etc/nginx/nginx.conf, which appears to be due to a Debian package install:
--- ./nginx.conf.dpkg-old 2013-07-31 11:26:15.000000000 -0700
+++ ./nginx.conf 2013-07-31 11:26:15.000000000 -0700
@@ -1,95 +1,33 @@
-user www-data;
-worker_processes 4;
-pid /var/run/nginx.pid;
+
+user nginx;
+worker_processes 1;
+
+error_log /var/log/nginx/error.log warn;
+pid /var/run/nginx.pid;
+
events {
- worker_connections 768;
- # multi_accept on;
+ worker_connections 1024;
}
+
http {
+ include /etc/nginx/mime.types;
+ default_type application/octet-stream;
- ##
- # Basic Settings
- ##
-
- sendfile on;
- tcp_nopush on;
- tcp_nodelay on;
- keepalive_timeout 2;
- types_hash_max_size 2048;
- # server_tokens off;
-
- # server_names_hash_bucket_size 64;
- # server_name_in_redirect off;
-
- include /etc/nginx/mime.types;
- default_type application/octet-stream;
-
- ##
- # Logging Settings
- ##
-
- access_log /var/log/nginx/access.log;
- error_log /var/log/nginx/error.log;
-
- ##
- # Gzip Settings
- ##
-
- gzip on;
- gzip_disable "msie6";
-
- # gzip_vary on;
- # gzip_proxied any;
- # gzip_comp_level 6;
- # gzip_buffers 16 8k;
- # gzip_http_version 1.1;
- # gzip_types text/plain text/css application/json application/x-javascript text/xml application/xml application/xml+rss text/javascript;
-
- ##
- # nginx-naxsi config
- ##
- # Uncomment it if you installed nginx-naxsi
- ##
-
- #include /etc/nginx/naxsi_core.rules;
-
- ##
- # nginx-passenger config
- ##
- # Uncomment it if you installed nginx-passenger
- ##
-
- #passenger_root /usr;
- #passenger_ruby /usr/bin/ruby;
-
- ##
- # Virtual Host Configs
- ##
+ log_format main '$remote_addr - $remote_user [$time_local] "$request" '
+ '$status $body_bytes_sent "$http_referer" '
+ '"$http_user_agent" "$http_x_forwarded_for"';
- include /etc/nginx/conf.d/*.conf;
- include /etc/nginx/sites-enabled/*;
-}
+ access_log /var/log/nginx/access.log main;
+
+ sendfile on;
+ #tcp_nopush on;
+ keepalive_timeout 65;
-#mail {
-# # See sample authentication script at:
-# # http://wiki.nginx.org/ImapAuthenticateWithApachePhpScript
-#
-# # auth_http localhost/auth.php;
-# # pop3_capabilities "TOP" "USER";
-# # imap_capabilities "IMAP4rev1" "UIDPLUS";
-#
-# server {
-# listen localhost:110;
-# protocol pop3;
-# proxy on;
-# }
-#
-# server {
-# listen localhost:143;
-# protocol imap;
-# proxy on;
-# }
-#}
+ #gzip on;
+
+ include /etc/nginx/conf.d/*.conf;
+ include /etc/nginx/sites-enabled/*;
+}
The salient feature of the diff is that the user nginx runs as when serving files changed from www-data to nginx, whereas the ~analytics/public_html directory is still owned by the www-data group.
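A quick way to confirm this kind of mismatch is to compare the account named in the config with the owner of the directory being served. The following is only a sketch: the function names are made up for illustration, and `stat -c` assumes GNU coreutils.

```shell
#!/bin/sh
# Print the account named by the `user` directive of an nginx config file.
# (nginx_conf_user is a hypothetical helper, not part of nginx.)
nginx_conf_user() {
    awk '$1 == "user" { sub(/;$/, "", $2); print $2; exit }' "$1"
}

# Print "owner:group" for a path (GNU stat syntax).
path_owner() {
    stat -c '%U:%G' "$1"
}

# On the affected host one might run:
#   nginx_conf_user /etc/nginx/nginx.conf
#   nginx_conf_user /etc/nginx/nginx.conf.dpkg-old
#   path_owner /home/analytics/public_html
```

If the first command prints an account that does not match the directory's owner or group, permission errors like the ones in the nginx log are the expected result.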
Za and I are about to make backup copies of both /etc/nginx/nginx.conf and /etc/nginx/nginx.conf.dpkg-old, then revert the change by copying /etc/nginx/nginx.conf.dpkg-old over /etc/nginx/nginx.conf, and then instruct nginx to restart.
From IRC:
<nejucomo> So the new config file *should* or *should not* include the line: "include /etc/nginx/site-enabled/*"
<Lil_red> to be clear, nginx.conf *should* have "include /etc/nginx/sites-enabled/*;", if it's running the newer version of nginx. The other changes should be reverted
We are going to rename both the current and dpkg backup files with the suffix .TICKET_55, then copy /etc/nginx/nginx.conf.dpkg-old to /etc/nginx/nginx.conf, then edit that new copy to include the line:
include /etc/nginx/sites-enabled/*;
Oops, I said "rename" but I meant "copy to".
BTW, /etc/nginx/nginx.conf already includes this line:
include /etc/nginx/sites-enabled/*;
We believe this is correct and will leave it there.
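For the record, the backup-and-revert steps described above could be scripted roughly as follows. This is a sketch rather than the exact commands we typed: `revert_nginx_conf` is a hypothetical helper, and the restart line in the trailing comment assumes the Debian `service` wrapper is available.

```shell
#!/bin/sh
# Back up both config files, then restore the pre-upgrade one.
#   $1: nginx config directory (/etc/nginx on the affected host)
#   $2: backup suffix (this ticket uses .TICKET_55)
revert_nginx_conf() {
    dir=$1
    suffix=$2
    # Copy (not rename) so both originals survive under the ticket suffix.
    cp -p "$dir/nginx.conf" "$dir/nginx.conf$suffix" || return 1
    cp -p "$dir/nginx.conf.dpkg-old" "$dir/nginx.conf.dpkg-old$suffix" || return 1
    # Restore the old config over the one the package install wrote.
    cp -p "$dir/nginx.conf.dpkg-old" "$dir/nginx.conf" || return 1
    # Check that the restored file still pulls in the per-site configs.
    grep -qF 'include /etc/nginx/sites-enabled/*;' "$dir/nginx.conf" || {
        echo "warning: sites-enabled include missing from $dir/nginx.conf" >&2
        return 2
    }
}

# On the real host, as root, something like:
#   revert_nginx_conf /etc/nginx .TICKET_55 && nginx -t && service nginx restart
```

Running `nginx -t` before the restart means a broken config is reported instead of taking the server down again.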
This appears to have fixed the outage, as of about 2013-07-31 18:58 GMT.
I believe the outage lasted from about 2013-07-31 01:40:15 +0000 to around 2013-07-31 18:58:00 +0000.
The start time estimate is from the shell pipelines below. It may have started earlier... Not quite sure.
The end time estimate is from looking at my watch when Za and I restarted nginx with the new config and then verified with our browsers that we could log in.
There may be a regression ticketed in: #59
ubuntu@ip-10-204-239-9:~$ cat /var/log/nginx/access.log{,.1} | sed 's/^[0-9.]* - - //' | sort | grep 502 | head -1
[31/Jul/2013:01:40:15 +0000] "GET /piwik/piwik.php?action_name=Least%20Authority&idsite=1&rec=1&r=357278&h=1&m=40&s=52&url=https%3A%2F%2Fleastauthority.com%2F&_id=538987ce8218b510&_idts=1375222655&_idvc=3&_idn=0&_refts=0&_viewts=1375232018&cookie=1&res=766x635 HTTP/1.1" 502 172 "https://leastauthority.com/" "Mozilla/5.0 (Windows NT 6.1; rv:17.0) Gecko/20100101 Firefox/17.0" "-"
ubuntu@ip-10-204-239-9:~$ zcat /var/log/nginx/access.log*.gz | sed 's/^[0-9.]* - - //' | sort | grep 502 | tail -1
[30/Jul/2013:00:20:46 +0000] "GET /piwik/piwik.php?action_name=Least%20Authority&idsite=1&rec=1&r=285027&h=1&m=21&s=21&url=https%3A%2F%2F54.224.104.157%2F&urlref=https%3A%2F%2F54.224.104.157%2Fcollect-email&_id=ed1a9e81aa35c666&_idts=1375123352&_idvc=3&_idn=0&_refts=0&_viewts=1375139639&pdf=0&qt=1&realp=1&wma=1&dir=0&fla=1&java=0&gears=0&ag=0&cookie=1&res=2560x1440&gt_ms=5721 HTTP/1.1" 200 61 "-" "Mozilla/5.0 (X11; Linux x86_64; rv:21.0) Gecko/20100101 Firefox/21.0"
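Note that `grep 502` matches that string anywhere in the line: the second result above is a 200 response whose `r=285027` parameter happens to contain `502`. A stricter filter could test only the status field. The sketch below assumes the quoted-request log layout shown in these lines, and `only_502` is a name invented here.

```shell
#!/bin/sh
# Print only the access-log lines whose HTTP status field is exactly 502.
# Splitting on double quotes leaves " status bytes " in field 3 for lines
# shaped like:  [time] "request" status bytes "referer" "agent"
only_502() {
    awk -F'"' '{ split($3, f, " "); if (f[1] == "502") print }'
}

# Possible use in the pipelines above, in place of `grep 502`:
#   cat /var/log/nginx/access.log{,.1} | sed 's/^[0-9.]* - - //' | sort | only_502 | head -1
```

With this filter the first pipeline should still report the 01:40:15 entry, while the 30 July line would no longer be returned as a match.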
http://analytics.leastauthority.com/ gives a generic nginx welcome page.