driskell / log-courier

The Log Courier Suite is a set of lightweight tools created to ship and process log files speedily and securely, with low resource usage, to Elasticsearch or Logstash instances.

log-courier 2.x: TLS mess with recognizing certificate CN's #289

Closed sysmonk closed 8 years ago

sysmonk commented 8 years ago

There's a problem with how log-courier 2.x recognises TLS certificate CNs.

# /usr/local/bin/log-courier-new -config /etc/log-courier.dfw.conf
2016/03/14 16:24:33.826802 Log Courier version 2.0.0-beta1 pipeline starting
2016/03/14 16:24:33.827396 [Loadbalance] Initialised new endpoint: logstash1.example.com:9001
2016/03/14 16:24:33.827668 [Loadbalance] Initialised new endpoint: logstash2.example.com:9001
2016/03/14 16:24:33.828006 [Loadbalance] Initialised new endpoint: logstash3.example.com:9001
2016/03/14 16:24:33.828470 [Loadbalance] Initialised new endpoint: logstash4.example.com:9001
2016/03/14 16:24:33.829033 Loading registrar data from /var/run//.log-courier
2016/03/14 16:24:33.829977 Pipeline ready
2016/03/14 16:24:33.832126 Skipping file (older than dead time of 1h0m0s): /tmp/test.log
2016/03/14 16:24:33.832272 Skipping file (older than dead time of 1h0m0s): /tmp/test2.log
2016/03/14 16:24:33.832458 Skipping file (older than dead time of 1h0m0s): /var/log/mail.log
2016/03/14 16:24:33.832549 Resuming harvester on a previously harvested file: /var/log/auth.log
2016/03/14 16:24:33.832635 Skipping file (older than dead time of 1h0m0s): /var/log/messages
2016/03/14 16:24:33.832716 Skipping file (older than dead time of 1h0m0s): /var/log/kern.log
2016/03/14 16:24:33.832797 Skipping file (older than dead time of 1h0m0s): /var/log/debug
2016/03/14 16:24:33.832824 [logstash1.example.com:9001] Attempting to connect to 192.168.0.142:9001 (logstash1.example.com)
2016/03/14 16:24:33.832874 Resuming harvester on a previously harvested file: /var/log/daemon.log
2016/03/14 16:24:33.832970 Resuming harvester on a previously harvested file: /var/log/firewall-noise.log
2016/03/14 16:24:33.833160 Started harvester at position 21838 (requested 21838): /var/log/mail.log
2016/03/14 16:24:33.833184 Started harvester at position 160097 (requested 160097): /var/log/firewall-noise.log
2016/03/14 16:24:33.833395 Started harvester at position 1764646 (requested 1764646): /var/log/auth.log
2016/03/14 16:24:33.833508 Started harvester at position 1746 (requested 1746): /var/log/messages
2016/03/14 16:24:33.833674 Started harvester at position 400 (requested 400): /var/log/kern.log
2016/03/14 16:24:33.833786 Started harvester at position 0 (requested 0): /var/log/debug
2016/03/14 16:24:33.833893 Started harvester at position 45203 (requested 45203): /var/log/daemon.log
2016/03/14 16:24:33.833971 Started harvester at position 2010 (requested 2010): /tmp/test.log
2016/03/14 16:24:33.834096 Started harvester at position 1170 (requested 1170): /tmp/test2.log
2016/03/14 16:24:33.834441 [logstash4.example.com:9001] Attempting to connect to 192.168.0.58:9001 (logstash4.example.com)
2016/03/14 16:24:33.834532 [logstash3.example.com:9001] Attempting to connect to 192.168.0.67:9001 (logstash3.example.com)
2016/03/14 16:24:33.834633 [logstash2.example.com:9001] Attempting to connect to 192.168.0.177:9001 (logstash2.example.com)
2016/03/14 16:24:33.849507 [logstash4.example.com:9001] Transport error, reconnecting: TLS Handshake failure with 192.168.0.58:9001 (logstash4.example.com): x509: certificate is valid for logstash4.example.com, not logstash2.example.com
2016/03/14 16:24:33.849744 [logstash4.example.com:9001] Marking endpoint as failed
2016/03/14 16:24:33.851364 0 payloads held for resend
2016/03/14 16:24:33.860108 [logstash3.example.com:9001] Transport error, reconnecting: TLS Handshake failure with 192.168.0.67:9001 (logstash3.example.com): x509: certificate is valid for logstash3.example.com, not logstash2.example.com
2016/03/14 16:24:33.860803 [logstash3.example.com:9001] Marking endpoint as failed
2016/03/14 16:24:33.861380 0 payloads held for resend
2016/03/14 16:24:33.895997 [logstash1.example.com:9001] Transport error, reconnecting: TLS Handshake failure with 192.168.0.142:9001 (logstash1.example.com): x509: certificate is valid for logstash1.example.com, not logstash2.example.com
2016/03/14 16:24:33.896944 [logstash1.example.com:9001] Marking endpoint as failed
2016/03/14 16:24:33.897483 0 payloads held for resend
2016/03/14 16:24:33.914224 [logstash2.example.com:9001] Connected to 192.168.0.177:9001 (logstash2.example.com)
2016/03/14 16:24:33.915252 [logstash2.example.com:9001] Send is now ready, awaiting new events
2016/03/14 16:24:33.915654 [logstash2.example.com:9001] Starting keepalive timeout

But the certs are good (and work fine in 1.x):

# for I in `seq 1 4`; do echo 'test' | openssl s_client -connect logstash${I}.example.com:9001 2>/dev/null | grep -A2 'Certificate chain'; done
Certificate chain
 0 s:/C=US/CN=logstash1.example.com
   i:/C=US/ST=California/L=San Francisco/O=EXAMPLE/OU=logstash/CN=logstash
Certificate chain
 0 s:/C=US/CN=logstash2.example.com
   i:/C=US/ST=California/L=San Francisco/O=EXAMPLE/OU=logstash/CN=logstash
Certificate chain
 0 s:/C=US/CN=logstash3.example.com
   i:/C=US/ST=California/L=San Francisco/O=EXAMPLE/OU=logstash/CN=logstash
Certificate chain
 0 s:/C=US/CN=logstash4.example.com
   i:/C=US/ST=California/L=San Francisco/O=EXAMPLE/OU=logstash/CN=logstash

If I change the server list to only one server (e.g. logstash4), it connects properly with no certificate errors.

The config is the same as in #287, but with "method": "load-balance".

driskell commented 8 years ago

Thanks!

I've been using a wildcard certificate or the same key on every server. It looks like something config-wise is being shared between endpoints in a bad way.