tagomoris / fluent-plugin-secure-forward

OpenSSL::SSL::SSLError: SSL_connect returned=1 errno=0 state=SSLv2/v3 read server hello A: unknown protocol #41

Closed machytkafitanalytics closed 8 years ago

machytkafitanalytics commented 8 years ago

Hi, after many tries with no result I would like to ask you about this error.

I run td-agent 0.12.20, tailing log files on several instances on DigitalOcean and sending data via secure_forward to google-fluentd 1.5.8 running on one service instance on Google, where the data is fed into Google Stackdriver.

For several weeks everything worked perfectly. But on Wednesday (2016-07-06) we installed security updates on the service instance on Google, and since restarting it the td-agents on DO are no longer able to establish a connection. I keep getting these error messages:

2016-07-08 15:44:12 +0000 fluent.debug: {"host":"xxxxx","address":"xxxxxx","port":24284,"message":"create tcp socket to node host=\"xxxxxx\" address=\"xxxxxx\" port=24284"}
2016-07-08 15:44:12 +0000 [debug]: trying to connect ssl session host="xxxxxx" address="xxxxx" port=24284
2016-07-08 15:44:12 +0000 fluent.debug: {"host":"xxxxxx","address":"xxxxxx","port":24284,"message":"trying to connect ssl session host=\"xxxxxx\" address=\"xxxxxx\" port=24284"}
2016-07-08 15:44:12 +0000 [warn]: failed to establish SSL connection error_class=OpenSSL::SSL::SSLError error=# host="xxxxxx" address="xxxxxx" port=24284
2016-07-08 15:44:12 +0000 fluent.warn: {"error_class":"OpenSSL::SSL::SSLError","error":"#","host":"xxxxxxx","address":"xxxxxx","port":24284,"message":"failed to establish SSL connection error_class=OpenSSL::SSL::SSLError error=# host=\"xxxxxx\" address=\"xxxxxx\" port=24284"}
2016-07-08 15:44:22 +0000 [debug]: SSL connection is not established until timemout host="xxxxxxx" port=24284 timeout=10

The secure_forward plugin is updated to the latest version on all sites using the embedded fluentd-gem. I can telnet to port 24284 from DO to Google, and I can ssh to the service instance on Google from all DO instances as the td-agent user without problems. Any ideas what to do?
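A successful telnet only shows that the TCP connection can be opened; the error in the log is raised one step later, during the TLS handshake. The two steps can be tested separately outside of fluentd. The following is a minimal sketch, assuming Python 3 is available on the sending instance. The function name `probe` and the host name in the usage note are hypothetical, and certificate verification is switched off on purpose, so this only checks whether the peer answers with TLS at all.

```python
import socket
import ssl


def probe(host: str, port: int = 24284, timeout: float = 10.0) -> str:
    """Try a TCP connect, then a TLS handshake, and report which step failed."""
    try:
        raw = socket.create_connection((host, port), timeout=timeout)
    except OSError as exc:
        # Same level of check as telnet: can the port be reached at all?
        return f"tcp connect failed: {exc}"

    with raw:
        context = ssl.create_default_context()
        # Assumption: the receiver may use a private CA, so verification is
        # disabled. check_hostname must be cleared before verify_mode.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
        try:
            with context.wrap_socket(raw, server_hostname=host) as tls:
                return f"tls handshake ok: {tls.version()}"
        except (ssl.SSLError, OSError) as exc:
            # A peer that answers with non-TLS bytes ends up here.
            return f"tls handshake failed: {exc}"


if __name__ == "__main__":
    # "receiver.example.com" is a placeholder, not a host from this thread.
    print(probe("receiver.example.com"))
```

If this reports a failed handshake while the TCP connect succeeds, the problem sits between the open port and the TLS layer of whatever is listening there, not in basic network reachability.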

tagomoris commented 8 years ago

Can you run tcpdump on the host on Google (Compute Engine)? It will show whether your problem is caused by networking or by the plugin code.

machytkafitanalytics commented 8 years ago

Hi, thanks for the reply, but can you please specify what I should look for? I see repeated connection attempts in tcpdump and I see packets, but nothing that looks like an error code etc...

tagomoris commented 8 years ago

Did you see any packets on TCP port 24284? (They can be captured with tcpdump -i dev_name dst port 24284.) If so, there should be some debug or trace logs from in_secure_forward in google-fluentd. Can you see them?

machytkafitanalytics commented 8 years ago

In tcpdump -i .... I see these repeated blocks:

08:15:26.750567 IP xxxxxxxxx.35333 > my.target.google.instance.24284: Flags [S], seq 3859221755, win 29200, options [mss 1460,sackOK,TS val 176397104 ecr 0,nop,wscale 8], length 0
08:15:26.750623 IP my.target.google.instance.24284 > xxxxxxxxx.35333: Flags [S.], seq 596849254, ack 3859221756, win 28160, options [mss 1420,sackOK,TS val 58127313 ecr 176397104,nop,wscale 7], length 0
08:15:26.760454 IP xxxxxxxxx.35333 > my.target.google.instance.24284: Flags [.], ack 1, win 115, options [nop,nop,TS val 176397107 ecr 58127313], length 0
08:15:26.765403 IP my.target.google.instance.24284 > xxxxxxxxx.35333: Flags [P.], seq 1:44, ack 1, win 220, options [nop,nop,TS val 58127317 ecr 176397107], length 43
08:15:26.768763 IP xxxxxxxxx.35333 > my.target.google.instance.24284: Flags [P.], seq 1:518, ack 1, win 115, options [nop,nop,TS val 176397109 ecr 58127313], length 517
08:15:26.768793 IP my.target.google.instance.24284 > xxxxxxxxx.35333: Flags [.], ack 518, win 229, options [nop,nop,TS val 58127318 ecr 176397109], length 0
08:15:26.768903 IP my.target.google.instance.24284 > xxxxxxxxx.35333: Flags [R.], seq 44, ack 518, win 229, options [nop,nop,TS val 58127318 ecr 176397109], length 0
08:15:26.772350 IP xxxxxxxxx.35333 > my.target.google.instance.24284: Flags [.], ack 44, win 115, options [nop,nop,TS val 176397110 ecr 58127317], length 0
08:15:26.772372 IP my.target.google.instance.24284 > xxxxxxxxx.35333: Flags [R], seq 596849298, win 0, length 0

Those are connection attempts from a DO instance, but nothing shows up in the google-fluentd log. It looks like the attempt is refused at some lower level.
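One detail in the capture is worth reading closely: after the three-way handshake, the Google side pushes 43 bytes (seq 1:44, ack 1) before it has acknowledged any data from the client, and resets the connection right after the client's 517-byte packet arrives. A TLS server normally waits for the client to send first, so this ordering may indicate that whatever is listening on the port is not answering with a TLS handshake, which would be consistent with the "unknown protocol" error on the sender. The sketch below pulls those facts out of tcpdump's text output. It assumes the one-line-per-packet format shown above; the names `Packet`, `parse`, `first_payload` and `resets` are hypothetical helpers, not part of any tool mentioned in this thread.

```python
import re
from typing import List, NamedTuple, Optional

# Matches the one-line-per-packet text format of the capture in this thread.
LINE = re.compile(
    r"^(?P<time>\S+) IP (?P<src>\S+) > (?P<dst>\S+): "
    r"Flags \[(?P<flags>[^\]]+)\].*length (?P<length>\d+)$"
)


class Packet(NamedTuple):
    time: str
    src: str
    dst: str
    flags: str
    length: int


def parse(text: str) -> List[Packet]:
    """Turn tcpdump text output into Packet tuples, skipping unmatched lines."""
    packets = []
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if match:
            packets.append(Packet(match["time"], match["src"], match["dst"],
                                  match["flags"], int(match["length"])))
    return packets


def first_payload(packets: List[Packet]) -> Optional[Packet]:
    """Return the first packet that carries data, i.e. who speaks first."""
    return next((p for p in packets if p.length > 0), None)


def resets(packets: List[Packet]) -> List[Packet]:
    """Return all packets with the RST flag set."""
    return [p for p in packets if "R" in p.flags]


# Excerpt of the capture posted above (host names already anonymised there).
CAPTURE = """\
08:15:26.760454 IP xxxxxxxxx.35333 > my.target.google.instance.24284: Flags [.], ack 1, win 115, options [nop,nop,TS val 176397107 ecr 58127313], length 0
08:15:26.765403 IP my.target.google.instance.24284 > xxxxxxxxx.35333: Flags [P.], seq 1:44, ack 1, win 220, options [nop,nop,TS val 58127317 ecr 176397107], length 43
08:15:26.768763 IP xxxxxxxxx.35333 > my.target.google.instance.24284: Flags [P.], seq 1:518, ack 1, win 115, options [nop,nop,TS val 176397109 ecr 58127313], length 517
08:15:26.768903 IP my.target.google.instance.24284 > xxxxxxxxx.35333: Flags [R.], seq 44, ack 518, win 229, options [nop,nop,TS val 58127318 ecr 176397109], length 0
"""

if __name__ == "__main__":
    packets = parse(CAPTURE)
    first = first_payload(packets)
    if first is not None:
        print("first data packet:", first.src, "->", first.dst, first.length, "bytes")
    for rst in resets(packets):
        print("reset sent by:", rst.src, "at", rst.time)
```

To see what those 43 bytes actually contain, the capture would need to include payloads (for example with tcpdump's -X or -A option), which the output above does not show.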

Btw - I created a new Google instance and redirected traffic to it, and everything works well. So this is not a problem in your plugin, and I will close this issue. Thanks for your work on this plugin!

tagomoris commented 8 years ago

Okay, thank you for reporting the conclusion of your case!