Closed machytkafitanalytics closed 8 years ago
Can you run tcpdump on the host on Google (Compute Engine)?
It will show whether your problem is caused by networking or by the plugin code.
Hi, thanks for the reply, but can you please specify what I should look for? I see repeated connection attempts in tcpdump and I see packets, but nothing that looks like an error code, etc.
Did you see any packets on TCP port 24284? (They can be captured with tcpdump -i dev_name dst port 24284)
If so, in_secure_forward of google-fluentd should have written some debug or trace logs. Can you see them?
In tcpdump -i .... I see these repeated blocks:
08:15:26.750567 IP xxxxxxxxx.35333 > my.target.google.instance.24284: Flags [S], seq 3859221755, win 29200, options [mss 1460,sackOK,TS val 176397104 ecr 0,nop,wscale 8], length 0
08:15:26.750623 IP my.target.google.instance.24284 > xxxxxxxxx.35333: Flags [S.], seq 596849254, ack 3859221756, win 28160, options [mss 1420,sackOK,TS val 58127313 ecr 176397104,nop,wscale 7], length 0
08:15:26.760454 IP xxxxxxxxx.35333 > my.target.google.instance.24284: Flags [.], ack 1, win 115, options [nop,nop,TS val 176397107 ecr 58127313], length 0
08:15:26.765403 IP my.target.google.instance.24284 > xxxxxxxxx.35333: Flags [P.], seq 1:44, ack 1, win 220, options [nop,nop,TS val 58127317 ecr 176397107], length 43
08:15:26.768763 IP xxxxxxxxx.35333 > my.target.google.instance.24284: Flags [P.], seq 1:518, ack 1, win 115, options [nop,nop,TS val 176397109 ecr 58127313], length 517
08:15:26.768793 IP my.target.google.instance.24284 > xxxxxxxxx.35333: Flags [.], ack 518, win 229, options [nop,nop,TS val 58127318 ecr 176397109], length 0
08:15:26.768903 IP my.target.google.instance.24284 > xxxxxxxxx.35333: Flags [R.], seq 44, ack 518, win 229, options [nop,nop,TS val 58127318 ecr 176397109], length 0
08:15:26.772350 IP xxxxxxxxx.35333 > my.target.google.instance.24284: Flags [.], ack 44, win 115, options [nop,nop,TS val 176397110 ecr 58127317], length 0
08:15:26.772372 IP my.target.google.instance.24284 > xxxxxxxxx.35333: Flags [R], seq 596849298, win 0, length 0
Those are connection attempts from the DO instance, but there is nothing in the google-fluentd log. It looks like the attempt is refused at some lower level.
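For anyone reading a capture like the one above, a small script can make the packet order easier to follow. This is only a rough sketch, not part of the plugin: the names PACKET_RE, SAMPLE, summarize and first_reset are made up for illustration, and the regular expression assumes the default one-line-per-packet tcpdump text format shown in this thread. SAMPLE holds three packets copied from the capture.

```python
import re

# Matches the default tcpdump text format seen in this thread, e.g.
# "08:15:26.768903 IP src > dst: Flags [R.], ..., length 0".
PACKET_RE = re.compile(
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d+) IP (?P<src>\S+) > (?P<dst>\S+): "
    r"Flags \[(?P<flags>[^\]]+)\].*?length (?P<length>\d+)"
)

# Three packets copied from the capture above.
SAMPLE = """\
08:15:26.768763 IP xxxxxxxxx.35333 > my.target.google.instance.24284: Flags [P.], seq 1:518, ack 1, win 115, options [nop,nop,TS val 176397109 ecr 58127313], length 517
08:15:26.768793 IP my.target.google.instance.24284 > xxxxxxxxx.35333: Flags [.], ack 518, win 229, options [nop,nop,TS val 58127318 ecr 176397109], length 0
08:15:26.768903 IP my.target.google.instance.24284 > xxxxxxxxx.35333: Flags [R.], seq 44, ack 518, win 229, options [nop,nop,TS val 58127318 ecr 176397109], length 0
"""


def summarize(capture):
    """Return one (source, flags, payload_length) tuple per packet found."""
    return [
        (m["src"], m["flags"], int(m["length"]))
        for m in PACKET_RE.finditer(capture)
    ]


def first_reset(rows):
    """Return (index, source) of the first packet carrying RST, or None."""
    for index, (src, flags, _length) in enumerate(rows):
        if "R" in flags:
            return index, src
    return None


if __name__ == "__main__":
    rows = summarize(SAMPLE)
    for src, flags, length in rows:
        print(f"{src:40} [{flags}] length={length}")
    print("first reset:", first_reset(rows))
```

Run against the full capture, this should show the three-way handshake completing, the server sending 43 bytes, the DO side sending 517 bytes, and then the reset coming from the Google side on port 24284.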
Btw, I created a new Google instance and redirected the traffic to it, and everything is working well. So this is not a problem in your plugin, and I will close this issue. Thanks for your work on this plugin!
Okay, thank you for reporting the conclusion of your case!
Hi, after many tries with no result I would like to ask you about this error.
I run td-agent 0.12.20 tailing log files on several instances on DigitalOcean, sending data via secure_forward to google-fluentd 1.5.8 running on one service instance on Google, where the data is fed into Google Stackdriver.
For several weeks everything was working perfectly. But on Wednesday (2016-07-06) we installed security updates on the service instance on Google, and after restarting it the td-agents on DO are no longer able to create a connection. I still get these error messages:
2016-07-08 15:44:12 +0000 fluent.debug: {"host":"xxxxx","address":"xxxxxx","port":24284,"message":"create tcp socket to node host=\"xxxxxx\" address=\"xxxxxx\" port=24284"}
2016-07-08 15:44:12 +0000 [debug]: trying to connect ssl session host="xxxxxx" address="xxxxx" port=24284
2016-07-08 15:44:12 +0000 fluent.debug: {"host":"xxxxxx","address":"xxxxxx","port":24284,"message":"trying to connect ssl session host=\"xxxxxx\" address=\"xxxxxx\" port=24284"}
2016-07-08 15:44:12 +0000 [warn]: failed to establish SSL connection error_class=OpenSSL::SSL::SSLError error=# host="xxxxxx" address="xxxxxx" port=24284
2016-07-08 15:44:12 +0000 fluent.warn: {"error_class":"OpenSSL::SSL::SSLError","error":"#","host":"xxxxxxx","address":"xxxxxx","port":24284,"message":"failed to establish SSL connection error_class=OpenSSL::SSL::SSLError error=# host=\"xxxxxx\" address=\"xxxxxx\" port=24284"}
2016-07-08 15:44:22 +0000 [debug]: SSL connection is not established until timemout host="xxxxxxx" port=24284 timeout=10
The secure forwarder on all sites is updated to the latest version using the embedded fluentd-gem. I can telnet to port 24284 from DO to Google, and I can ssh to the service instance on Google from all DO instances without problems as the td-agent user. Any ideas what to do?
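Since telnet succeeds but the SSL session does not, it may help to test the two stages separately from a DO instance. The following is a minimal sketch using only the Python standard library, not anything from td-agent or the plugin. The function name probe_tls and its return strings are made up for illustration, and certificate verification is switched off on the assumption that the receiver uses a self-signed certificate.

```python
import socket
import ssl


def probe_tls(host, port, timeout=10.0):
    """Return 'tcp_failed', 'tls_failed' or 'ok' for host:port."""
    # Stage 1: plain TCP connect, the same thing telnet checks.
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return "tcp_failed"

    # Stage 2: TLS handshake on top of the open TCP connection.
    # Assumption: the server certificate is self-signed, so it is not verified.
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    try:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            tls.version()  # the handshake has completed at this point
        return "ok"
    except (ssl.SSLError, OSError):
        # Covers handshake errors, connection resets and timeouts.
        return "tls_failed"
    finally:
        sock.close()
```

Calling probe_tls with the Google instance address and port 24284 would be expected to return 'tls_failed' in the situation described here (TCP reachable, handshake not completing), and 'tcp_failed' if the port were blocked outright.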