logstash-plugins / logstash-input-sqs

Apache License 2.0

Logstash hangs with multiple sqs inputs #10

Closed: jordansissel closed this issue 9 years ago

jordansissel commented 9 years ago

(This issue was originally filed by @loganbhardy at https://github.com/elastic/logstash/issues/2884)


Logstash 1.4.2

Logstash seems stable with a single SQS input but hangs after a short time if I add two or more to my config. Nothing is written to the logs when this occurs. The CPU gets pegged at 100% and messages are no longer picked up from the SQS queue. I was told in passing by an engineer at Elasticon that lowering the number of threads on the SQS input could help, but that had no effect. I've set threads as high as 8 with a single SQS input and things seemed stable. The problem only seems to occur if I add more than two SQS inputs. One other configuration of note is that I had to disable the SQS checksum validation in order to get around an issue there (see issue #2190). I should also note that I am running two central Logstash servers with this config. I have not yet tried this with Logstash 1.5.

Here is my logstash config.

input {
  sqs {
    region => "us-west-1"
    queue => "logstash-app1"
    access_key_id => "MYACCESSID"
    secret_access_key => "MYACCESSKEY"
    threads => 1
    use_ssl => false
  }
  sqs {
    region => "us-west-1"
    queue => "logstash-app2"
    access_key_id => "MYACCESSID"
    secret_access_key => "MYACCESSKEY"
    threads => 1
    use_ssl => false
  }
  sqs {
    region => "us-west-1"
    queue => "logstash-app3"
    access_key_id => "MYACCESSID"
    secret_access_key => "MYACCESSKEY"
    threads => 1
    use_ssl => false
  }
  sqs {
    region => "us-west-1"
    queue => "logstash-app4"
    access_key_id => "MYACCESSID"
    secret_access_key => "MYACCESSKEY"
    threads => 1
    use_ssl => false
  }
}
filter {
  if [type] == "app1" or "app2" or "app3" or "app4" {
    mutate {
      gsub => [ "message", "\x1B[([0-9]{1,2}(;[0-9]{1,2})?)?[m|K]", "" ]
    }
    grok {
      match => [ "message", "(?m)%{TIMESTAMP_ISO8601:timestamp}\s-\s%{LOGLEVEL:level}:\s[(?

JonathanSerafini commented 9 years ago

Hi Jordan, are you getting any errors in your logs/logstash.log?

I'm currently running with 7 SQS queues (1 very busy w/ 50 threads, 1 busy w/ 20 threads, 5 quiet w/ 5 threads) on c3.large logstash indexers, w/ the CPU anywhere between 15% and 80% depending on load and autoscaling.
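
For illustration only, a multi-queue setup with different thread counts per queue might look roughly like the sketch below. The queue names are hypothetical, credentials are omitted, and only one of the five quiet queues is shown; the option names are the ones already used in the config earlier in this issue.

```
input {
  # Very busy queue: most worker threads (queue name is hypothetical)
  sqs {
    region => "us-west-1"
    queue => "logstash-very-busy"
    threads => 50
  }
  # Busy queue
  sqs {
    region => "us-west-1"
    queue => "logstash-busy"
    threads => 20
  }
  # One of the quiet queues; the others would repeat this block
  sqs {
    region => "us-west-1"
    queue => "logstash-quiet-1"
    threads => 5
  }
}
```

This is a sketch of the layout, not the actual config in use; suitable thread counts depend on queue volume and instance size.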

Early on with the setup I was having a bunch of problems, including 100% CPU usage, until I noticed the checksum errors, which I chose to ignore ( https://github.com/JonathanSerafini/logstash-input-sqs/tree/sqs_checksum_validation ), as well as the fact that I hadn't configured my queue timeouts ( https://github.com/logstash-plugins/logstash-input-sqs/pull/12 ).

With both those changes, I've yet to hit any issues. If your problem's different, I'd love to hear more about it.

ph commented 9 years ago

The hang issue is the same as https://github.com/logstash-plugins/logstash-input-sqs/issues/5

It should be fixed in #18, since it uses the latest aws-sdk library, which includes the memory leak fix.