openzipkin / zipkin-aws

Reporters and collectors for use in Amazon's cloud
Apache License 2.0

suddenly not processing kinesis stream #176

Closed slikk66 closed 4 years ago

slikk66 commented 4 years ago

Hi, looking for any tips. We've had our zipkin-aws system working for a while now. The containers send data to the Kinesis stream, and the service reads the stream and stores it in Elasticsearch.

We have set up a separate Kinesis consumer on the side and can see the messages coming in. The stream's "incoming records" metric shows a steady flow.

The messages in the stream look correct. We even reverted to a previous build where the stream was working, to double-check that the message format isn't the issue.

The log shows this over and over:

2020-06-05 16:50:36.251 INFO 1 --- [e-service-api-0] c.a.s.k.c.l.w.Worker : Current stream shard assignments: shardId-000000000000
2020-06-05 16:50:36.252 INFO 1 --- [e-service-api-0] c.a.s.k.c.l.w.Worker : Sleeping ...
2020-06-05 16:49:35.227 INFO 1 --- [e-service-api-0] c.a.s.k.c.l.w.Worker : Current stream shard assignments: shardId-000000000000
2020-06-05 16:49:35.227 INFO 1 --- [e-service-api-0] c.a.s.k.c.l.w.Worker : Sleeping ...

Elasticsearch has space left on it; the incoming data just stopped.

I've re-launched the zipkin container, and there are no obvious error messages in the log.

What could be the issue? Any tips on how to debug this?

thanks!

slikk66 commented 4 years ago

we also tried updating to latest zipkin-aws container: https://hub.docker.com/layers/openzipkin/zipkin-aws/latest/images/sha256-1f7131bfa484c7768dd824ecc8fea84d4405d6aacebb48689cd3d15c4179e80e

slikk66 commented 4 years ago

Figured this out; posting it here for others.

The issue was that Elasticsearch was out of shards on our single node.

The collector didn't surface this error, but I found it in our other system, which ships logs from CloudWatch (CW) to Elasticsearch.

Basically, the simple fix if you're using a single-node ES is to set the shard count to 1 per index. Then you can have up to 1,000 indexes on the node.

PUT _template/smaller-index-template
{
  "index_patterns": ["cwl-*", "zipkin-*"],
  "settings": {
    "number_of_shards": 1
  }
}
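
For anyone sizing this, here is a rough sketch of the arithmetic behind the "1,000 indexes" figure. It assumes the default `cluster.max_shards_per_node` of 1000 and that the limit is counted cluster-wide across primaries and replicas. The function names and the 5-shard, 1-replica comparison are illustrative assumptions, not values taken from this setup.

```python
def shards_needed(index_count, shards_per_index, replicas_per_index=0):
    """Total shards (primaries plus replicas) for equally configured indices."""
    return index_count * shards_per_index * (1 + replicas_per_index)


def max_indices(data_nodes, shards_per_index, replicas_per_index=0,
                max_shards_per_node=1000):
    """How many such indices fit under the cluster-wide shard limit.

    max_shards_per_node=1000 is the assumed default; check your own
    cluster settings before relying on it.
    """
    limit = data_nodes * max_shards_per_node
    return limit // (shards_per_index * (1 + replicas_per_index))


# Single node, 1 shard per index, no replicas (the fix described above).
print(max_indices(1, 1))      # 1000
# Hypothetical comparison: 5 shards and 1 replica per index.
print(max_indices(1, 5, 1))   # 100
```

Note that an index template only applies to indices created after it is installed, so existing indices keep their shard count until they are deleted or reindexed.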

codefromthecrypt commented 4 years ago

thanks for replying with the answer!
