hashicorp / consul

Consul is a distributed, highly available, and data center aware solution to connect and configure applications across dynamic, distributed infrastructure.
https://www.consul.io

Add information in Consul logs for Nomad service checks #7521

Open scalp42 opened 4 years ago

scalp42 commented 4 years ago

Hi folks,

When a Nomad service registered in Consul fails, the Consul logs contain no information about the service itself:

2020-03-26T20:43:43.067Z [WARN]  agent: Check is now critical: check=_nomad-check-0d041aa75f1c0c212e9b037abd9500a95466fd44

2020-03-26T20:43:43.408Z [WARN]  agent: Check is now critical: check=_nomad-check-94eeeeb2d63492142d6dd8d5cbf7e260e74cceb8

Unfortunately in this scenario, we have no idea which service/task is failing. Would it be possible to add the extra information? I believe it would help a lot with debugging in general.
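Until the log line carries more detail, a first step is to collect the distinct failing check IDs from the agent log. The following is a minimal sketch that assumes the exact `[WARN]  agent: Check is now critical: check=...` format quoted above; the function name `critical_check_ids` is made up for illustration, and other Consul versions may word the line differently.

```python
import re

# Pattern for the WARN line quoted in this issue. This is an assumption about
# the log format; adjust it if your Consul version logs the event differently.
CRITICAL_RE = re.compile(r"\[WARN\]\s+agent: Check is now critical: check=(\S+)")


def critical_check_ids(lines):
    """Return the unique check IDs reported as critical, in first-seen order."""
    seen = {}
    for line in lines:
        match = CRITICAL_RE.search(line)
        if match:
            # dict keys keep insertion order, which gives ordered de-duplication
            seen.setdefault(match.group(1), None)
    return list(seen)
```

The input can be any iterable of lines, for example an open log file or the captured output of `journalctl -u consul`.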

We noticed that when a service is healthy, we do get a bit of extra information that includes the job and task name:

2020-03-26T20:43:45.206Z [INFO]  agent: Synced service: service=_nomad-task-30a5f647-7330-c6f3-b145-9921b22f595b-sidekiq-queue-priority-platform-sidekiq-http
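The service ID in that line can be split mechanically. The sketch below assumes the UUID following `_nomad-task-` is the Nomad allocation ID and treats everything after it as an opaque suffix, since task, service, and port names may themselves contain hyphens. `split_service_id` is a hypothetical helper, not part of Nomad or Consul.

```python
import re

# Assumed layout: _nomad-task-<allocation UUID>-<task/service/port suffix>.
# The suffix is left unsplit because its parts can contain hyphens.
SERVICE_ID_RE = re.compile(
    r"^_nomad-task-"
    r"(?P<alloc_id>[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})"
    r"-(?P<rest>.+)$"
)


def split_service_id(service_id):
    """Return (alloc_id, rest) for a Nomad-style service ID, or None if it does not match."""
    match = SERVICE_ID_RE.match(service_id)
    if match is None:
        return None
    return match.group("alloc_id"), match.group("rest")
```

If the assumption holds, the extracted ID can be passed to `nomad alloc status` to find the job and task.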

Let us know if there's a way to map those generic _nomad-check-UUID IDs to an actual service.

Thanks in advance!

jsosulska commented 4 years ago

Hi @scalp42

Thanks for bringing this up. I'm adding a new consul-nomad tag to have the teams discuss this and see where to go from here.

We'll post any updates here! Best, Jono

yuranich commented 3 years ago

Hi! Any update on this issue? It would indeed be great to see more informative output here.

nandha4083 commented 2 years ago

Hi Team, I am also seeing these warnings. Any information would be great.

$ sudo journalctl -u consul -f
Feb 24 07:30:47 ip-172-16-0-180 consul[5513]: 2022-02-24T07:30:47.641Z [WARN]  agent: Check is now critical: check=_nomad-check-40cb96b8d5f0ac5bf3cb66480ffed5e8f9089fe8
Feb 24 07:31:02 ip-172-16-0-180 consul[5513]: 2022-02-24T07:31:02.642Z [WARN]  agent: Check is now critical: check=_nomad-check-40cb96b8d5f0ac5bf3cb66480ffed5e8f9089fe8
Feb 24 07:31:17 ip-172-16-0-180 consul[5513]: 2022-02-24T07:31:17.643Z [WARN]  agent: Check is now critical: check=_nomad-check-40cb96b8d5f0ac5bf3cb66480ffed5e8f9089fe8

kisunji commented 2 years ago

Although a service holds references to its health checks, a check does not know about its service at the point where these logs are emitted. Passing the service ID through is not impossible, but it seems like a non-trivial effort.

We could accept community contributions for this feature (or discuss it in more detail with anyone willing to try), but our team won't be prioritizing this at the moment.

nathanpalmer commented 3 months ago

Is there at least a way to look up what this check is for through Nomad or the Nomad API?
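One possible route goes through Consul rather than Nomad: the agent endpoint `GET /v1/agent/checks` returns the checks registered with that agent, keyed by check ID, and each entry carries `ServiceID` and `ServiceName` fields. The sketch below assumes an agent reachable at `http://127.0.0.1:8500` and an optional ACL token; the helper names `fetch_agent_checks` and `service_for_check` are made up for illustration. It has to be run against the agent that logged the warning, because the endpoint only lists that agent's local checks.

```python
import json
import urllib.request


def fetch_agent_checks(base_url="http://127.0.0.1:8500", token=None):
    """Fetch /v1/agent/checks from a Consul agent; returns a dict keyed by check ID."""
    request = urllib.request.Request(f"{base_url}/v1/agent/checks")
    if token:
        # Only needed when ACLs are enabled on the agent.
        request.add_header("X-Consul-Token", token)
    with urllib.request.urlopen(request, timeout=5) as response:
        return json.load(response)


def service_for_check(checks, check_id):
    """Return (ServiceID, ServiceName) for check_id, or None if the agent does not list it."""
    check = checks.get(check_id)
    if check is None:
        return None
    return check.get("ServiceID", ""), check.get("ServiceName", "")
```

For example, `service_for_check(fetch_agent_checks(), "_nomad-check-40cb96b8d5f0ac5bf3cb66480ffed5e8f9089fe8")` would return the owning service if that check is still registered on the local agent.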