danihodovic / celery-exporter

A Prometheus exporter for Celery metrics
MIT License
377 stars, 83 forks

More detailed explanation #266

Closed Anghille closed 9 months ago

Anghille commented 9 months ago

Hi,

I am trying to set up celery-exporter (with RabbitMQ as the broker), but only 4 metrics are being shown in my Prometheus instance (celery_task_sent_created, celery_task_sent_total, celery_worker_up and celery_worker_tasks_active). I tried the control command enable_events, the --events flag, and setting worker_send_task_events = True and task_send_sent_event = True in my celeryconfig.py, without any success.

Do you have any idea why this might happen?
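
For readers following along, the two settings named above could be written as follows in a celeryconfig.py. This is a minimal sketch, not the poster's actual file; it assumes the module is loaded with app.config_from_object("celeryconfig").

```python
# celeryconfig.py (sketch): settings that make Celery emit task events.
# Only the two settings quoted in the comment above are shown; a real
# configuration would also contain broker and backend settings.

# Workers publish task-related events (task-received, task-started, ...).
# This sets the default for the worker's -E / --task-events flag.
worker_send_task_events = True

# Clients publish a task-sent event for every task they queue.
task_send_sent_event = True
```

With these in place, an event consumer such as the exporter has task events to count, provided the tasks actually run through the code path in Celery that emits them.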

danihodovic commented 9 months ago

Can you post a detailed log of the metrics you are seeing from the exporter?

Anghille commented 9 months ago

Right now, I just get the following:

2023-09-14 14:24:36.847 | INFO | src.http_server:start_http_server:66 - Started celery-exporter at host='0.0.0.0' on port='9808'

If you need more logs, can you tell me how to get them? Cheers :)

Anghille commented 9 months ago

Never mind, found it:

As you can see, only some of the metrics are being updated.

2023-09-15 07:30:30.511 | DEBUG | src.exporter:track_worker_heartbeat:324 - Updated gauge='celery_worker_tasks_active' value='0'
2023-09-15 07:30:30.511 | DEBUG | src.exporter:track_worker_heartbeat:327 - Updated gauge='celery_worker_up' value='1'
2023-09-15 07:30:30.564 | DEBUG | src.exporter:track_worker_heartbeat:316 - Received event='worker-heartbeat' for worker='xxxx'
2023-09-15 07:30:30.564 | DEBUG | src.exporter:track_worker_heartbeat:324 - Updated gauge='celery_worker_tasks_active' value='0'
2023-09-15 07:30:30.565 | DEBUG | src.exporter:track_worker_heartbeat:327 - Updated gauge='celery_worker_up' value='1'
2023-09-15 07:30:32.513 | DEBUG | src.exporter:track_worker_heartbeat:316 - Received event='worker-heartbeat' for worker='xxxx'
2023-09-15 07:30:32.514 | DEBUG | src.exporter:track_worker_heartbeat:324 - Updated gauge='celery_worker_tasks_active' value='0'
2023-09-15 07:30:32.515 | DEBUG | src.exporter:track_worker_heartbeat:327 - Updated gauge='celery_worker_up' value='1'
2023-09-15 07:30:32.565 | DEBUG | src.exporter:track_worker_heartbeat:316 - Received event='worker-heartbeat' for worker='xxxx'
2023-09-15 07:30:32.566 | DEBUG | src.exporter:track_worker_heartbeat:324 - Updated gauge='celery_worker_tasks_active' value='0'
2023-09-15 07:30:32.567 | DEBUG | src.exporter:track_worker_heartbeat:327 - Updated gauge='celery_worker_up' value='1'
2023-09-15 07:30:34.515 | DEBUG | src.exporter:track_worker_heartbeat:316 - Received event='worker-heartbeat' for worker='xxxx'
2023-09-15 07:30:34.515 | DEBUG | src.exporter:track_worker_heartbeat:324 - Updated gauge='celery_worker_tasks_active' value='0'
2023-09-15 07:30:34.515 | DEBUG | src.exporter:track_worker_heartbeat:327 - Updated gauge='celery_worker_up' value='1'
2023-09-15 07:30:34.566 | DEBUG | src.exporter:track_worker_heartbeat:316 - Received event='worker-heartbeat' for worker='xxxx'
2023-09-15 07:30:34.566 | DEBUG | src.exporter:track_worker_heartbeat:324 - Updated gauge='celery_worker_tasks_active' value='0'
2023-09-15 07:30:34.567 | DEBUG | src.exporter:track_worker_heartbeat:327 - Updated gauge='celery_worker_up' value='1'
2023-09-15 07:30:36.518 | DEBUG | src.exporter:track_worker_heartbeat:316 - Received event='worker-heartbeat' for worker='xxxx'
2023-09-15 07:30:36.519 | DEBUG | src.exporter:track_worker_heartbeat:324 - Updated gauge='celery_worker_tasks_active' value='0'
2023-09-15 07:30:36.520 | DEBUG | src.exporter:track_worker_heartbeat:327 - Updated gauge='celery_worker_up' value='1'
2023-09-15 07:30:36.569 | DEBUG | src.exporter:track_worker_heartbeat:316 - Received event='worker-heartbeat' for worker='xxxx'
2023-09-15 07:30:36.570 | DEBUG | src.exporter:track_worker_heartbeat:324 - Updated gauge='celery_worker_tasks_active' value='0'
2023-09-15 07:30:36.570 | DEBUG | src.exporter:track_worker_heartbeat:327 - Updated gauge='celery_worker_up' value='1'
2023-09-15 07:30:38.522 | DEBUG | src.exporter:track_worker_heartbeat:316 - Received event='worker-heartbeat' for worker='xxxx'

Anghille commented 9 months ago

I figured it out :) I expected the exporter to get the metrics from every existing queue, but after reading your code, I realized I could set the '-Q' parameter to my queue names, and now other metrics are showing up :)

For reference, I added this line to my exporter deployment:

spec:
  container:
    args: ["--accept-content", "json", "--retry-interval", "1", "--log-level=DEBUG", "-Q", "my-queue-name"]

Anghille commented 9 months ago

I spoke too soon. Only my worker metrics are accessible. I can't seem to get the event metrics, even when I pass -E on the Celery CLI and use the correct configs.

Anghille commented 9 months ago

After diving fully into the exporter code (I forked it and added logging), I now suspect that the exporter is working as intended and that the problem comes from my Celery application. I will keep you posted if I find anything related to the exporter itself.

Thank you

danihodovic commented 9 months ago

Good luck!

Anghille commented 9 months ago

Hey. I still can't figure out how to get the metrics. My question is this: if I can see the gauge metrics from the exporter, but none of the event-related counter metrics, does that mean my Celery configuration is missing something?

Can you give me an example of how you have set up your exporter and your Celery configs / deployments, so I can check what I might be missing?

Cheers
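
One way to check this independently of the exporter is to listen to the Celery event stream directly and count which event types arrive. The sketch below assumes Celery is installed and that broker_url points at the same RabbitMQ instance the workers use; make_handler, has_task_events and capture are hypothetical helper names, not part of celery-exporter.

```python
from collections import Counter


def make_handler(counts):
    """Return an event handler that tallies events by their 'type' field."""
    def handler(event):
        counts[event.get("type", "unknown")] += 1
    return handler


def has_task_events(counts):
    """True if at least one task-* event (task-sent, task-received, ...) was seen."""
    return any(name.startswith("task-") for name in counts)


def capture(broker_url, limit=60):
    """Listen on the Celery event stream and count event types.

    `limit` bounds the receive loop so that the sketch terminates.
    """
    # Imported lazily so the helpers above can be used without Celery installed.
    from celery import Celery

    app = Celery(broker=broker_url)
    counts = Counter()
    with app.connection() as connection:
        # "*" registers a catch-all handler for every event type.
        receiver = app.events.Receiver(
            connection, handlers={"*": make_handler(counts)}
        )
        receiver.capture(limit=limit, timeout=None, wakeup=True)
    return counts


# Example (needs a reachable broker; the URL is a placeholder):
#   counts = capture("amqp://guest:guest@localhost:5672//")
#   print(dict(counts), has_task_events(counts))
```

If only worker-heartbeat events are counted while tasks are running, the task events are not being published at all, which points at the Celery application rather than at the exporter.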

danihodovic commented 9 months ago

Are you sure that your tasks are executed by a worker node? Can you confirm via the Celery result backend that this is indeed the case?

Anghille commented 9 months ago

This is the case :) I have results showing up in the logs, and they are written to my DB :(

danihodovic commented 9 months ago

Does the --queues option for celery-exporter match the queues for your worker?

https://github.com/danihodovic/celery-exporter/blob/master/src/cli.py#L102
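
A quick way to compare the two sides is to ask the running workers which queues they consume from and diff that against the names passed to the exporter. This is only a sketch: it assumes Celery is installed and the workers respond to remote control commands, and worker_queue_names, missing_from_exporter and fetch_active_queues are hypothetical helper names.

```python
def worker_queue_names(active_queues):
    """Flatten the mapping returned by app.control.inspect().active_queues().

    The mapping has the shape {worker_name: [{"name": queue_name, ...}, ...]}
    and is None when no worker replies.
    """
    names = set()
    for queues in (active_queues or {}).values():
        for queue in queues:
            names.add(queue["name"])
    return names


def missing_from_exporter(active_queues, exporter_queues):
    """Queues that workers consume from but the exporter was not told about."""
    return sorted(worker_queue_names(active_queues) - set(exporter_queues))


def fetch_active_queues(broker_url):
    """Ask the running workers for their queues (needs a reachable broker)."""
    # Imported lazily so the helpers above can be used without Celery installed.
    from celery import Celery

    app = Celery(broker=broker_url)
    return app.control.inspect().active_queues()
```

For example, missing_from_exporter(fetch_active_queues(broker_url), ["my-queue-name"]) returning an empty list would suggest the queue names line up.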

Anghille commented 9 months ago

It does as well :) I tried specifying specific queues, and also leaving the option empty so the exporter auto-detects them itself. I still have no task events whatsoever :(

Anghille commented 9 months ago

I have FINALLY figured it out. The issue comes from using the celery-batches module to process multiple tasks at once. I will try to open a PR on their repository to fix this.

I tried your exporter with a normal celery job and it works like a charm :)

adinhodovic commented 9 months ago

Seems resolved!