processone / ejabberd

Robust, Ubiquitous and Massively Scalable Messaging Platform (XMPP, MQTT, SIP Server)
https://www.process-one.net/en/ejabberd/

Prometheus support #4238

Closed. pouriya closed this 2 days ago

pouriya commented 1 week ago

Example config

# ...
  -
    port: 5280
    ip: "::"
    module: ejabberd_http
    request_handlers:
      /metrics: mod_prometheus
# ...

modules:
  # ...
  mod_prometheus:
    mnesia: true
    vm:
      memory: true
      system_info: true
      statistics: false
      distribution: false
      microstate_accounting: false
    hooks:
      # Histogram for a hook:
      - hook: user_send_packet
        type: histogram
        help: "Handling of sent messages duration in millisecond"
        stanza_label: true
        host_label: true
      # Counter for a hook:
      - hook: user_send_packet
        type: counter
        help: "Number of sent messages"
      # Histograms only for some callbacks of a hook:
      - hook: user_send_packet
        type: histogram
        host_label: true
        collect:
          - module: mod_carboncopy
            function: user_send_packet
            help: "Handling of carbon copied messages in millisecond"
          - module: mod_mam
            function: user_send_packet
            help: "Handling of MAM messages in millisecond"
            buckets:
              - 10
              - 100
              - 750
              - 1000
              - 1500

Example output (after sending some messages)

curl http://127.0.0.1:5280/metrics
...
# TYPE user_send_packet_mod_mam_duration_milliseconds histogram
# HELP user_send_packet_mod_mam_duration_milliseconds Handling of MAM messages in millisecond
user_send_packet_mod_mam_duration_milliseconds_bucket{host="localhost",le="10"} 43
user_send_packet_mod_mam_duration_milliseconds_bucket{host="localhost",le="100"} 43
user_send_packet_mod_mam_duration_milliseconds_bucket{host="localhost",le="750"} 43
user_send_packet_mod_mam_duration_milliseconds_bucket{host="localhost",le="1000"} 43
user_send_packet_mod_mam_duration_milliseconds_bucket{host="localhost",le="1500"} 43
user_send_packet_mod_mam_duration_milliseconds_bucket{host="localhost",le="+Inf"} 43
user_send_packet_mod_mam_duration_milliseconds_count{host="localhost"} 43
user_send_packet_mod_mam_duration_milliseconds_sum{host="localhost"} 9.0e-6
# TYPE user_send_packet_mod_carboncopy_duration_milliseconds histogram
# HELP user_send_packet_mod_carboncopy_duration_milliseconds Handling of carbon copied messages in millisecond
user_send_packet_mod_carboncopy_duration_milliseconds_bucket{host="localhost",le="1"} 43
user_send_packet_mod_carboncopy_duration_milliseconds_bucket{host="localhost",le="10"} 43
user_send_packet_mod_carboncopy_duration_milliseconds_bucket{host="localhost",le="100"} 43
user_send_packet_mod_carboncopy_duration_milliseconds_bucket{host="localhost",le="500"} 43
user_send_packet_mod_carboncopy_duration_milliseconds_bucket{host="localhost",le="750"} 43
user_send_packet_mod_carboncopy_duration_milliseconds_bucket{host="localhost",le="1000"} 43
user_send_packet_mod_carboncopy_duration_milliseconds_bucket{host="localhost",le="3000"} 43
user_send_packet_mod_carboncopy_duration_milliseconds_bucket{host="localhost",le="5000"} 43
user_send_packet_mod_carboncopy_duration_milliseconds_bucket{host="localhost",le="+Inf"} 43
user_send_packet_mod_carboncopy_duration_milliseconds_count{host="localhost"} 43
user_send_packet_mod_carboncopy_duration_milliseconds_sum{host="localhost"} 3.0e-6
# TYPE user_send_packet_duration_milliseconds histogram
# HELP user_send_packet_duration_milliseconds Handling of sent messages duration in millisecond
user_send_packet_duration_milliseconds_bucket{stanza="presence",host="localhost",le="1"} 1
user_send_packet_duration_milliseconds_bucket{stanza="presence",host="localhost",le="10"} 1
user_send_packet_duration_milliseconds_bucket{stanza="presence",host="localhost",le="100"} 1
user_send_packet_duration_milliseconds_bucket{stanza="presence",host="localhost",le="500"} 1
user_send_packet_duration_milliseconds_bucket{stanza="presence",host="localhost",le="750"} 1
user_send_packet_duration_milliseconds_bucket{stanza="presence",host="localhost",le="1000"} 1
user_send_packet_duration_milliseconds_bucket{stanza="presence",host="localhost",le="3000"} 1
user_send_packet_duration_milliseconds_bucket{stanza="presence",host="localhost",le="5000"} 1
user_send_packet_duration_milliseconds_bucket{stanza="presence",host="localhost",le="+Inf"} 1
user_send_packet_duration_milliseconds_count{stanza="presence",host="localhost"} 1
user_send_packet_duration_milliseconds_sum{stanza="presence",host="localhost"} 0.0
user_send_packet_duration_milliseconds_bucket{stanza="message",host="localhost",le="1"} 35
user_send_packet_duration_milliseconds_bucket{stanza="message",host="localhost",le="10"} 35
user_send_packet_duration_milliseconds_bucket{stanza="message",host="localhost",le="100"} 35
user_send_packet_duration_milliseconds_bucket{stanza="message",host="localhost",le="500"} 35
user_send_packet_duration_milliseconds_bucket{stanza="message",host="localhost",le="750"} 35
user_send_packet_duration_milliseconds_bucket{stanza="message",host="localhost",le="1000"} 35
user_send_packet_duration_milliseconds_bucket{stanza="message",host="localhost",le="3000"} 35
user_send_packet_duration_milliseconds_bucket{stanza="message",host="localhost",le="5000"} 35
user_send_packet_duration_milliseconds_bucket{stanza="message",host="localhost",le="+Inf"} 35
user_send_packet_duration_milliseconds_count{stanza="message",host="localhost"} 35
user_send_packet_duration_milliseconds_sum{stanza="message",host="localhost"} 1.3e-5
user_send_packet_duration_milliseconds_bucket{stanza="iq",host="localhost",le="1"} 7
user_send_packet_duration_milliseconds_bucket{stanza="iq",host="localhost",le="10"} 7
user_send_packet_duration_milliseconds_bucket{stanza="iq",host="localhost",le="100"} 7
user_send_packet_duration_milliseconds_bucket{stanza="iq",host="localhost",le="500"} 7
user_send_packet_duration_milliseconds_bucket{stanza="iq",host="localhost",le="750"} 7
user_send_packet_duration_milliseconds_bucket{stanza="iq",host="localhost",le="1000"} 7
user_send_packet_duration_milliseconds_bucket{stanza="iq",host="localhost",le="3000"} 7
user_send_packet_duration_milliseconds_bucket{stanza="iq",host="localhost",le="5000"} 7
user_send_packet_duration_milliseconds_bucket{stanza="iq",host="localhost",le="+Inf"} 7
user_send_packet_duration_milliseconds_count{stanza="iq",host="localhost"} 7
user_send_packet_duration_milliseconds_sum{stanza="iq",host="localhost"} 0.0
# TYPE user_send_packet_total counter
# HELP user_send_packet_total Number of sent messages
user_send_packet_total 43
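
To read output like this from a script, one option is to derive a mean duration per series from the `_sum` and `_count` samples. The following is a minimal sketch, not part of mod_prometheus: `parse_samples` and `mean_durations` are hypothetical helper names, and the regular expressions only cover the simple label syntax seen in this output (no escaped quotes inside label values).

```python
import re

# Matches sample lines such as: name{label="v",other="w"} 43
SAMPLE_RE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
LABEL_RE = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="([^"]*)"')


def parse_samples(text):
    """Yield (metric_name, labels_dict, value) for each sample line."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and TYPE/HELP comments
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # not a sample line (for example a "..." elision)
        name, raw_labels, raw_value = match.groups()
        try:
            value = float(raw_value)
        except ValueError:
            continue  # last field is not a number, so not a sample
        yield name, dict(LABEL_RE.findall(raw_labels or "")), value


def mean_durations(text):
    """Return {(base_name, sorted_label_items): sum / count} per series."""
    sums, counts = {}, {}
    for name, labels, value in parse_samples(text):
        label_key = tuple(sorted(labels.items()))
        if name.endswith("_sum"):
            sums[(name[:-len("_sum")], label_key)] = value
        elif name.endswith("_count"):
            counts[(name[:-len("_count")], label_key)] = value
    # Series with a zero or missing count are left out to avoid dividing by 0.
    return {key: sums[key] / counts[key] for key in sums if counts.get(key)}


# Three lines copied from the output above.
SAMPLE = '''
user_send_packet_mod_mam_duration_milliseconds_bucket{host="localhost",le="+Inf"} 43
user_send_packet_mod_mam_duration_milliseconds_count{host="localhost"} 43
user_send_packet_mod_mam_duration_milliseconds_sum{host="localhost"} 9.0e-6
'''

for (name, labels), mean in mean_durations(SAMPLE).items():
    print(name, dict(labels), mean)
```

Bucket lines are deliberately ignored here, since the mean needs only the sum and the count. The bucket values are cumulative, which is why every `le` bucket in the output shows the same number as `_count`.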

Prometheus screenshot

[Screenshot, 2024-06-25: Prometheus Time Series Collection and Processing Server]

pouriya commented 1 week ago

I will fix the Dialyzer warnings etc. after review.

pouriya commented 6 days ago

I'm going to change the subscriber API and leave run and run_fold untouched.

prefiks commented 6 days ago

I think you might not even need any changes in ejabberd_hooks: you will need to add two hooks with low and high priority, to get notified at start and end of hook. ejabberd_hooks:add/remove accept closures, so you could generate the required functions on demand. Your function can return the 'EXIT' atom so that its output is ignored. The only issue will be the arity of those functions, as you will not receive the hook arguments as a list, but as multiple arguments to those functions.
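
The bracketing idea described above can be illustrated with a toy model. This is a sketch in Python rather than Erlang and does not use ejabberd's API: `HookChain` and `make_timer` are hypothetical names, and "low and high priority" is modelled as the smallest and largest sequence numbers in the chain.

```python
import time


class HookChain:
    """Toy stand-in for one hook: callbacks run in ascending sequence order."""

    def __init__(self):
        self._callbacks = []  # list of (seq, function)

    def add(self, seq, function):
        self._callbacks.append((seq, function))
        self._callbacks.sort(key=lambda item: item[0])

    def run(self, *args):
        # Arguments are passed positionally, not as a list, which is the
        # arity concern raised in the comment above.
        for _seq, function in self._callbacks:
            function(*args)


def make_timer(observe, clock=time.monotonic):
    """Build two closures that bracket a chain run and report milliseconds."""
    state = {}

    def on_start(*_args):
        state["t0"] = clock()

    def on_end(*_args):
        observe((clock() - state.pop("t0")) * 1000.0)

    return on_start, on_end


durations = []
chain = HookChain()
on_start, on_end = make_timer(durations.append)
chain.add(0, on_start)                # runs first
chain.add(50, lambda packet: None)    # the handler being measured
chain.add(100, on_end)                # runs last
chain.run({"stanza": "message"})
print(len(durations))
```

The model leaves out two things that matter in a real server: hooks run concurrently in many processes, so the shared `state` dict would need to be per-process, and a callback that stops the chain early would prevent the closing callback from running at all.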

pouriya commented 6 days ago

you will need to add two hooks with low and high priority, to get notified at start and end of hook.

@prefiks Yes. But what if we disable the module (M:F) we're monitoring? There is no way to know whether ejabberd_hooks called a specific M:F or not. With these changes, if a hook has subscribers, they will know exactly which M:F ejabberd_hooks runs.
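
A toy model of the subscriber approach makes the difference concrete. It is written in Python, not Erlang, and does not reproduce the API in this PR: the class `SubscribableHook`, the event names `before_callback` and `after_callback`, and the subscriber signature are all assumptions made for illustration.

```python
import time


class SubscribableHook:
    """Toy hook runner that tells subscribers which callback it is running."""

    def __init__(self):
        self._callbacks = []    # list of (seq, module, function_name, callable)
        self._subscribers = []

    def add(self, seq, module, function_name, fn):
        self._callbacks.append((seq, module, function_name, fn))

    def subscribe(self, subscriber):
        self._subscribers.append(subscriber)

    def run(self, *args):
        for _seq, module, name, fn in sorted(self._callbacks, key=lambda c: c[0]):
            for subscriber in self._subscribers:
                subscriber("before_callback", module, name)
            fn(*args)
            for subscriber in self._subscribers:
                subscriber("after_callback", module, name)


def make_duration_subscriber(durations, clock=time.monotonic):
    """Record milliseconds spent in each (module, function) that really ran."""
    started = {}

    def subscriber(event, module, function_name):
        key = (module, function_name)
        if event == "before_callback":
            started[key] = clock()
        elif event == "after_callback":
            elapsed_ms = (clock() - started.pop(key)) * 1000.0
            durations.setdefault(key, []).append(elapsed_ms)

    return subscriber


durations = {}
hook = SubscribableHook()
hook.subscribe(make_duration_subscriber(durations))
hook.add(50, "mod_mam", "user_send_packet", lambda packet: None)
hook.add(60, "mod_carboncopy", "user_send_packet", lambda packet: None)
hook.run({"stanza": "message"})
print(sorted(durations))
```

Because the runner itself emits the events, a module that is disabled simply produces no entries, and the subscriber never has to guess whether a given M:F was called.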

pouriya commented 6 days ago

I force-pushed the changes and left the original run and run_fold untouched.

pouriya commented 6 days ago

We could also have a configuration parameter to expose module names as labels. For example:

modules:
  # ...
  mod_prometheus:
    hooks:
      # Histogram for a hook:
      - hook: user_send_packet
        type: histogram
        help: "Handling of sent messages duration in millisecond"
        collect: modules   # HERE

Output:

...
user_send_packet_duration_milliseconds_bucket{module="mod_mam",le="10"} 43
user_send_packet_duration_milliseconds_bucket{module="mod_carboncopy",le="10"} 43
user_send_packet_duration_milliseconds_bucket{module="mod_foo",le="10"} 43
user_send_packet_duration_milliseconds_bucket{module="mod_bar",le="10"} 43
...

Any idea?

pouriya commented 6 days ago

Seems like we don't have any Dialyzer issues anymore :-)

Neustradamus commented 6 days ago

@pouriya: Nice PR, good job with ejabberd team!

mremond commented 3 days ago

I think this should be added to the ejabberd-contrib repository: https://github.com/processone/ejabberd-contrib

pouriya commented 3 days ago

@mremond

I think this should be added to the ejabberd-contrib repository

  1. We need the hooks subscriber API for this module.
  2. We need rebar3 to compile prometheus and its deps. Is it possible to add it to ejabberd-contrib and compile it via rebar3? I think ejabberd compiles those modules directly via the compiler app.

badlop commented 3 days ago

Yes :) I have the fixes in ejabberd that will allow this almost ready.

I'll complete testing tomorrow and show how to accomplish it.

pouriya commented 3 days ago

@badlop @mremond So we're ready to merge the hook subscriber commit here. Am I right?

badlop commented 3 days ago

So we're ready to merge the hook subscriber commit here. Am I right?

Mostly yes :) Do you still have any improvements or fixes planned for ejabberd_hooks.erl?

pouriya commented 3 days ago

@badlop I don't think so... I'll see what can be done before the force push. On it.

pouriya commented 3 days ago

@badlop I force-pushed just the ejabberd_hooks changes. When will we be ready to accept Prometheus in ejabberd-contrib?

coveralls commented 3 days ago

Coverage Status

coverage: 32.064% (-0.07%) from 32.13% when pulling e5ed0bcc2dbab2e5dcba81ce364ecd1e00b9e409 on pouriya:prometheus into 3124644315d028909552873d28c58cec2cfa9a8d on processone:master.

badlop commented 2 days ago

I've merged this PR with the "hooks subscription" feature.

And I've added a new directory in ejabberd-contrib with everything prepared for your mod_prometheus: https://github.com/processone/ejabberd-contrib/tree/master/mod_prometheus

Now you can create a src/ directory there, copy your *.erl files, fix anything you want in the documentation and example files, and submit a PR.

pouriya commented 2 days ago

@badlop Thanks!