fedora-infra / datagrepper

HTTP API for datanommer and the fedmsg bus
https://apps.fedoraproject.org/datagrepper/
GNU General Public License v2.0

datagrepper stats - DB size etc. #211

Open abitrolly opened 6 years ago

abitrolly commented 6 years ago

What is the number 96,756,600 on the front page https://apps.fedoraproject.org/datagrepper/ ?

https://github.com/fedora-infra/datagrepper/blob/3a34025603a93f81ccce4e02d56a422bc11f27f5/datagrepper/templates/index.html#L70-L74

Total of what? A tooltip explaining it would be helpful.


And now the real question: is it possible to expose stats/metrics about the fedmsg DB? Like size growth over time and current size per topic and per rate of messages per second. Maybe delays in delivering messages to subscribers and count of dropped messages per subscriber (detect dead) and topic (popularity). The goal is to judge, from a real working example, whether this architecture holds up, i.e. whether this stack is better than the more heavyweight Kafka/Zookeeper/...

pypingou commented 6 years ago

The number on the front page is the total number of fedmsg messages recorded.

> Like size growth over time and current size per topic and per rate of messages per second.

This is doable, and you can do it yourself with the DB dump at https://infrastructure.fedoraproject.org/infra/db-dumps/datanommer.dump.xz (refreshed daily). Just beware that this DB is pretty huge so prepare some disk space :)
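As a rough sketch of the kind of stats asked about: assuming you have already pulled (timestamp, topic) pairs out of the dump, per-topic counts and an average message rate could be computed as below. The row shape and the `topic_stats` helper are hypothetical, not the datanommer schema, and the sample data is made up.

```python
from collections import Counter
from datetime import datetime, timezone


def topic_stats(rows):
    """Return (per-topic counts, average messages per second).

    `rows` is a hypothetical iterable of (timestamp, topic) pairs; how they
    are extracted from the dump is not shown here.
    """
    rows = list(rows)
    counts = Counter(topic for _, topic in rows)
    if len(rows) < 2:
        return counts, 0.0
    times = [ts for ts, _ in rows]
    # Time covered by the sample, from the oldest to the newest message.
    span = (max(times) - min(times)).total_seconds()
    rate = len(rows) / span if span > 0 else 0.0
    return counts, rate


# Made-up sample data, only to show the call shape.
sample = [
    (datetime(2018, 1, 1, 0, 0, 0, tzinfo=timezone.utc), "org.example.build.start"),
    (datetime(2018, 1, 1, 0, 0, 5, tzinfo=timezone.utc), "org.example.build.end"),
    (datetime(2018, 1, 1, 0, 0, 10, tzinfo=timezone.utc), "org.example.build.start"),
]
counts, rate = topic_stats(sample)
print(counts.most_common(), rate)
```

Running the same function over successive time windows would give growth over time; size in bytes per topic would need the message bodies as well, which this sketch ignores.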

pypingou commented 6 years ago

> Maybe delays in delivering messages to subscribers and count of dropped messages per subscriber (detect dead) and topic (popularity)

Delays in delivering and dropped messages won't be computable as the messages are sent in a pub-sub mode (i.e. fire and forget); there is no central broker. If nobody listens to your service, your service won't resend the messages.
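To make "fire and forget" concrete, here is a toy in-process sketch. It is not fedmsg's API; `FireAndForgetBus` and the topic name are invented for illustration. The point is only that nothing is stored or retried, so a subscriber that connects late never sees earlier messages.

```python
class FireAndForgetBus:
    """Toy in-process bus: no broker, no queue, no retries (illustration only)."""

    def __init__(self):
        self.subscribers = {}  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self.subscribers.setdefault(topic, []).append(callback)

    def publish(self, topic, message):
        # Deliver to whoever is listening right now; nothing is kept for later.
        listeners = self.subscribers.get(topic, [])
        for callback in listeners:
            callback(message)
        # Returned only for the demo; a real publisher would not know this.
        return len(listeners)


bus = FireAndForgetBus()
received = []

delivered_before = bus.publish("demo.topic", "first")  # nobody is listening yet
bus.subscribe("demo.topic", received.append)
delivered_after = bus.publish("demo.topic", "second")

print(received)
```

Since the publisher keeps no record of who should have received "first", there is nothing to compute a delay or a drop count from.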

abitrolly commented 6 years ago

> DB is pretty huge so prepare some disk space :)

21GB .xz OMG. =)

> Delays in delivering and dropped messages won't be computable as the messages are sent in a pub-sub mode (i.e. fire and forget); there is no central broker. If nobody listens to your service, your service won't resend the messages.

That explains a lot. I think this paragraph should be on the front page http://www.fedmsg.com/en/stable/ right after the usage example. So datagrepper is the thing that lets you fill in the gaps if a service goes down for some reason. Seems quite critical.
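For what filling in the gaps might look like, here is a minimal sketch that builds a catch-up query for the window a consumer was down. The `/raw` endpoint and the `start`, `end`, `topic`, `rows_per_page` and `order` parameters are my assumptions about the datagrepper HTTP API and should be checked against its documentation; `catch_up_url` is a hypothetical helper and the topic is made up.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

# Assumed endpoint; verify against the datagrepper documentation.
BASE_URL = "https://apps.fedoraproject.org/datagrepper/raw"


def catch_up_url(down_since, back_at, topic=None, rows_per_page=100):
    """Build a query URL for messages recorded while a consumer was offline."""
    params = {
        "start": int(down_since.timestamp()),  # assumed: seconds since the epoch
        "end": int(back_at.timestamp()),
        "rows_per_page": rows_per_page,
        "order": "asc",  # oldest first, so messages replay in order
    }
    if topic is not None:
        params["topic"] = topic
    return f"{BASE_URL}?{urlencode(params)}"


url = catch_up_url(
    datetime(2018, 1, 1, 0, 0, tzinfo=timezone.utc),
    datetime(2018, 1, 1, 1, 0, tzinfo=timezone.utc),
    topic="org.example.demo",
)
print(url)
```

The sketch only builds the URL; fetching it and walking through result pages is left out.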

pypingou commented 6 years ago

I seem to remember this being there at one point, but it seems it no longer is :(