cargo-bins / cargo-quickinstall

pre-compiled binary packages for `cargo install`
Apache License 2.0

think about adding monitoring to the stats server #278

Open alsuren opened 2 months ago

alsuren commented 2 months ago
I think we could use tracing + tracing-subscriber + tracing-appender for logging.

tracing is like the log crate, while tracing-appender provides a non-blocking writer implementation (so that logging doesn't block the web server), and tracing-subscriber provides the logging implementation.

tracing-subscriber can also print JSON to make the logs machine-readable.

_Originally posted by @NobodyXu in https://github.com/cargo-bins/cargo-quickinstall/pull/165#discussion_r1749180726_

It might be that we run with it as it is for a while and decide that it's fine as it is, but it feels like everyone is using the tracing crate these days, so I would be interested in seeing how it works for us.
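For anyone wondering what "non-blocking writer" means in practice, here is a minimal std-only sketch of the general idea: request handlers push log lines onto a channel, and a background thread does the actual I/O. This is not tracing-appender's implementation or API. The names `NonBlockingLog` and `spawn_writer` are made up for illustration, and the unbounded channel is a simplification.

```rust
use std::io::Write;
use std::sync::mpsc::{self, Sender};
use std::thread::{self, JoinHandle};

/// Hypothetical handle given to request handlers.
/// Sending a line only enqueues it; it never waits on I/O.
#[derive(Clone)]
pub struct NonBlockingLog {
    tx: Sender<String>,
}

impl NonBlockingLog {
    /// Queue a line for the background writer. If the writer thread has
    /// already exited, the line is silently dropped.
    pub fn log(&self, line: &str) {
        let _ = self.tx.send(line.to_owned());
    }
}

/// Spawn a writer thread that owns `sink` and drains the channel.
/// The join handle yields the sink back once every `NonBlockingLog`
/// clone has been dropped and the queue is empty.
pub fn spawn_writer<W>(mut sink: W) -> (NonBlockingLog, JoinHandle<W>)
where
    W: Write + Send + 'static,
{
    let (tx, rx) = mpsc::channel::<String>();
    let worker = thread::spawn(move || {
        // Iterating the receiver ends when all senders are dropped.
        for line in rx {
            // Write errors are ignored in this sketch.
            let _ = writeln!(sink, "{line}");
        }
        let _ = sink.flush();
        sink
    });
    (NonBlockingLog { tx }, worker)
}
```

In a server, the sink would be something like stdout or a file, and the join handle would be kept alive until shutdown so that queued lines get flushed.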

alsuren commented 1 month ago

Leaving a note here because it feels related, and probably doesn't need its own issue:

fly.io has pretty good monitoring out of the box (Prometheus + Grafana). I noticed that our response times were pretty bad (median never below 100ms, p99 up to 500ms at some timescales).
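To pin down what "median" and "p99" mean here, this is a small sketch of a nearest-rank percentile over raw latency samples. The dashboards will compute their figures from aggregated metrics rather than like this, so treat it as a definition aid only. The function name `percentile` and the sample values in the usage note are made up.

```rust
/// Nearest-rank percentile of `samples` (for example, latencies in ms).
/// `p` is a whole percentage in 0..=100. Returns `None` for an empty
/// slice or an out-of-range `p`.
pub fn percentile(samples: &[u64], p: usize) -> Option<u64> {
    if samples.is_empty() || p > 100 {
        return None;
    }
    let mut sorted = samples.to_vec();
    sorted.sort_unstable();
    // rank = ceil(p * n / 100), computed with integer arithmetic.
    let rank = (p * sorted.len() + 99) / 100;
    // Ranks are 1-based; p = 0 maps to the smallest sample.
    Some(sorted[rank.max(1) - 1])
}
```

For the made-up samples `[15, 20, 100, 120, 500]`, `percentile(&samples, 50)` gives the median and `percentile(&samples, 99)` gives the p99, which here is the single slowest request.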

I realised that the stats server is deployed in fly.io's lhr region (named after London Heathrow - they all seem to be named after the nearest airport or something?). This was fine on the old influxdb cloud instance, but the new one is in us-east-1.

I ran the following commands:

fly scale count 1 --region=iad
fly scale count 0 --region=lhr

This was based on this forum post: https://community.fly.io/t/change-region-for-an-app/18888. iad is the closest region to us-east-1 - https://fly.io/docs/reference/regions/

Looking at the graphs again, this has brought things more under control (15ms to 200ms).

[Image: Screenshot 2024-09-22 at 16 24 54]

-- https://fly-metrics.net/d/fly-app/fly-app?from=1727002324300&to=1727031124300&var-app=cargo-quickinstall-stats-server&orgId=179329&viewPanel=13 (shout if you want access)

I'll take another glance at it later today to make sure it stays that way.