2i2c-org / infrastructure

Infrastructure for configuring and deploying our community JupyterHubs.
https://infrastructure.2i2c.org
BSD 3-Clause "New" or "Revised" License

Store Monthly, Weekly, and Daily active users for each cluster and hub in a persistent space #3917

Open choldgraf opened 7 months ago

choldgraf commented 7 months ago

We use Prometheus to generate data about hubs in a way that can be accessed via Grafana.

There are two challenges with this:

  1. If a cluster is decommissioned, then we lose the historical data that it generated. This makes it unreliable to count on this data for long-term trend reporting.
  2. Accessing the data requires understanding PromQL and writing each query by hand. This is complex to maintain.
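To make the second point concrete, here is a rough sketch of what fetching a single value by hand involves. It assumes the hubs export JupyterHub's `jupyterhub_active_users` gauge with a `period` label, and that each hub can be selected by a `namespace` label; both should be checked against what a given cluster actually exposes. The function name, Prometheus URL, and hub name are hypothetical.

```python
from urllib.parse import urlencode

# Assumption: `jupyterhub_active_users` carries a `period` label with these values.
PERIODS = {"daily": "24h", "weekly": "7d", "monthly": "30d"}


def active_users_query_url(prometheus_url: str, hub_namespace: str, window: str) -> str:
    """Build an instant-query URL for the Prometheus HTTP API (/api/v1/query)."""
    period = PERIODS[window]
    # Assumption: each hub is identified by the `namespace` label on the metric.
    promql = (
        f'max(jupyterhub_active_users{{period="{period}", '
        f'namespace="{hub_namespace}"}})'
    )
    # urlencode escapes the braces, quotes, and spaces in the PromQL expression.
    return f"{prometheus_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


# Hypothetical endpoint and hub name, for illustration only.
url = active_users_query_url("https://prometheus.example.org", "staging", "monthly")
print(url)
```

Every hub, cluster, and time window needs its own variation of this query, which is the maintenance burden described above.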

For metrics that we find particularly useful in week-to-week actions, it would help if we stored them in an easily accessible place that persisted over time.

Here are some metrics that would be useful. Others can chime in if there's something I'm missing:

  1. Monthly Active Users per hub
  2. Weekly Active Users per hub
  3. Daily Active Users per hub
  4. Hub name / ID
  5. Cluster name / ID
  6. Ideally, any field that we can use to link a hub to its community as stored in AirTable.
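As a starting point for discussion, the fields above could be captured as one row per hub per day and appended to a file that lives outside any cluster. This is only a sketch: the class name, field names, and the choice of CSV are assumptions, and `community_id` is a hypothetical stand-in for whatever key ends up linking a hub to its community record.

```python
import csv
from dataclasses import asdict, dataclass, fields
from pathlib import Path


@dataclass
class ActiveUsersSnapshot:
    snapshot_date: str  # ISO date the values were recorded
    cluster: str
    hub: str
    community_id: str  # hypothetical linking key, not an existing field
    daily_active_users: int
    weekly_active_users: int
    monthly_active_users: int


def append_snapshots(path: Path, rows: list[ActiveUsersSnapshot]) -> None:
    """Append rows to a CSV file, writing the header only for a new or empty file."""
    is_new = not path.exists() or path.stat().st_size == 0
    names = [f.name for f in fields(ActiveUsersSnapshot)]
    with path.open("a", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=names)
        if is_new:
            writer.writeheader()
        writer.writerows(asdict(row) for row in rows)
```

Because rows are only ever appended, the history for a hub would remain in the file after its cluster is decommissioned.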

Related issues

We've implemented the ability to read Prometheus data and visualize it via these two PRs:

However, this still depends on each cluster providing its own historical data, which will be lost if the cluster is decommissioned, so it does not resolve this issue.

choldgraf commented 6 months ago

I think we can consider this one completed now that these two PRs have been merged:

yuvipanda commented 6 months ago

I'm going to keep this one open, as I see it as primarily about keeping the data long term. Right now, this data only goes back one year, and it will disappear once we decommission the cluster it comes from.

choldgraf commented 6 months ago

Ah cool - I will add that context to the issue body.