catalyst-cooperative / pudl-usage-metrics

A dagster ETL for collecting and cleaning PUDL usage metrics.
MIT License

ETL superset usage metrics #179

Open bendnorman opened 2 weeks ago

bendnorman commented 2 weeks ago

ETL Superset logs so we can understand who is using it and how!

There are three sources of Superset information: 1) the Superset database, 2) the Cloud logs produced by Cloud Run, and 3) the Cloud Monitoring metrics of the Cloud Run instance.

I think the database will provide valuable information like users, queries and logs. This comment has some more information. The Cloud logs will likely have everything in the logs table, plus some additional Cloud Run operational information that could be helpful for monitoring the performance of specific operations. The Cloud Monitoring metrics would be helpful for monitoring the overall performance of the deployment. Monitoring metrics are currently retained for 6 weeks, so we'd need to archive them.
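To make the retention constraint concrete, here is a rough sketch of how an archive job could pick the time windows it still needs to pull. It is not code from this repo: `archive_windows`, `last_archived`, and the daily step are hypothetical names and choices, and the date is an arbitrary example. It only plans the windows and assumes the actual metric export happens elsewhere.

```python
from datetime import datetime, timedelta, timezone

# Retention period mentioned above; everything else here is an assumption.
RETENTION = timedelta(weeks=6)


def archive_windows(last_archived, now, step=timedelta(days=1)):
    """Yield (start, end) windows that still need archiving.

    Windows begin at the later of `last_archived` and the oldest point
    still inside the retention period, so expired data is never requested.
    """
    start = max(last_archived, now - RETENTION)
    while start < now:
        end = min(start + step, now)
        yield start, end
        start = end


# Arbitrary example: the last archive ran three days ago.
now = datetime(2024, 10, 1, tzinfo=timezone.utc)
windows = list(archive_windows(now - timedelta(days=3), now))
print(len(windows))  # 3
```

If the job falls more than 6 weeks behind, the first window starts at `now - RETENTION`, which is a reminder that anything older is already gone.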

I could see a few different phases for this work:

Phase I

Phase II

Phase III

### Next steps
* [ ] ...
bendnorman commented 2 weeks ago

Also, @e-belfer I connected the superset database to superset so you can explore the data.