storacha-network / project-tracking

🐾 Used as central/default repo for project management, backlog, etc.

Business metrics/KPIs #67

Closed reidlw closed 1 month ago

reidlw commented 2 months ago

Tasks:

MF416 commented 2 months ago

Is it worth specifying how often these need to be refreshed? In the context of customer / business data, weekly granularity is probably sufficient for now.

reidlw commented 2 months ago

@MF416 this should almost all be real-time data

heyjay44 commented 2 months ago

P0 metrics issue: https://github.com/w3s-project/nonpublic/issues/6

prodalex commented 2 months ago

P0 KPI Requirements

Goal

Understand the cost and revenue drivers from a business perspective as well as the ability to report to investors:

Requirements

Old web3.storage (once-off analysis)

W3up Weekly Reports

Users

Usage

Requests

W3up Monthly Reports

Revenue

Non-Functional Requirements

Out of Scope in P0

prodalex commented 2 months ago

@travis Pls review!

travis commented 2 months ago

Thanks @prodalex! I copied this to a Google Doc to make iteration easier - left you a bunch of questions there:

https://docs.google.com/document/d/1jK-FwpO_dWVx2oCHLwJT67yvZM6ZnkfTKWMb2LqE-gM/edit

travis commented 2 months ago

> @MF416 this should almost all be real-time data

@reidlw I don't think this is accurate! It sounds like the biz folks don't need more than weekly granularity for now, and many of these stats will be difficult to get in real time - especially the Stripe data and anything that depends on "badbits".

travis commented 2 months ago

ok I've distilled the discussion in https://docs.google.com/document/d/1jK-FwpO_dWVx2oCHLwJT67yvZM6ZnkfTKWMb2LqE-gM/edit down to the following tasks:

  1. one-off report for old system
    1. Total monthly egress (last 6 months)
    2. Total data stored
    3. Top 10 customers (storage, egress)
    4. Total number of customers
  2. set up a weekly job that pulls data from Stripe and dumps it to Athena
  3. figure out how to get badbits list into a place Athena can read
  4. prototype "active user" report and plan further iteration
  5. support @prodalex, @MF416 and @heyjay44 in putting together the various Athena queries and Quicksight dashboards

(1) is going to be a one-off. I can probably handle it, but it would likely be faster for @alanshaw or someone else who's more familiar with where all the data there ends up and how it gets there to do it.

(2) sounds pretty straightforward: we need to get details from @alanshaw on what he currently does and set up a cronjob somewhere to do it weekly (https://docs.google.com/document/d/1jK-FwpO_dWVx2oCHLwJT67yvZM6ZnkfTKWMb2LqE-gM/edit?disco=AAABK3hDqzU).

(3) sounds like a research project. @vasco-santos had some thoughts in the doc on why this might be hard (https://docs.google.com/document/d/1jK-FwpO_dWVx2oCHLwJT67yvZM6ZnkfTKWMb2LqE-gM/edit?disco=AAABK3hDq6E), so we need to spend some time prototyping and then iterate based on what we find out.

(4) is also a research project: we need to figure out whether all the signals @prodalex identified in https://docs.google.com/document/d/1jK-FwpO_dWVx2oCHLwJT67yvZM6ZnkfTKWMb2LqE-gM/edit?disco=AAABK3hDqzI are already available in Athena and which we'd need to add, then come up with a plan to add anything we need.

(5) should mostly be non-engineering work, but it's worth planning for some amount of support time here.
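For (2), here is a rough sketch of just the file-shaping half of such a job, not the actual implementation. It assumes the export lands in S3 as newline-delimited JSON under a Hive-style weekly partition so Athena can query it. The key layout, the file name `customers.json`, and the record fields are all hypothetical, and the Stripe fetch and S3 upload are deliberately left out.

```python
import json
from datetime import date


def weekly_partition_key(prefix: str, run_date: date) -> str:
    """Build a Hive-style key (year=/week=) from the ISO week of run_date.

    The layout is an assumption for illustration, not an existing bucket structure.
    """
    iso_year, iso_week, _ = run_date.isocalendar()
    return f"{prefix}/year={iso_year}/week={iso_week:02d}/customers.json"


def to_json_lines(records: list[dict]) -> str:
    """Serialize records as newline-delimited JSON, one object per line."""
    return "".join(json.dumps(r, sort_keys=True) + "\n" for r in records)


if __name__ == "__main__":
    # Hypothetical records standing in for whatever the Stripe export returns.
    sample = [
        {"customer_id": "cus_example1", "plan": "plan_a", "amount_usd": 0},
        {"customer_id": "cus_example2", "plan": "plan_b", "amount_usd": 100},
    ]
    print(weekly_partition_key("stripe/customers", date(2024, 6, 3)))
    print(to_json_lines(sample), end="")
```

Keying on the ISO week keeps each weekly run in its own partition, so a rerun overwrites one file instead of appending duplicates.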

prodalex commented 2 months ago

I would say the prio would be:

  1. Fix null value issues in Athena
  2. old web3.storage once-off report
  3. set up a weekly job that pulls data from Stripe and dumps it to Athena
  4. prototype "active user" report and plan further iteration
  5. support @prodalex, @MF416 and @heyjay44 in putting together the various Athena queries and Quicksight dashboards

And I think we should prioritize egress before we get to the badbits stuff. For now, we could plan just a spike to figure out how to do the badbits.

Just one quick question @travis: where in the tasks above would the #requests (read, write operations) be covered?

MF416 commented 2 months ago

Quick note - I don't think we need weekly Stripe reports given we chart customers monthly? Whatever @alanshaw is doing right now with his monthly cadence is fine from an output perspective (understanding there are probably improvements to make Alan's life easier).

travis commented 1 month ago

ok - null values are sorted, the rest of the tasks from @prodalex should be scheduled soon!

> Where in the tasks above would be the #requests (read, write operations) covered?

@prodalex - we probably need to break that into its own task. "Successful write operations" is an Athena query (happy to help formulate that if you'd like, just let me know), and "successful read operations" is probably some sort of log query in Cloudflare. I'm not actually sure of the best way to get that number - we might already have it in https://daghouse.grafana.net somewhere, but if not it will likely be a bit of work to get it...
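To make the "Athena query" half a bit more concrete, here is a minimal sketch of the general shape such a query could take: a per-week row count. The real table, timestamp column, and success filter are not specified in this thread, so they are passed in as parameters. The names used in the example call are hypothetical, and the timestamp column is assumed to already be of a timestamp type.

```python
def weekly_count_query(table: str, ts_column: str, where_sql: str) -> str:
    """Return Athena (Trino-style) SQL that counts matching rows per week.

    All identifiers come from the caller; nothing here reflects the real schema.
    """
    return (
        f"SELECT date_trunc('week', {ts_column}) AS week, count(*) AS operations\n"
        f"FROM {table}\n"
        f"WHERE {where_sql}\n"
        "GROUP BY 1\n"
        "ORDER BY 1"
    )


if __name__ == "__main__":
    # Hypothetical table, column, and filter, purely for illustration.
    print(weekly_count_query("example_db.write_receipts", "issued_at", "succeeded = true"))
```

The function only interpolates strings, so it is suitable for trusted, hand-written arguments and not for user-supplied input.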

reidlw commented 1 month ago

Closing this parent as we've created new child tasks that are in the backlog and can be treated independently