Open ryscheng opened 1 week ago
Yes! I call these "point-in-time" metrics here #2275.
We should also consider adding a sample_date
field that represents the last date in the relevant event data.
For example, from this query we can see a lot of issues with the time intervals currently used in our models:
select
m.metric_name,
max(t.amount),
min(t.sample_date) as first_date,
max(t.sample_date) as last_date
from metrics.timeseries_metrics_by_project_v0 as t
join metrics.metrics_v0 as m
on t.metric_id = m.metric_id
join default.projects_v1 as p
on t.project_id = p.project_id
where p.project_name = 'uniswap'
group by 1
order by 4
I think the reason I'm suggesting a new model is because we might want to pre-compute the aggregate (total number of contributors, total number of commits, total number of stars) etc. This is different from "point-in-time" metrics, because we want to aggregate over all time.
But yes, I agree we should store the date that we compute the aggregate on.
Summary from the conversation today.
Plan is to implement this as a sqlmesh model manually outside of the metrics factory. It should be just a variation of related models from timeseries metrics.
In the future we can file issues for automatically generating this from the metrics factory, but probably not worth it yet.
What is it?
Currently we have these legacy
code_metrics_by_project
, andonchain_metrics_by_project
computed in dbt.With the new metrics factory on sqlmesh, we should create new models. Here's my proposal: We just create 3 marts:
key_metrics_by_artifact
key_metrics_by_project
key_metrics_by_collection
The schema for each of these should look like this:
metric_id
artifact_id
,project_id
, orcollection_id
amount
unit
This is not a time series metric. Instead, each metric_id has a custom way to aggregate all data. For example for commits it might be a sum. For first_commit, it might be just grabbing the earliest date.
This can power the new MetricPage https://github.com/opensource-observer/oso/issues/27 as well as the table in the CollectionPage
After this exists, we can deprecate the older models on dbt.