Open ryscheng-mobile opened 9 months ago
Suggested steps:
Question for @ravenac95
@ccerv1 and I were just talking about this one, and I think we need some help with the metrics rolling window factory to support it. I think there are actually 3 rolling windows at play here:
1. classification rolling window (e.g. a developer needs to have events in 10 of 30 days to be considered full-time)
2. counting rolling window (e.g. we want to know how many active developers there were in the last 6 months)
3. comparison rolling window (e.g. across the last 2x 6-month periods, how many users went from part-time to full-time, or part-time to churned, etc.)

I think right now we only assume a single rolling window, is that correct?
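
To make the three windows concrete, here is a rough Python sketch of how they could nest. Everything in it is an assumption for illustration only: the function names, the `full_time` / `part_time` / `inactive` labels, the 180-day stand-in for "6 months", and the rule that a developer's label for a period is the strongest label they reach on any day in that period. None of this is the existing metrics factory.

```python
from collections import Counter
from datetime import date, timedelta

# Hypothetical labels, ordered from weakest to strongest.
RANK = {"inactive": 0, "part_time": 1, "full_time": 2}


def classify(active_days: set[date], as_of: date,
             window_days: int = 30, min_days: int = 10) -> str:
    """Classification window: label by days with events in the trailing window."""
    start = as_of - timedelta(days=window_days)
    n = sum(1 for d in active_days if start < d <= as_of)
    if n >= min_days:
        return "full_time"
    return "part_time" if n > 0 else "inactive"


def period_label(active_days: set[date], period_end: date,
                 period_days: int = 180) -> str:
    """Strongest label reached on any day of the period (an assumed rule)."""
    labels = (classify(active_days, period_end - timedelta(days=i))
              for i in range(period_days))
    return max(labels, key=RANK.__getitem__)


def count_labels(events: dict[str, set[date]], period_end: date,
                 period_days: int = 180) -> Counter:
    """Counting window: number of developers per label over one period."""
    return Counter(period_label(days, period_end, period_days)
                   for days in events.values())


def transitions(events: dict[str, set[date]], period_end: date,
                period_days: int = 180) -> Counter:
    """Comparison window: (previous label, current label) over 2 adjacent periods."""
    prev_end = period_end - timedelta(days=period_days)
    return Counter(
        (period_label(days, prev_end, period_days),
         period_label(days, period_end, period_days))
        for days in events.values()
    )


# Made-up sample data: alice ramps up, bob churns.
events = {
    "alice": {date(2024, 3, 1), date(2024, 3, 2)}
             | {date(2024, 10, d) for d in range(1, 13)},
    "bob": {date(2024, 2, 10), date(2024, 2, 11), date(2024, 2, 12)},
}
period_end = date(2024, 12, 31)
current_counts = count_labels(events, period_end)
changes = transitions(events, period_end)
```

In this sketch `changes` counts one `("part_time", "full_time")` transition and one `("part_time", "inactive")` transition, with the latter read as churn. The point is only that the comparison depends on the counting period, which in turn depends on the classification window.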
ohhhh ya interesting, we do currently assume 1, but ya I'll need to think how we can combine things so we can depend on some of these other rolling windows. This seems to be rolling window queries on rolling windows.
This changes how I was thinking of things because I was trying to constrain the collection/project automatic creation a bit. Let me think on this!
Actually, what I was thinking in terms of changes was to do something like this:
```python
timeseries_metrics(
    model_prefix="timeseries",
    metric_queries={
        # This will automatically generate star counts for the given roll up periods.
        # A rollup is just a simple addition of the aggregation. So basically we
        # calculate the daily rollup every day by getting the count of the day.
        # Then the weekly every week by getting the count of the week and
        # monthly by getting the count of the month.
        # Additionally this will also create this along the dimensions (entity_types) of
        # project/collection so the resulting models will be named as follows
        # `metrics.timeseries_stars_to_{entity_type}_{rollup}`
        "stars": MetricQueryDef(
            ref="stars.sql",
            rollups=["daily", "weekly", "monthly"],
            entity_types=["artifact", "project", "collection"],  # This is the default value
        ),
        # This defines something with a rolling option that allows you to look back
        # to some arbitrary window. So you specify the window and specify the unit.
        # The unit and the window are used to pass in variables to the query. So it's
        # up to the query to actually query the correct window.
        # The resultant models are named as such
        # `metrics.timeseries_active_days_to_{entity_type}_over_{window}_{unit}`
        "active_days": MetricQueryDef(
            ref="active_days.sql",
            rolling={
                "windows": [30, 60, 90],
                "unit": "day",
                "cron": "0 0 1 */6 *",  # This determines how often this is calculated
            },
        ),
    },
    default_dialect="clickhouse",
)
```
I think this setup should give us the flexibility to do the window of windows without having to build much additional craziness?
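
For reference, here is a small sketch of which models the proposal above would produce, going only by the naming patterns in its comments. The two helper functions are hypothetical and are not part of any existing factory code. It also assumes that `active_days` falls back to the default entity types, since the proposal does not set them for that metric.

```python
from itertools import product

# Default taken from the comment in the proposal above.
DEFAULT_ENTITY_TYPES = ("artifact", "project", "collection")


def rollup_model_names(prefix, metric, rollups, entity_types=DEFAULT_ENTITY_TYPES):
    """Names following `metrics.{prefix}_{metric}_to_{entity_type}_{rollup}`."""
    return [
        f"metrics.{prefix}_{metric}_to_{entity}_{rollup}"
        for entity, rollup in product(entity_types, rollups)
    ]


def rolling_model_names(prefix, metric, windows, unit, entity_types=DEFAULT_ENTITY_TYPES):
    """Names following `metrics.{prefix}_{metric}_to_{entity_type}_over_{window}_{unit}`."""
    return [
        f"metrics.{prefix}_{metric}_to_{entity}_over_{window}_{unit}"
        for entity, window in product(entity_types, windows)
    ]


stars_models = rollup_model_names("timeseries", "stars", ["daily", "weekly", "monthly"])
active_days_models = rolling_model_names("timeseries", "active_days", [30, 60, 90], "day")

for name in stars_models + active_days_models:
    print(name)
```

Under those assumptions each metric expands to 3 entity types x 3 rollups or windows, so 9 models apiece. A window-of-windows metric could then reference one of the generated `active_days` models by name.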
What is it?
The ability for a project’s ecosystem to understand, in detail, how new users are entering and exiting their open source community / user dependency graph.