mozilla / mozanalysis

A library for Mozilla experiments analysis
https://mozilla.github.io/mozanalysis/
Mozilla Public License 2.0

Add loading for metrics & data sources defined in outcomes #199

Closed: dzeber closed this 6 months ago

dzeber commented 8 months ago

This will be useful when we want to use metrics or data sources defined in metric-hub outcomes in sizing or other ad-hoc calculations. The current workflow is to copy-paste the metric definition. ConfigLoader is now the main way that metric-hub configs are accessed by data scientists working in notebooks.
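
For illustration, a hypothetical sketch of what this enables in a notebook, assuming the get_outcome_metric method discussed below lands on ConfigLoader (the slugs and app name are placeholders):

```python
from mozanalysis.config import ConfigLoader

# Load a metric defined in a metric-hub outcome directly, instead of
# copy-pasting its definition (slugs and app name are illustrative):
metric = ConfigLoader.get_outcome_metric(
    metric_slug="ad_clicks",
    outcome_slug="search",
    app_name="firefox_desktop",
)
# The returned metric can then be used in sizing or other ad-hoc
# calculations, like metrics loaded via ConfigLoader.get_metric.
```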

Other updates:

danielkberry commented 8 months ago

So I talked it over with @mikewilli and we think this is a situation where mozanalysis should mostly use metric-config-parser to resolve the metric (i.e., metric-config-parser should be able to answer "give me the definition of this metric from this config", rather than mozanalysis having to assemble a definition itself).

Today we're pretty close; we can just do something like the following (error checking omitted for brevity):

```python
from dataclasses import dataclass

@dataclass
class EmptyConf:
    # Minimal stand-in for a Configuration; resolve only reads app_name.
    app_name: str

# Would live at ConfigLoader.get_outcome_metric
def get_outcome_metric(self, metric_slug: str, outcome_slug: str, app_name: str):
    # Look up the outcome spec for this app/platform.
    outcome_spec = self.jetstream_configs.spec_for_outcome(
        slug=outcome_slug, platform=app_name
    )

    # Fetch the metric definition from the outcome and let
    # metric-config-parser resolve it, mocking the conf argument
    # (`configs` is assumed to be in scope here).
    metric_definition = outcome_spec.metrics.get(metric_slug)
    conf = EmptyConf(app_name)
    summaries = metric_definition.resolve(outcome_spec, conf, configs)
    return summaries[0].metric
```

where the MetricReference.resolve method does most of the heavy lifting. This is the same function that is called to resolve metric definitions in production jobs (for Jetstream & OpMon), so I think it should be the source of truth for this logic.

The only thing left to do is handle the conf parameter. Here I've mocked a conf-like object with just the one field resolve needs, but that's not the proper solution. Interested to get Mike's thoughts here.

dzeber commented 8 months ago

Sounds good to me, let me know how you'd like to proceed.

mikewilli commented 8 months ago

I don't think @danielkberry's EmptyConf is too far off. As noted, resolve only really cares about the app_name, so I wonder if the solution is to refactor the resolve function to just take app_name directly?
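
For concreteness, a toy before/after sketch of the signature change being floated here (hypothetical; this is not metric-config-parser's actual code):

```python
# Current shape (simplified): resolve takes a Configuration-like object
# but only ever reads conf.app_name from it.
def resolve(self, spec, conf, configs):
    app_name = conf.app_name
    ...

# Floated refactor: accept app_name directly, so callers like
# mozanalysis wouldn't need to mock a conf object at all.
def resolve(self, spec, app_name: str, configs):
    ...
```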

Another option, if you want all the metrics from an outcome, might be to add a resolve function to OutcomeSpec, though this would be more involved. Probably not worthwhile just for this, given how close we already are with what exists.
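
That more involved option might look something like this rough sketch (hypothetical; OutcomeSpec has no such method today), reusing the same per-metric resolution as in Daniel's snippet:

```python
# Hypothetical OutcomeSpec.resolve: resolve every metric defined in the
# outcome at once, keyed by slug, rather than one slug at a time.
def resolve(self, conf, configs):
    resolved = {}
    for slug, definition in self.metrics.items():
        summaries = definition.resolve(self, conf, configs)
        resolved[slug] = summaries[0].metric
    return resolved
```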

dzeber commented 8 months ago

@danielkberry @mikewilli (Sorry for the delay, I had to backburner this work the past week.) What do you think is the best way forward - would you like to make the updates to metric-config-parser before landing this, or should we use Daniel's approach and update later?

> Another option, if you want all the metrics from an outcome

I agree this is probably not worthwhile at this point, as the use case I have in mind is to get specific metrics for sizing.

mikewilli commented 7 months ago

@dzeber In the interest of starting with the path of least resistance, why don't we proceed with @danielkberry's EmptyConf solution? (Maybe another name would be better... AppOnlyConfiguration, MinimalConfiguration, ...? Open to ideas, but it's not a huge issue.)

This will allow you to use the existing resolve to retrieve and use definitions for outcome metrics without requiring any external updates (i.e., to metric-config-parser). And I am concerned that updating the resolve function signature to only take app_name would require a bunch of downstream changes for consumers of metric-config-parser, so if we're going to make changes there I'd rather do so after more consideration.

Let me know if you have any trouble integrating the code @danielkberry shared into this PR, happy to help.

dzeber commented 7 months ago

Sounds good - I agree that making the upstream changes to metric-config-parser broadens the scope of this change considerably. I'll integrate @scholtzan's update from #201 as well.

codecov-commenter commented 7 months ago

Codecov Report

Attention: 2 lines in your changes are missing coverage. Please review.

Comparison is base (c69380a) 76.31% compared to head (d5f0ad0) 77.23%. Report is 1 commit behind head on main.

| Files | Patch % | Lines |
|---|---|---|
| src/mozanalysis/config.py | 94.59% | 2 Missing :warning: |

Additional details and impacted files:

```diff
@@            Coverage Diff             @@
##             main     #199      +/-   ##
==========================================
+ Coverage   76.31%   77.23%   +0.92%
==========================================
  Files          23       23
  Lines         992     1028      +36
==========================================
+ Hits          757      794      +37
+ Misses        235      234       -1
```

| Flag | Coverage Δ | |
|---|---|---|
| [py38](https://app.codecov.io/gh/mozilla/mozanalysis/pull/199/flags?src=pr&el=flag) | `77.23% <94.59%> (+0.92%)` | :arrow_up: |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.


dzeber commented 7 months ago

Quick update: I won't have time to finish this up before holidays but I'll pick it up when I get back in Jan.

dzeber commented 6 months ago

I've updated the outcome metric loading using @danielkberry's approach.

I also fixed a typo where description wasn't getting set in get_metric.