Closed david26694 closed 1 month ago
some notes:
class Metric:
    alias: str
    components: Tuple[str, str]

class Variant:
    name: str
    is_control: bool

class HypothesisTest:
    metric: Metric
    analysis: Analysis
    analysis_config: dict

class AnalysisPlan:
    tests: List[HypothesisTest]
    variants: List[Variant]

    def analyze(self, data, pre_exp_df, alpha=0.05) -> AnalysisResults:
        # do the analysis

class AnalysisResults:
    def __str__(self, metadata: dict) -> str:
        # some representation method could have metadata as input

    def to_dataframe(self) -> pd.DataFrame:
        # return the results as a dataframe

    def __add__(self, other: AnalysisResults) -> AnalysisResults:
        # combine the results

analysis_plan_config = {
    "tests": [
        {
            "metric": {
                "alias": "metric_1",
                "components": ("component_1", "component_2")
            },
            "analysis": {
                "name": "analysis_1",
                "config": {
                    "param_1": 1,
                    "param_2": 2
                }
            }
        },
        {
            "metric": {
                "alias": "metric_2",
                "components": ("component_3", "component_4")
            },
            "analysis": {
                "name": "analysis_1",
                "config": {
                    "param_1": 1,
                    "param_2": 2
                }
            }
        }
    ],
    "variants": [
        {
            "name": "variant_1",
            "is_control": True
        },
        {
            "name": "variant_2",
            "is_control": False
        }
    ]
}
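
To make the config shape above concrete, here is a minimal sketch of how such a dict could be turned into plan objects. It is only an illustration under assumptions: the `from_config` classmethod, the `analysis_name` field (a string stand-in for the `Analysis` object) and the single-control check are hypothetical and not part of the notes above.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass(frozen=True)
class Metric:
    alias: str
    components: Tuple[str, str]


@dataclass(frozen=True)
class Variant:
    name: str
    is_control: bool


@dataclass
class HypothesisTest:
    metric: Metric
    # Assumption: the analysis is kept as a name here and resolved elsewhere.
    analysis_name: str
    analysis_config: Dict[str, Any] = field(default_factory=dict)


@dataclass
class AnalysisPlan:
    tests: List[HypothesisTest]
    variants: List[Variant]

    @classmethod
    def from_config(cls, config: Dict[str, Any]) -> "AnalysisPlan":
        """Hypothetical constructor that parses the dict layout shown above."""
        tests = [
            HypothesisTest(
                metric=Metric(
                    alias=t["metric"]["alias"],
                    components=tuple(t["metric"]["components"]),
                ),
                analysis_name=t["analysis"]["name"],
                analysis_config=dict(t["analysis"].get("config", {})),
            )
            for t in config["tests"]
        ]
        variants = [Variant(**v) for v in config["variants"]]
        # Assumed sanity check: a plan needs exactly one control variant.
        if sum(v.is_control for v in variants) != 1:
            raise ValueError("exactly one variant must be the control")
        return cls(tests=tests, variants=variants)


example_config = {
    "tests": [
        {
            "metric": {"alias": "metric_1", "components": ("component_1", "component_2")},
            "analysis": {"name": "analysis_1", "config": {"param_1": 1, "param_2": 2}},
        }
    ],
    "variants": [
        {"name": "variant_1", "is_control": True},
        {"name": "variant_2", "is_control": False},
    ],
}

plan = AnalysisPlan.from_config(example_config)
```

Keeping the parsing in one classmethod means the plain-dict config and the typed classes stay in sync in a single place.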
Slicing implementation:
class Metric:
    alias: str
    components: Tuple[str, str]

class Variant:
    name: str
    is_control: bool

class Dimension:
    slice_col: str
    slice_values: List[str]

class HypothesisTest:
    metric: Metric
    analysis: Analysis
    analysis_config: dict
    dimensions: List[Dimension]

    def slice(self):
        for dimension in self.dimensions:
            for value in dimension.slice_values:
                # slice the data, run the analysis

class AnalysisPlan:
    tests: List[HypothesisTest]
    variants: List[Variant]

    def analyze(self, data, pre_exp_df, alpha=0.05) -> AnalysisResults:
        # do the analysis

class AnalysisResults:
    def __str__(self, metadata: dict) -> str:
        # some representation method could have metadata as input

    def to_dataframe(self) -> pd.DataFrame:
        # return the results as a dataframe

    def __add__(self, other: AnalysisResults) -> AnalysisResults:
        # combine the results
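
As a rough illustration of the slicing loop and of combining results with `__add__`, here is a small self-contained sketch. It makes several assumptions: rows are plain dicts standing in for a DataFrame, `mean_difference` is a toy placeholder for a real analysis, and `analyze_slices`, the `rows` attribute and the column names are hypothetical.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import Any, Dict, List


@dataclass
class Dimension:
    slice_col: str
    slice_values: List[str]


@dataclass
class AnalysisResults:
    # Assumption: results are stored as one dict per analysed slice.
    rows: List[Dict[str, Any]] = field(default_factory=list)

    def __add__(self, other: "AnalysisResults") -> "AnalysisResults":
        # Combining results concatenates the per-slice rows.
        return AnalysisResults(self.rows + other.rows)


def mean_difference(rows, metric_col, variant_col="variant", control="control"):
    """Toy analysis: mean of the non-control rows minus mean of the control rows."""
    control_vals = [r[metric_col] for r in rows if r[variant_col] == control]
    treated_vals = [r[metric_col] for r in rows if r[variant_col] != control]
    return mean(treated_vals) - mean(control_vals)


def analyze_slices(rows, metric_col, dimensions):
    """Run the toy analysis once per (dimension, value) pair and combine the results."""
    results = AnalysisResults()
    for dimension in dimensions:
        for value in dimension.slice_values:
            # Slice the data: keep only rows matching this dimension value.
            subset = [r for r in rows if r[dimension.slice_col] == value]
            results = results + AnalysisResults(
                [
                    {
                        "dimension": dimension.slice_col,
                        "value": value,
                        "effect": mean_difference(subset, metric_col),
                    }
                ]
            )
    return results


rows = [
    {"variant": "control", "city": "NYC", "orders": 10.0},
    {"variant": "treatment", "city": "NYC", "orders": 12.0},
    {"variant": "control", "city": "LA", "orders": 8.0},
    {"variant": "treatment", "city": "LA", "orders": 9.0},
]
results = analyze_slices(rows, "orders", [Dimension("city", ["NYC", "LA"])])
```

With a real DataFrame the subset step would be a boolean filter on `slice_col`, and `to_dataframe` could build its output from the same rows.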
Analysis implementation by @ludovico-lanni:
First correction:
Code design proposed by Ludo:
I think overall we can drop the experiment class, and just have the metadata as a dict in the AnalysisPlan. We could even rename AnalysisPlan to Experiment, I like it more.
The issue with this is that it allows metadata to differ across analysis plans. One workaround is
If we want to create stuff from config:
Alternatively, we could add variants in each test:
This way the dict gets bigger and we need to repeat code, but there are fewer classes.
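
To show the trade-off, here is a sketch of what a per-test layout might look like, derived from a config with a shared variant list. The `"variants"` key inside each test and the `inline_variants` helper are assumptions made for illustration; they are not taken from the proposal itself.

```python
from copy import deepcopy
from typing import Any, Dict


def inline_variants(config: Dict[str, Any]) -> Dict[str, Any]:
    """Return a config where each test carries its own copy of the variants."""
    shared_variants = config["variants"]
    return {
        "tests": [
            # Every test repeats the full variant list (hypothetical layout).
            {**deepcopy(test), "variants": deepcopy(shared_variants)}
            for test in config["tests"]
        ]
    }


shared_config = {
    "tests": [
        {
            "metric": {"alias": "metric_1", "components": ("component_1", "component_2")},
            "analysis": {"name": "analysis_1", "config": {"param_1": 1, "param_2": 2}},
        },
        {
            "metric": {"alias": "metric_2", "components": ("component_3", "component_4")},
            "analysis": {"name": "analysis_1", "config": {"param_1": 1, "param_2": 2}},
        },
    ],
    "variants": [
        {"name": "variant_1", "is_control": True},
        {"name": "variant_2", "is_control": False},
    ],
}

per_test_config = inline_variants(shared_config)
```

The variant list is now duplicated once per test, which is the repetition mentioned above, but the top-level `"variants"` key and the shared list on the plan are no longer needed.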