facebookexperimental / Robyn

Robyn is an experimental, AI/ML-powered and open sourced Marketing Mix Modeling (MMM) package from Meta Marketing Science. Our mission is to democratise modeling knowledge, inspire the industry through innovation, reduce human bias in the modeling process & build a strong open source marketing science community.
https://facebookexperimental.github.io/Robyn/
MIT License

[fix] plot_data_collect on paretoresults #1105

Status: Closed (alxlyj closed this 2 weeks ago)

alxlyj commented 3 weeks ago

Project Robyn

Summary

Mapping plot_data_collect onto our Python data classes was producing errors.

The original implementation flattened the R plot data structure. This lost the hierarchy (model ID, then plot type, then component) and made plot-specific data difficult to access. Comparing the R structure with the converted Python structure shows the difference:

R Original Structure (Hierarchical):

plotDataCollect[[model_id]] <- list(
    plot1data = list(
        plotMediaShareLoopBar = data,
        plotMediaShareLoopLine = data,
        ySecScale = value
    ),
    plot2data = list(...),
    plot3data = list(...),
    # ... and so on
)

Python Before and After:

Before:

converted_data = {
    'plotMediaShareLoopBar': pd.DataFrame(...),
    'plotMediaShareLoopLine': pd.DataFrame(...)
}

After:

converted_data = {
    'model_id': {
        'plot1data': {
            'plotMediaShareLoopBar': pd.DataFrame(...),
            'plotMediaShareLoopLine': pd.DataFrame(...),
            'ySecScale': value
        },
        'plot2data': {
            'plotWaterfallLoop': pd.DataFrame(...)
        }
        # ... other plot types
    }
    # ... other models
}
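
As an illustration of why the nested layout is easier to work with, here is a minimal sketch of looking up and walking the "After" structure. It is not code from this PR: the model ID "1_2_3", the column names, and the helper iter_plot_components are made up for the example, and lists of row dicts stand in for the pd.DataFrame values so the snippet runs without pandas.

```python
from typing import Any, Dict, Iterator, Tuple

# Placeholder data (assumption): lists of row dicts stand in for pd.DataFrame
# values, and the model ID and column names are invented for this example.
converted_data: Dict[str, Dict[str, Dict[str, Any]]] = {
    "1_2_3": {
        "plot1data": {
            "plotMediaShareLoopBar": [{"channel": "tv", "value": 0.4}],
            "plotMediaShareLoopLine": [{"channel": "tv", "value": 1.8}],
            "ySecScale": 2.5,
        },
        "plot2data": {
            "plotWaterfallLoop": [{"channel": "tv", "value": 120.0}],
        },
    },
}


def iter_plot_components(
    data: Dict[str, Dict[str, Dict[str, Any]]]
) -> Iterator[Tuple[str, str, str, Any]]:
    """Yield (model_id, plot_key, component_name, component) for every leaf."""
    for model_id, plots in data.items():
        for plot_key, components in plots.items():
            for name, component in components.items():
                yield model_id, plot_key, name, component


# Direct lookup of one plot's data for one model.
y_sec_scale = converted_data["1_2_3"]["plot1data"]["ySecScale"]

# Every component, addressed by the plot it belongs to.
component_paths = [
    f"{model_id}/{plot_key}/{name}"
    for model_id, plot_key, name, _ in iter_plot_components(converted_data)
]
```

With the flat "Before" layout, neither the model ID nor the plot category is part of the key, so a lookup like the one for ySecScale has nothing to index on.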

In this PR, we:

  1. Modified the _convert_plot_data function to maintain R's hierarchical structure
  2. Added type handling for the different data components (tabular data and scalar values such as ySecScale)
  3. Preserved the plot type categories (plot1data through plot7data)
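
The three points above could be sketched roughly as follows. This is a hypothetical illustration, not the actual _convert_plot_data implementation. It assumes the R list has already been read into plain nested Python dicts, that tabular components arrive as column-oriented dicts of equal-length lists, and that R scalars arrive as length-1 lists. The name convert_plot_data_sketch and the sample data are invented, and lists of row dicts stand in for pd.DataFrame so the snippet needs only the standard library.

```python
from typing import Any, Dict, List


def _is_table(obj: Any) -> bool:
    """Heuristic (assumption): a non-empty dict of equal-length lists is tabular."""
    if not isinstance(obj, dict) or not obj:
        return False
    columns = list(obj.values())
    if not all(isinstance(col, list) for col in columns):
        return False
    return len({len(col) for col in columns}) == 1


def _to_rows(table: Dict[str, List[Any]]) -> List[Dict[str, Any]]:
    """Column-oriented dict -> list of row dicts (stand-in for a DataFrame)."""
    names = list(table)
    return [dict(zip(names, values)) for values in zip(*table.values())]


def convert_plot_data_sketch(node: Any) -> Any:
    """Convert recursively while keeping the model -> plot -> component nesting."""
    if _is_table(node):
        return _to_rows(node)  # tabular component
    if isinstance(node, dict):
        # Hierarchy level (model ID or plotNdata): keep the key, recurse.
        return {key: convert_plot_data_sketch(value) for key, value in node.items()}
    if isinstance(node, list) and len(node) == 1:
        return node[0]  # length-1 vector -> scalar, e.g. ySecScale
    return node


# Made-up input mimicking the shape of plotDataCollect.
raw = {
    "1_2_3": {
        "plot1data": {
            "plotMediaShareLoopBar": {"channel": ["tv", "search"], "value": [0.4, 0.6]},
            "ySecScale": [2.5],
        },
        "plot2data": {
            "plotWaterfallLoop": {"channel": ["tv"], "value": [120.0]},
        },
    },
}

converted = convert_plot_data_sketch(raw)
```

One limitation of this heuristic: a plot entry that contains only length-1 lists would be read as a one-row table rather than a group of scalars, so a real converter would more likely dispatch on the known component names.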

See data_mapper.ipynb for usage examples.

Test Plan

[Screenshot attached to the original PR]