RMI-PACTA / pacta.data.preparation

The goal of {pacta.data.preparation} is to prepare and format all input datasets required to run the PACTA for investors tools.
https://rmi-pacta.github.io/pacta.data.preparation/

don't expand `scenario_geography` and `equity_market` until necessary #11

Open cjyetman opened 1 year ago

cjyetman commented 1 year ago

https://github.com/RMI-PACTA/pacta.data.preparation/blob/ba0f8b8518afb2d00bfe5d9bff1a935418eaa5dd/R/dataprep_abcd_scen_connection.R#L143-L151

Up until the scenario data is merged in, expanding the data with the `scenario_geography` and `equity_market` columns drastically multiplies the number of rows, and the grouped calculations run over these otherwise duplicated rows are a source of the incredibly long run times. Basically, for every combination of id, technology, and year, we multiply the rows by every combination of `scenario_geography` and `equity_market` and then calculate duplicate data for all of them.

We should carefully consider whether this is actually necessary, and if not, calculate as much as we can before expanding to the `scenario_geography` and `equity_market` values. @jdhoffa @jacobvjk @AlexAxthelm
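
A minimal base R sketch of the idea, not the package's actual code: the toy `abcd` data, the `expansion` lookup, the `tech_share` calculation, and the `add_tech_share()` helper are all hypothetical stand-ins. It assumes the calculation does not depend on `scenario_geography` or `equity_market` and that the expansion is a plain cross join. Whether either holds for the real data is exactly what needs checking.

```r
# Toy data: one row per id/technology/year (hypothetical values)
abcd <- data.frame(
  id = c(1, 1, 2, 2),
  technology = c("CoalCap", "GasCap", "CoalCap", "GasCap"),
  year = 2023,
  production = c(30, 10, 5, 15)
)

# Hypothetical lookup of every scenario_geography/equity_market combination
expansion <- expand.grid(
  scenario_geography = c("Global", "OECD", "NonOECD"),
  equity_market = c("GlobalMarket", "DevelopedMarket"),
  stringsAsFactors = FALSE
)

# Stand-in for a grouped calculation: each row's share of its group's production
add_tech_share <- function(data, group_cols) {
  key <- do.call(paste, c(data[group_cols], sep = "|"))
  data$tech_share <- data$production / ave(data$production, key, FUN = sum)
  data
}

# Current pattern: expand first, then calculate over every duplicated row
expanded_first <- merge(abcd, expansion, by = NULL) # cross join
expanded_first <- add_tech_share(
  expanded_first,
  c("id", "year", "scenario_geography", "equity_market")
)

# Proposed pattern: calculate on the small table, expand only at the end
calculated_first <- add_tech_share(abcd, c("id", "year"))
expanded_last <- merge(calculated_first, expansion, by = NULL)

# Put both results in the same row and column order so they can be compared
sort_rows <- function(data) {
  row_order <- order(
    data$id, data$technology, data$scenario_geography, data$equity_market
  )
  data <- data[row_order, names(expanded_first)]
  rownames(data) <- NULL
  data
}

same_result <- isTRUE(all.equal(sort_rows(expanded_first), sort_rows(expanded_last)))
rows_calculated <- c(expand_first = nrow(expanded_first), expand_last = nrow(abcd))
```

In this toy case both orderings give the same table, but the grouped calculation touches `nrow(abcd)` rows instead of `nrow(abcd) * nrow(expansion)`. If the real expansion filters or re-weights rows per `scenario_geography` or `equity_market`, any calculation that depends on that filtering would still have to run after the expansion.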

related RMI-PACTA/pacta.data.preparation#7