Closed jshorish closed 10 months ago
I believe this is all handled in post-processing already. I map every parameter into the results DataFrame as a column, prefixing its name with `param_`.
Just to add to this as a running list of compatibility requirements, the sensitivity analysis workflow also requires:
- a list of the control (sweep) parameter column labels in `df`; and
- a list of the KPI column labels in `df`.

Both of these lists are used to pick out the right control (sweep) parameters and KPIs to be used for the threshold inequalities in the sensitivity analysis exercise.
[If there's a location in the codebase where these column labels are already defined then that's great, and this can even be exploited in the future to automatically pull them from that 'single source of truth' into the sensitivity analysis workflow.]
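As a rough illustration of how the two lists might be built and used, here is a minimal sketch. It assumes the `param_` prefix convention mentioned earlier in this thread as the 'single source of truth' for control parameters. The KPI labels, the toy data, and the threshold value are all hypothetical, and nothing here is the cadCAD_machine_search API.

```python
import pandas as pd

# Hypothetical KPI column labels; in practice these come from the project.
KPI_COLUMNS = ["kpi_a"]

def control_param_columns(df: pd.DataFrame, prefix: str = "param_") -> list:
    """Collect control parameter labels from the column names (assumed prefix)."""
    return [c for c in df.columns if str(c).startswith(prefix)]

# Toy post-processed results (made-up values, for illustration only).
df = pd.DataFrame({
    "param_alpha": [0.1, 0.1, 0.2],
    "param_beta": [10, 20, 10],
    "kpi_a": [0.4, 0.6, 0.9],
})

controls = control_param_columns(df)

# A threshold inequality on a KPI; the 0.5 cutoff is an arbitrary example.
df["kpi_a_ok"] = df["kpi_a"] >= 0.5

# Control parameters as features, thresholded KPI as the target.
features, target = df[controls], df["kpi_a_ok"]
```

Deriving the control list from the column names keeps it in sync with whatever the post-processing step actually wrote into `df`.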
The sensitivity analysis / feature importance library cadCAD_machine_search requires that the simulation results DataFrame, hereafter called `df`, be in a particular post-processed state before threshold KPIs can be measured. This issue is just a placeholder for the operations to achieve this state (and can be ignored if these are already in place):

Removal of substeps (often standard, repeated here to be exhaustive)
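One possible way to do this, as a minimal sketch: it assumes the results carry the usual `run`, `timestep`, and `substep` columns (plus `simulation` and `subset` if present) and keeps only the last substep of each timestep. The helper name `drop_substeps` and the toy data are hypothetical.

```python
import pandas as pd

def drop_substeps(df: pd.DataFrame) -> pd.DataFrame:
    """Keep only the final substep of every timestep of every run."""
    # Columns identifying one timestep of one run; optional ones are
    # used only if they exist in the results.
    keys = [c for c in ("simulation", "subset", "run", "timestep") if c in df.columns]
    # Highest substep number within each group, broadcast back to every row.
    last = df.groupby(keys)["substep"].transform("max")
    return df[df["substep"] == last].reset_index(drop=True)

# Toy raw results: the initial state plus two timesteps with two substeps each.
raw = pd.DataFrame({
    "run": [1, 1, 1, 1, 1],
    "timestep": [0, 1, 1, 2, 2],
    "substep": [0, 1, 2, 1, 2],
    "kpi": [0.0, 0.1, 0.2, 0.3, 0.4],
})

df = drop_substeps(raw)
```

Selecting the per-group maximum rather than a fixed substep number avoids hard-coding how many PSUBs the model has.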
This step just removes the intermediate PSUB substeps from `df`.

Addition of control parameter constellations as columns
Each simulation output row should have one column per control parameter, containing the value of the parameter from the parameter 'constellation' vector for that row. This may be already handled by the simulation workflow. If not, one way to add them in post-processing is:
1. Obtain the `configs` object from the cadCAD simulation for this `df` (this is usually initialized via `from cadCAD import configs`). This contains, for each run `i = 0, 1, ...` in `configs[i]`, all of the parameter constellation information in the `sim_config` dictionary attribute;
2. Add the parameter values from each `configs[i]` as new columns of `df`.

The result is that `df` has new columns labeled by the control parameter labels, each holding, for every row, the value of that parameter in the run that produced the row.
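The two steps above could be sketched roughly as follows. To stay runnable without cadCAD, the sketch uses a hypothetical stand-in `param_sets`: a list with one parameter dictionary per entry of `configs`. With a real `configs` object this might be built as something like `[c.sim_config["M"] for c in configs]`, but the key that holds the parameters is an assumption to check against the installed cadCAD version. The name of the column linking each row to its config index (`subset` here) is also an assumption.

```python
import pandas as pd

def add_param_columns(df: pd.DataFrame, param_sets: list,
                      index_col: str = "subset") -> pd.DataFrame:
    """Add one column per control parameter, looked up by each row's config index."""
    # One row per config index i, one column per parameter label.
    params = pd.DataFrame(param_sets)
    params.index.name = index_col
    # Left join so every results row receives its own constellation values.
    return df.merge(params.reset_index(), on=index_col, how="left")

# Hypothetical parameter constellations, one dict per config.
param_sets = [
    {"alpha": 0.1, "beta": 10},
    {"alpha": 0.2, "beta": 10},
]

# Toy results after substep removal (made-up values).
results = pd.DataFrame({
    "subset": [0, 0, 1, 1],
    "timestep": [1, 2, 1, 2],
    "kpi": [1.0, 1.1, 2.0, 2.1],
})

df = add_param_columns(results, param_sets)
```

If the results already use a prefix such as `param_` for these columns, the same prefix can be applied with `params.add_prefix("param_")` before the merge.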