Closed: lgray closed this PR 9 months ago
Example of multi-fill syntax:
    # One entry per histogram axis; "weight" and "systematic" are tuples of equal length,
    # so a single fill call covers every systematic variation.
    axes_fill_info_dict = {
        dense_axis_name: dense_variables_array_with_cuts["lep_chan_lst"][sr_cat][dense_axis_name],
        "weight": tuple(masked_weights),
        "process": histAxisName,
        "category": sr_cat,
        "systematic": tuple(wgt_var_lst),
    }
    hout[dense_axis_name].fill(**axes_fill_info_dict)
This shows a fill where we pass multiple weights, one per systematic variation, in a single call. It takes a task graph that was ending with 6 GB of memory usage (per dataset) and brings it down to O(1 GB). The time to build the task graph is also significantly reduced: ~1600 fill calls, down from ~41k, with many fewer layers, etc.
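To make the call-count reduction concrete, here is a minimal pure-Python sketch of the multi-fill idea. `ToyHist` and its `n_fill_calls` counter are hypothetical stand-ins written for this illustration. They are not the API of this library, and the real implementation operates on lazy arrays and task-graph layers rather than Python lists. The sketch only shows that one fill call carrying tuples of weights and systematic labels gives the same bin contents as one call per variation.

```python
from bisect import bisect_right
from collections import defaultdict


class ToyHist:
    """Hypothetical toy histogram: one dense axis plus a 'systematic' category axis."""

    def __init__(self, edges):
        self.edges = list(edges)
        n_bins = len(self.edges) - 1
        # counts[systematic][bin_index] -> sum of weights
        self.counts = defaultdict(lambda: [0.0] * n_bins)
        self.n_fill_calls = 0  # stands in for the number of task-graph fill nodes

    def fill(self, values, weight, systematic):
        self.n_fill_calls += 1
        # Normalise the single-fill form (one label, one weight list)
        # to the multi-fill form (tuple of labels, tuple of weight lists).
        if isinstance(systematic, str):
            weight, systematic = (weight,), (systematic,)
        if len(weight) != len(systematic):
            raise ValueError("need exactly one weight array per systematic label")
        for w, syst in zip(weight, systematic):
            for x, wi in zip(values, w):
                i = bisect_right(self.edges, x) - 1
                if 0 <= i < len(self.edges) - 1:  # drop under/overflow
                    self.counts[syst][i] += wi


values = [0.5, 1.5, 1.7, 2.5]
weights = {
    "nominal": [1.0, 1.0, 1.0, 1.0],
    "up": [1.1, 1.1, 1.1, 1.1],
    "down": [0.9, 0.9, 0.9, 0.9],
}

# One fill call per systematic variation.
one_by_one = ToyHist([0, 1, 2, 3])
for syst, w in weights.items():
    one_by_one.fill(values, weight=w, systematic=syst)

# A single multi-fill call carrying all variations at once.
multi = ToyHist([0, 1, 2, 3])
multi.fill(values, weight=tuple(weights.values()), systematic=tuple(weights))

print(one_by_one.n_fill_calls, multi.n_fill_calls)
```

In this toy the number of fill calls drops from the number of variations to one while the bin contents stay the same. It does not model the memory or graph-building costs themselves.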
Multifill moved to #126, which supersedes this PR
This is mostly just logging for posterity, since it shows there is at least one solution to the issue. I'll get the problematic code to @martindurant as well so that we can properly characterize it.
So far:
staged
layer
This appears to have some nice scaling benefits, but we are figuring out why.
Largely posting this PR to demonstrate what solves memory and task-graph problems when approaching ~O(50k) fills. Not a real solution yet.