alexander-held closed 4 months ago
Making this a target for week 4. This will matter more and more as we scale beyond a few thousand files, since it will end up saving large amounts of time. Reasonably high priority at that stage.
I know a huge amount of work was done here. What is the status? Should we move this to week 5?
Not done, as priorities have shifted a bit and the approach in #58 does not include a pre-processing step by design. When we move back to more Dask and coffea, this still needs to be done.
This hasn't been a priority as more work was done with uproot.open, shifting to week 6.
Serialize to JSON and look into the dataset tools in coffea (slice files / slice chunks): https://coffeateam.github.io/coffea/modules/coffea.dataset_tools.html
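A rough sketch of what the JSON serialization and file/chunk slicing could look like, using only the standard library. The nested layout (dataset name, then "files", then per-file "object_path" and "steps") is an assumption about what a preprocessed fileset looks like, the file paths are made up, and `save_fileset`, `load_fileset`, `keep_first_files` and `keep_first_chunks` are hypothetical helpers, not coffea functions. The coffea dataset tools linked above may well cover the slicing part already, so this is mostly to pin down the intended behavior.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical preprocessed fileset: per dataset, a mapping of file path to
# the tree name and the [start, stop) entry ranges ("steps", i.e. chunks).
# The layout and the paths are assumptions made for illustration only.
fileset = {
    "ttbar": {
        "files": {
            "root://example.org//store/ttbar_0.root": {
                "object_path": "Events",
                "steps": [[0, 50_000], [50_000, 100_000]],
            },
            "root://example.org//store/ttbar_1.root": {
                "object_path": "Events",
                "steps": [[0, 40_000], [40_000, 80_000], [80_000, 120_000]],
            },
        },
        "metadata": {"process": "ttbar"},
    }
}


def save_fileset(fileset, path):
    """Write the fileset to a JSON file so preprocessing can be skipped later."""
    Path(path).write_text(json.dumps(fileset, indent=2))


def load_fileset(path):
    """Read a fileset back from a JSON file."""
    return json.loads(Path(path).read_text())


def keep_first_files(fileset, n):
    """Return a copy of the fileset with at most n files per dataset."""
    return {
        name: {**dataset, "files": dict(list(dataset["files"].items())[:n])}
        for name, dataset in fileset.items()
    }


def keep_first_chunks(fileset, n):
    """Return a copy of the fileset with at most n chunks (steps) per file."""
    return {
        name: {
            **dataset,
            "files": {
                file_path: {**info, "steps": info["steps"][:n]}
                for file_path, info in dataset["files"].items()
            },
        }
        for name, dataset in fileset.items()
    }


# Round trip through JSON, then build a small test configuration.
with tempfile.TemporaryDirectory() as tmp_dir:
    json_path = Path(tmp_dir) / "fileset.json"
    save_fileset(fileset, json_path)
    restored = load_fileset(json_path)

small = keep_first_chunks(keep_first_files(restored, 1), 1)
```

With a cached JSON like this, a quick test run could load the file and slice it down instead of repeating the pre-processing over all files. Both slicing helpers return new dictionaries and leave the input untouched.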