Open agoose77 opened 10 months ago
Attention: 17 lines
in your changes are missing coverage. Please review.
Comparison is base (
d3b6208
) 93.10% compared to head (c62796d
) 92.72%.
Files | Patch % | Lines |
---|---|---|
src/dask_awkward/lib/operations.py | 73.01% | 17 Missing :warning: |
:exclamation: Your organization needs to install the Codecov GitHub app to enable full functionality.
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
Is this affected, or even made easier, by scikit-hep/awkward#3043 (and subsequent changes to dask-awkward to defensively copy reports instead of changing them in place throughout a DAG)?
Once the reports become effectively immutable from dask-awkward's point of view—that is, each DAG node is associated with an unchanging report—then we won't need to have a hierarchical structure, nesting reports, since we could make the output of dak.concatenate
propagate all the reports from its inputs.
I came to this from #456, since @gordonwatts hit the same issue again and I was checking up on it.
I think this will become slightly simpler, but much of the logic will remain. The feature required by result = ak.concatenate((part1, part2, ..., partN))
is that touching result.pt.data
also touches part1.pt.data
, part2.pt.data
, ..., partN.pt.data
. I think we'll need some kind of hierarchy. We could encode that as a hierarchical report or with a parentage attribute in the report.
Fixes #456
It's clear from #456 that the complexity around
axis=0
concatenation makes it hard to reason about!The bug is caused by the loss of
report
objects associated with secondary arrays in the concatenate call; right now we take the first array (because all arrays are coerced to the same form) as the "meta".To fix this, we need to build a "parent" meta such that
parent.touch_data("key-1")
invokeschild1.touch_data("key-1-child")
,child2.touch_data("key-2-child")
, and so on.I'll try and clean this up, probably with an Awkward fix, in the near future.