GenevieveBuckley closed 3 years ago
Since this is a relatively uncontroversial summary of previous discussions, I'm going to go ahead and merge this.
cc @jpivarski (in case this is of interest to you and/or your community)
Thanks for pointing me to it! This seems to be a "technical raggedness," though: the functions that generate the data return outputs of different lengths, but those outputs are to be logically viewed as one concatenated array. (Looks like a good solution to that, too.)
The HEP community, and presumably others, often have to deal with data whose meaning is ragged: the data collection includes 3 of these, 5 of those, 4 of something else, etc., and they should not be viewed as a concatenated collection because that would lose information about what it is one wants to model. So that's a different topic, and hopefully we'll be talking more about Dask-based solutions for that in the nearish future.
Thanks again for the heads-up!
I thought I'd write up some of the key takeaways from discussions we've been having about overlapping array chunks producing ragged outputs.
There didn't appear to be one obviously preferred way to do this, so I'm hoping this post will make the preferred path a little bit clearer for people doing similar work. (Originally I had some other stuff mixed into this post, but I think that muddied the message too much.)
Related discussions: