iris-hep / func_adl_uproot

Uproot-based backend for FuncADL
MIT License

SelectMany fails to flatten simple 2D array #43

Closed gordonwatts closed 3 years ago

gordonwatts commented 3 years ago

I've been seeing this with the production version of uproot in ServiceX:

data = ServiceXSourceUpROOT(files['ggH125_ZZ4lep']['files'], files['ggH125_ZZ4lep']['treename'], backend='open_uproot') \
    .SelectMany("lambda e: {'JetPT': e['lep_pt']}") \
    .AsAwkwardArray() \
    .value()

And then I get back:

Error transforming file: root://eospublic.cern.ch//eos/opendata/atlas/OutreachDatasets/2020-01-22/4lep/MC/mc_345060.ggH125_ZZ4lep.4lep.root
  -> error: Failed to transform input file root://eospublic.cern.ch//eos/opendata/atlas/OutreachDatasets/2020-01-22/4lep/MC/mc_345060.ggH125_ZZ4lep.4lep.root: arrays of records cannot be flattened (but their contents can be; try a different 'axis')
  -> 
  -> (https://github.com/scikit-hep/awkward-1.0/blob/1.0.2/src/libawkward/array/RecordArray.cpp#L1043)

masonproffitt commented 3 years ago

I think this may be a bug in awkward. I'm going to make an issue there and see what Jim says.

masonproffitt commented 3 years ago

Two workarounds that I know of:

  1. Drop the dict; i.e., turn {'JetPT': e['lep_pt']} into just e['lep_pt']
  2. Zip the dict: Zip({'JetPT': e['lep_pt']})
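To illustrate why both workarounds flatten cleanly, here is a minimal pure-Python sketch. It uses nested lists as a stand-in for jagged arrays; the lep_pt values are made up, and this is not how func_adl_uproot or awkward implement flattening internally.

```python
from itertools import chain

# Hypothetical per-event lepton pT values (jagged: one inner list per event).
lep_pt = [[51.9, 41.2], [], [33.7, 28.1, 16.4]]

# Workaround 1: select the bare branch, e['lep_pt'].
# Flattening a list of lists of numbers just removes one level of nesting.
flat_values = list(chain.from_iterable(lep_pt))

# Workaround 2: Zip({'JetPT': e['lep_pt']}).
# Zipping turns each event into a list of records, so the records sit
# *inside* the jagged dimension and can be flattened like any other item.
zipped = [[{"JetPT": pt} for pt in event] for event in lep_pt]
flat_records = list(chain.from_iterable(zipped))

print(flat_values)   # one entry per lepton
print(flat_records)  # one record per lepton
```

In both cases the thing being flattened is a list of lists. The failing query instead asks to flatten a record whose field is jagged, which is the situation the error message complains about.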

masonproffitt commented 3 years ago

> I think this may be a bug in awkward. I'm going to make an issue there and see what Jim says.

I take this back; I understand what the issue is now. Your example is an odd edge case because it doesn't generalize to dicts with more than one item. That is, .SelectMany(lambda e: {'a': e.a, 'b': e.b}) couldn't work in general because a and b could have different shapes (and so flatten to different lengths), and an ak.Array can't handle fields with different outer (axis=0) lengths. The only way to support that case would be to allow the output to be something other than an ak.Array in certain cases, which would be a mess for anything downstream, like ServiceX.

But it is possible to support single-item dicts in SelectMany, so I've added this in #45.
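The length mismatch described above can be sketched in plain Python. Nested lists again stand in for jagged arrays, the branch names a and b and their values are made up, and can_flatten_as_records is a hypothetical helper for illustration only, not part of func_adl_uproot.

```python
from itertools import chain

def flattened_lengths(fields):
    """Return the length of each field after removing one level of nesting."""
    return {name: len(list(chain.from_iterable(col))) for name, col in fields.items()}

def can_flatten_as_records(fields):
    """Hypothetical check: one record array needs one common outer length."""
    return len(set(flattened_lengths(fields).values())) <= 1

# Two events; 'a' and 'b' have different per-event multiplicities.
multi = {
    "a": [[1.0, 2.0], [3.0]],          # 3 values after flattening
    "b": [[10.0], [20.0, 30.0, 40.0]],  # 4 values after flattening
}
single = {"a": multi["a"]}

print(flattened_lengths(multi))        # the two fields end up with different lengths
print(can_flatten_as_records(multi))   # False: no single axis=0 length fits both
print(can_flatten_as_records(single))  # True: a one-item dict always has one length
```

A single-item dict can never produce mismatched lengths, which is why that case alone can be supported.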