legend-exp / legend-pydataobj

LEGEND Python Data Objects
https://legend-pydataobj.readthedocs.io
GNU General Public License v3.0

Regression: broadcasting error in `_h5_read_ndarray` #111

Open gipert opened 2 weeks ago

gipert commented 2 weeks ago

Apparent regression seen with v1.10 at NERSC, compared to v1.7:

Traceback (most recent call last):
  File "/global/cfs/cdirs/m2676/users/pertoldi/legend-prodenv/prod-blind/_dev/v3.0.0/workflow/rules/../scripts/pars_dsp_eopt.py", line 117, in <module>
    tb_data = sto.read(f"{args.channel}/raw", args.peak_file, idx=ids)[0]
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/global/cfs/cdirs/m2676/users/pertoldi/legend-prodenv/prod-blind/_dev/v3.0.0/software/python/install/lib/python3.12/site-packages/lgdo/lh5/store.py", line 231, in read
    return _serializers._h5_read_lgdo(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/global/cfs/cdirs/m2676/users/pertoldi/legend-prodenv/prod-blind/_dev/v3.0.0/software/python/install/lib/python3.12/site-packages/lgdo/lh5/_serializers/read/composite.py", line 117, in _h5_read_lgdo
    return _h5_read_table(
           ^^^^^^^^^^^^^^^
  File "/global/cfs/cdirs/m2676/users/pertoldi/legend-prodenv/prod-blind/_dev/v3.0.0/software/python/install/lib/python3.12/site-packages/lgdo/lh5/_serializers/read/composite.py", line 323, in _h5_read_table
    col_dict[field], n_rows_read = _h5_read_lgdo(
                                   ^^^^^^^^^^^^^^
  File "/global/cfs/cdirs/m2676/users/pertoldi/legend-prodenv/prod-blind/_dev/v3.0.0/software/python/install/lib/python3.12/site-packages/lgdo/lh5/_serializers/read/composite.py", line 117, in _h5_read_lgdo
    return _h5_read_table(
           ^^^^^^^^^^^^^^^
  File "/global/cfs/cdirs/m2676/users/pertoldi/legend-prodenv/prod-blind/_dev/v3.0.0/software/python/install/lib/python3.12/site-packages/lgdo/lh5/_serializers/read/composite.py", line 323, in _h5_read_table
    col_dict[field], n_rows_read = _h5_read_lgdo(
                                   ^^^^^^^^^^^^^^
  File "/global/cfs/cdirs/m2676/users/pertoldi/legend-prodenv/prod-blind/_dev/v3.0.0/software/python/install/lib/python3.12/site-packages/lgdo/lh5/_serializers/read/composite.py", line 214, in _h5_read_lgdo
    return _h5_read_array(
           ^^^^^^^^^^^^^^^
  File "/global/cfs/cdirs/m2676/users/pertoldi/legend-prodenv/prod-blind/_dev/v3.0.0/software/python/install/lib/python3.12/site-packages/lgdo/lh5/_serializers/read/array.py", line 26, in _h5_read_array
    return _h5_read_array_generic(Array, h5d, fname, oname, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/global/cfs/cdirs/m2676/users/pertoldi/legend-prodenv/prod-blind/_dev/v3.0.0/software/python/install/lib/python3.12/site-packages/lgdo/lh5/_serializers/read/array.py", line 13, in _h5_read_array_generic
    nda, attrs, n_rows_to_read = _h5_read_ndarray(h5d, fname, oname, **kwargs)
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/global/cfs/cdirs/m2676/users/pertoldi/legend-prodenv/prod-blind/_dev/v3.0.0/software/python/install/lib/python3.12/site-packages/lgdo/lh5/_serializers/read/ndarray.py", line 107, in _h5_read_ndarray
    nda[:, ...] = tmp[idx, ...]
    ~~~^^^^^^^^
ValueError: could not broadcast input array from shape (3085,) into shape (15650,)
[Wed Oct 23 07:35:37 2024]
Error in rule build_pars_dsp_eopt:
    jobid: 1566
    input: /global/cfs/cdirs/m2676/users/pertoldi/legend-prodenv/prod-blind/_dev/v3.0.0/generated/tmp/par/l200-p03-r000-cal-20230311T235840Z-ch1110403-par_dsp_peaks.lh5, /global/cfs/cdirs/m2676/users/pertoldi/legend-prodenv/prod-blind/_dev/v3.0.0/generated/tmp/par/l200-p03-r000-cal-20230311T235840Z-ch1110403-par_dsp_dplms.json, /global/cfs/cdirs/m2676/users/pertoldi/legend-prodenv/prod-blind/_dev/v3.0.0/generated/tmp/plt/l200-p03-r000-cal-20230311T235840Z-ch1110403-plt_dsp_dplms.pkl
    output: /global/cfs/cdirs/m2676/users/pertoldi/legend-prodenv/prod-blind/_dev/v3.0.0/generated/tmp/par/l200-p03-r000-cal-20230311T235840Z-ch1110403-par_dsp_eopt.json, /global/cfs/cdirs/m2676/users/pertoldi/legend-prodenv/prod-blind/_dev/v3.0.0/generated/tmp/par/l200-p03-r000-cal-20230311T235840Z-ch1110403-par_dsp_objects.pkl, /global/cfs/cdirs/m2676/users/pertoldi/legend-prodenv/prod-blind/_dev/v3.0.0/generated/tmp/plt/l200-p03-r000-cal-20230311T235840Z-ch1110403-plt_dsp.pkl
    log: /global/cfs/cdirs/m2676/users/pertoldi/legend-prodenv/prod-blind/_dev/v3.0.0/generated/tmp/log/pars_dsp_eopt/l200-p03-r000-cal-20230311T235840Z-ch1110403-pars_dsp_eopt.log (check log file(s) for error details)
    shell:
        shifter --env='PYTHONUSERBASE=/global/cfs/cdirs/m2676/users/pertoldi/legend-prodenv/prod-blind/_dev/v3.0.0/software/python/install' --env="HDF5_USE_FILE_LOCKING=FALSE" --env="LGDO_BOUNDSCHECK=false" --env="DSPEED_BOUNDSCHECK=false" --env="PYGAMA_PARALLEL=false" --env="PYGAMA_FASTMATH=false" --image legendexp/legend-base:latest python3 -B /global/cfs/cdirs/m2676/users/pertoldi/legend-prodenv/prod-blind/_dev/v3.0.0/workflow/../scripts/pars_dsp_eopt.py --log /global/cfs/cdirs/m2676/users/pertoldi/legend-prodenv/prod-blind/_dev/v3.0.0/generated/tmp/log/pars_dsp_eopt/l200-p03-r000-cal-20230311T235840Z-ch1110403-pars_dsp_eopt.log --configs /global/cfs/cdirs/m2676/users/pertoldi/legend-prodenv/prod-blind/_dev/v3.0.0/inputs/dataprod/config --datatype cal --timestamp 20230311T235840Z --channel ch1110403 --peak_file /global/cfs/cdirs/m2676/users/pertoldi/legend-prodenv/prod-blind/_dev/v3.0.0/generated/tmp/par/l200-p03-r000-cal-20230311T235840Z-ch1110403-par_dsp_peaks.lh5 --inplots /global/cfs/cdirs/m2676/users/pertoldi/legend-prodenv/prod-blind/_dev/v3.0.0/generated/tmp/plt/l200-p03-r000-cal-20230311T235840Z-ch1110403-plt_dsp_dplms.pkl --decay_const /global/cfs/cdirs/m2676/users/pertoldi/legend-prodenv/prod-blind/_dev/v3.0.0/generated/tmp/par/l200-p03-r000-cal-20230311T235840Z-ch1110403-par_dsp_dplms.json --plot_path /global/cfs/cdirs/m2676/users/pertoldi/legend-prodenv/prod-blind/_dev/v3.0.0/generated/tmp/plt/l200-p03-r000-cal-20230311T235840Z-ch1110403-plt_dsp.pkl --qbb_grid_path /global/cfs/cdirs/m2676/users/pertoldi/legend-prodenv/prod-blind/_dev/v3.0.0/generated/tmp/par/l200-p03-r000-cal-20230311T235840Z-ch1110403-par_dsp_objects.pkl --final_dsp_pars /global/cfs/cdirs/m2676/users/pertoldi/legend-prodenv/prod-blind/_dev/v3.0.0/generated/tmp/par/l200-p03-r000-cal-20230311T235840Z-ch1110403-par_dsp_eopt.json
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

I have no time to debug further, but downgrading to v1.7 seems to resolve the issue.

iguinn commented 1 week ago

Oh, I see the problem. `read` is being passed an index mask as a boolean array, which I don't think is supposed to be supported. It worked in the old version because the mask somehow passed all the checks before being applied, and h5py's high-level interface then handled it. With the low-level interface, this would have to be handled explicitly (the easier solution is probably just to convert the mask into an entry list).
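As a rough illustration of the point above, here is a minimal NumPy-only sketch of how a boolean mask can produce the shape mismatch in the traceback, and how converting the mask into an entry list avoids it. It assumes the output buffer is sized from `len(idx)`, which is a guess based on the shapes in the error message and not a confirmed detail of `_h5_read_ndarray`. The helper `mask_to_entry_list` is hypothetical and not part of lgdo; the array sizes are taken from the error message only for illustration.

```python
import numpy as np


def mask_to_entry_list(idx):
    """Hypothetical helper: turn a boolean mask into an array of row indices.

    Anything that is not a boolean array is returned unchanged.
    """
    idx = np.asarray(idx)
    if idx.dtype == np.bool_:
        # flatnonzero returns the positions of the True entries
        return np.flatnonzero(idx)
    return idx


# Stand-in for the on-disk dataset: 15650 rows, of which 3085 are selected.
n_rows = 15650
data = np.arange(n_rows)
mask = np.zeros(n_rows, dtype=bool)
mask[:3085] = True

# Assumed failure mode: the buffer is sized by len(idx), which for a
# boolean mask is the total number of rows, not the number selected.
nda = np.empty(len(mask), dtype=data.dtype)
try:
    nda[:, ...] = data[mask, ...]
    message = None
except ValueError as err:
    message = str(err)

# With an entry list, len(idx) matches the number of rows actually read.
entries = mask_to_entry_list(mask)
out = np.empty(len(entries), dtype=data.dtype)
out[:, ...] = data[entries, ...]
```

On the caller side, the same conversion (for example `np.flatnonzero(mask)`) could be applied before the indices are passed as `idx`, which would sidestep the question of whether boolean masks are supported at all.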

gipert commented 1 week ago

@ggmarshall should this be changed on the pargen side, or should we make pydataobj support it?

iguinn commented 6 days ago

This PR has been accepted, but I'll leave the conversation "open" until we confirm that this is a feature we want to keep around long term. Otherwise, we will have to fix this in the dataflow and then revert the change.