Closed Aske-Rosted closed 1 year ago
What is the error message/log you receive?
Running the read dataset example but using my own parquet dataset.
graphnet: INFO 2022-10-28 16:40:37 - get_logger - Writing log to logs/graphnet_20221028-164037.log
graphnet: INFO 2022-10-28 16:40:37 - main - Available columns in SRTInIcePulses
graphnet: INFO 2022-10-28 16:40:37 - main - . charge
graphnet: INFO 2022-10-28 16:40:37 - main - . flags
graphnet: INFO 2022-10-28 16:40:37 - main - . time
graphnet: INFO 2022-10-28 16:40:37 - main - . width
graphnet: INFO 2022-10-28 16:40:37 - main - . area
graphnet: INFO 2022-10-28 16:40:37 - main - . directionazimuth
graphnet: INFO 2022-10-28 16:40:37 - main - . directionphi
graphnet: INFO 2022-10-28 16:40:37 - main - . directiontheta
graphnet: INFO 2022-10-28 16:40:37 - main - . directionx
graphnet: INFO 2022-10-28 16:40:37 - main - . directiony
graphnet: INFO 2022-10-28 16:40:37 - main - . directionz
graphnet: INFO 2022-10-28 16:40:37 - main - . directionzenith
graphnet: INFO 2022-10-28 16:40:37 - main - . positionmag2
graphnet: INFO 2022-10-28 16:40:37 - main - . positionmagnitude
graphnet: INFO 2022-10-28 16:40:37 - main - . positionphi
graphnet: INFO 2022-10-28 16:40:37 - main - . positionr
graphnet: INFO 2022-10-28 16:40:37 - main - . positionrho
graphnet: INFO 2022-10-28 16:40:37 - main - . positiontheta
graphnet: INFO 2022-10-28 16:40:37 - main - . positionx
graphnet: INFO 2022-10-28 16:40:37 - main - . positiony
graphnet: INFO 2022-10-28 16:40:37 - main - . positionz
graphnet: INFO 2022-10-28 16:40:37 - main - . position_list
graphnet: INFO 2022-10-28 16:40:37 - main - . atwd_beacon_baseline__parent
graphnet: INFO 2022-10-28 16:40:37 - main - . atwd_bin_calib_slopeparent
graphnet: INFO 2022-10-28 16:40:37 - main - . atwd_delta_tparent
graphnet: INFO 2022-10-28 16:40:37 - main - . atwd_freq_fitparent
graphnet: INFO 2022-10-28 16:40:37 - main - . atwd_gainparent
graphnet: INFO 2022-10-28 16:40:37 - main - . combined_spe_charge_distributioncompensation_factor
graphnet: INFO 2022-10-28 16:40:37 - main - . combined_spe_charge_distributionexp1_amp
graphnet: INFO 2022-10-28 16:40:37 - main - . combined_spe_charge_distributionexp1_width
graphnet: INFO 2022-10-28 16:40:37 - main - . combined_spe_charge_distributionexp2_amp
graphnet: INFO 2022-10-28 16:40:37 - main - . combined_spe_charge_distributionexp2_width
graphnet: INFO 2022-10-28 16:40:37 - main - . combined_spe_charge_distributiongaus_amp
graphnet: INFO 2022-10-28 16:40:37 - main - . combined_spe_charge_distributiongaus_mean
graphnet: INFO 2022-10-28 16:40:37 - main - . combined_spe_charge_distributiongaus_width
graphnet: INFO 2022-10-28 16:40:37 - main - . combined_spe_charge_distributionis_valid
graphnet: INFO 2022-10-28 16:40:37 - main - . combined_spe_charge_distributionslc_gaus_mean
graphnet: INFO 2022-10-28 16:40:37 - main - . dom_cal_version
graphnet: INFO 2022-10-28 16:40:37 - main - . dom_noise_decay_rate
graphnet: INFO 2022-10-28 16:40:37 - main - . dom_noise_rate
graphnet: INFO 2022-10-28 16:40:37 - main - . dom_noise_scintillation_hits
graphnet: INFO 2022-10-28 16:40:37 - main - . dom_noise_scintillation_mean
graphnet: INFO 2022-10-28 16:40:37 - main - . dom_noise_scintillation_sigma
graphnet: INFO 2022-10-28 16:40:37 - main - . dom_noise_thermal_rate
graphnet: INFO 2022-10-28 16:40:37 - main - . fadc_baseline_fitintercept
graphnet: INFO 2022-10-28 16:40:37 - main - . fadc_baseline_fitslope
graphnet: INFO 2022-10-28 16:40:37 - main - . fadc_beacon_baseline
graphnet: INFO 2022-10-28 16:40:37 - main - . fadc_delta_t
graphnet: INFO 2022-10-28 16:40:37 - main - . fadc_gain
graphnet: INFO 2022-10-28 16:40:37 - main - . front_end_impedance
graphnet: INFO 2022-10-28 16:40:37 - main - . hv_gain_fitintercept
graphnet: INFO 2022-10-28 16:40:37 - main - . hv_gain_fitslope
graphnet: INFO 2022-10-28 16:40:37 - main - . is_mean_atwd_charge_valid
graphnet: INFO 2022-10-28 16:40:37 - main - . is_mean_fadc_charge_valid
graphnet: INFO 2022-10-28 16:40:37 - main - . mean_atwd_charge
graphnet: INFO 2022-10-28 16:40:37 - main - . mean_fadc_charge
graphnet: INFO 2022-10-28 16:40:37 - main - . mpe_disc_calibintercept
graphnet: INFO 2022-10-28 16:40:37 - main - . mpe_disc_calibslope
graphnet: INFO 2022-10-28 16:40:37 - main - . pmt_disc_calibintercept
graphnet: INFO 2022-10-28 16:40:37 - main - . pmt_disc_calibslope
graphnet: INFO 2022-10-28 16:40:37 - main - . relative_dom_eff
graphnet: INFO 2022-10-28 16:40:37 - main - . spe_disc_calibintercept
graphnet: INFO 2022-10-28 16:40:37 - main - . spe_disc_calibslope
graphnet: INFO 2022-10-28 16:40:37 - main - . tau_parametersp0
graphnet: INFO 2022-10-28 16:40:37 - main - . tau_parametersp1
graphnet: INFO 2022-10-28 16:40:37 - main - . tau_parameters__p2
graphnet: INFO 2022-10-28 16:40:37 - main - . tau_parametersp3
graphnet: INFO 2022-10-28 16:40:37 - main - . tau_parametersp4
graphnet: INFO 2022-10-28 16:40:37 - main - . tau_parameters__p5
graphnet: INFO 2022-10-28 16:40:37 - main - . tau_parameterstau_frac
graphnet: INFO 2022-10-28 16:40:37 - main - . temperature
graphnet: INFO 2022-10-28 16:40:37 - main - . transit_timeintercept
graphnet: INFO 2022-10-28 16:40:37 - main - . transittimeslope
graphnet: INFO 2022-10-28 16:40:37 - main - . indexom
graphnet: INFO 2022-10-28 16:40:37 - main - . indexpmt
graphnet: INFO 2022-10-28 16:40:37 - main - . indexstring
graphnet: INFO 2022-10-28 16:40:37 - main - . indexlist
graphnet: INFO 2022-10-28 16:40:37 - main - . event_no
graphnet: INFO 2022-10-28 16:40:37 - main - Available columns in truth
graphnet: INFO 2022-10-28 16:40:37 - main - . energy
graphnet: INFO 2022-10-28 16:40:37 - main - . position_x
graphnet: INFO 2022-10-28 16:40:37 - main - . position_y
graphnet: INFO 2022-10-28 16:40:37 - main - . position_z
graphnet: INFO 2022-10-28 16:40:37 - main - . azimuth
graphnet: INFO 2022-10-28 16:40:37 - main - . zenith
graphnet: INFO 2022-10-28 16:40:37 - main - . pid
graphnet: INFO 2022-10-28 16:40:37 - main - . event_time
graphnet: INFO 2022-10-28 16:40:37 - main - . sim_type
graphnet: INFO 2022-10-28 16:40:37 - main - . interaction_type
graphnet: INFO 2022-10-28 16:40:37 - main - . elasticity
graphnet: INFO 2022-10-28 16:40:37 - main - . RunID
graphnet: INFO 2022-10-28 16:40:37 - main - . SubrunID
graphnet: INFO 2022-10-28 16:40:37 - main - . EventID
graphnet: INFO 2022-10-28 16:40:37 - main - . SubEventID
graphnet: INFO 2022-10-28 16:40:37 - main - . dbang_decay_length
graphnet: INFO 2022-10-28 16:40:37 - main - . track_length
graphnet: INFO 2022-10-28 16:40:37 - main - . stopped_muon
graphnet: INFO 2022-10-28 16:40:37 - main - . energy_track
graphnet: INFO 2022-10-28 16:40:37 - main - . inelasticity
graphnet: INFO 2022-10-28 16:40:37 - main - . DeepCoreFilter_13
graphnet: INFO 2022-10-28 16:40:37 - main - . CascadeFilter_13
graphnet: INFO 2022-10-28 16:40:37 - main - . MuonFilter_13
graphnet: INFO 2022-10-28 16:40:37 - main - . OnlineL2Filter_17
graphnet: INFO 2022-10-28 16:40:37 - main - . L3_oscNext_bool
graphnet: INFO 2022-10-28 16:40:37 - main - . L4_oscNext_bool
graphnet: INFO 2022-10-28 16:40:37 - main - . L5_oscNext_bool
graphnet: INFO 2022-10-28 16:40:37 - main - . L6_oscNext_bool
graphnet: INFO 2022-10-28 16:40:37 - main - . L7_oscNext_bool
graphnet: INFO 2022-10-28 16:40:37 - main - . event_no
Traceback (most recent call last):
File "/disk20/users/aske/graphnet/personal_scripts/read_dataset.py", line 110, in
Alright, I see your point. It looks like the Parquet data reading is going about this backwards. We should probably do something like https://github.com/graphnet-team/graphnet/blob/main/src/graphnet/data/sqlite/sqlite_dataset.py#L55 instead.
I did try something similar, but that causes issues down the line because the index then equals the actual indexing (which I think is the event number), leading to the following error.
graphnet: INFO 2022-10-28 16:50:34 - get_logger - Writing log to logs/graphnet_20221028-165034.log
graphnet: INFO 2022-10-28 16:50:35 - main - Available columns in SRTInIcePulses
graphnet: INFO 2022-10-28 16:50:35 - main - . charge
graphnet: INFO 2022-10-28 16:50:35 - main - . flags
graphnet: INFO 2022-10-28 16:50:35 - main - . time
graphnet: INFO 2022-10-28 16:50:35 - main - . width
graphnet: INFO 2022-10-28 16:50:35 - main - . area
graphnet: INFO 2022-10-28 16:50:35 - main - . directionazimuth
graphnet: INFO 2022-10-28 16:50:35 - main - . directionphi
graphnet: INFO 2022-10-28 16:50:35 - main - . directiontheta
graphnet: INFO 2022-10-28 16:50:35 - main - . directionx
graphnet: INFO 2022-10-28 16:50:35 - main - . directiony
graphnet: INFO 2022-10-28 16:50:35 - main - . directionz
graphnet: INFO 2022-10-28 16:50:35 - main - . directionzenith
graphnet: INFO 2022-10-28 16:50:35 - main - . positionmag2
graphnet: INFO 2022-10-28 16:50:35 - main - . positionmagnitude
graphnet: INFO 2022-10-28 16:50:35 - main - . positionphi
graphnet: INFO 2022-10-28 16:50:35 - main - . positionr
graphnet: INFO 2022-10-28 16:50:35 - main - . positionrho
graphnet: INFO 2022-10-28 16:50:35 - main - . positiontheta
graphnet: INFO 2022-10-28 16:50:35 - main - . positionx
graphnet: INFO 2022-10-28 16:50:35 - main - . positiony
graphnet: INFO 2022-10-28 16:50:35 - main - . positionz
graphnet: INFO 2022-10-28 16:50:35 - main - . position_list
graphnet: INFO 2022-10-28 16:50:35 - main - . atwd_beacon_baseline__parent
graphnet: INFO 2022-10-28 16:50:35 - main - . atwd_bin_calib_slopeparent
graphnet: INFO 2022-10-28 16:50:35 - main - . atwd_delta_tparent
graphnet: INFO 2022-10-28 16:50:35 - main - . atwd_freq_fitparent
graphnet: INFO 2022-10-28 16:50:35 - main - . atwd_gainparent
graphnet: INFO 2022-10-28 16:50:35 - main - . combined_spe_charge_distributioncompensation_factor
graphnet: INFO 2022-10-28 16:50:35 - main - . combined_spe_charge_distributionexp1_amp
graphnet: INFO 2022-10-28 16:50:35 - main - . combined_spe_charge_distributionexp1_width
graphnet: INFO 2022-10-28 16:50:35 - main - . combined_spe_charge_distributionexp2_amp
graphnet: INFO 2022-10-28 16:50:35 - main - . combined_spe_charge_distributionexp2_width
graphnet: INFO 2022-10-28 16:50:35 - main - . combined_spe_charge_distributiongaus_amp
graphnet: INFO 2022-10-28 16:50:35 - main - . combined_spe_charge_distributiongaus_mean
graphnet: INFO 2022-10-28 16:50:35 - main - . combined_spe_charge_distributiongaus_width
graphnet: INFO 2022-10-28 16:50:35 - main - . combined_spe_charge_distributionis_valid
graphnet: INFO 2022-10-28 16:50:35 - main - . combined_spe_charge_distributionslc_gaus_mean
graphnet: INFO 2022-10-28 16:50:35 - main - . dom_cal_version
graphnet: INFO 2022-10-28 16:50:35 - main - . dom_noise_decay_rate
graphnet: INFO 2022-10-28 16:50:35 - main - . dom_noise_rate
graphnet: INFO 2022-10-28 16:50:35 - main - . dom_noise_scintillation_hits
graphnet: INFO 2022-10-28 16:50:35 - main - . dom_noise_scintillation_mean
graphnet: INFO 2022-10-28 16:50:35 - main - . dom_noise_scintillation_sigma
graphnet: INFO 2022-10-28 16:50:35 - main - . dom_noise_thermal_rate
graphnet: INFO 2022-10-28 16:50:35 - main - . fadc_baseline_fitintercept
graphnet: INFO 2022-10-28 16:50:35 - main - . fadc_baseline_fitslope
graphnet: INFO 2022-10-28 16:50:35 - main - . fadc_beacon_baseline
graphnet: INFO 2022-10-28 16:50:35 - main - . fadc_delta_t
graphnet: INFO 2022-10-28 16:50:35 - main - . fadc_gain
graphnet: INFO 2022-10-28 16:50:35 - main - . front_end_impedance
graphnet: INFO 2022-10-28 16:50:35 - main - . hv_gain_fitintercept
graphnet: INFO 2022-10-28 16:50:35 - main - . hv_gain_fitslope
graphnet: INFO 2022-10-28 16:50:35 - main - . is_mean_atwd_charge_valid
graphnet: INFO 2022-10-28 16:50:35 - main - . is_mean_fadc_charge_valid
graphnet: INFO 2022-10-28 16:50:35 - main - . mean_atwd_charge
graphnet: INFO 2022-10-28 16:50:35 - main - . mean_fadc_charge
graphnet: INFO 2022-10-28 16:50:35 - main - . mpe_disc_calibintercept
graphnet: INFO 2022-10-28 16:50:35 - main - . mpe_disc_calibslope
graphnet: INFO 2022-10-28 16:50:35 - main - . pmt_disc_calibintercept
graphnet: INFO 2022-10-28 16:50:35 - main - . pmt_disc_calibslope
graphnet: INFO 2022-10-28 16:50:35 - main - . relative_dom_eff
graphnet: INFO 2022-10-28 16:50:35 - main - . spe_disc_calibintercept
graphnet: INFO 2022-10-28 16:50:35 - main - . spe_disc_calibslope
graphnet: INFO 2022-10-28 16:50:35 - main - . tau_parametersp0
graphnet: INFO 2022-10-28 16:50:35 - main - . tau_parametersp1
graphnet: INFO 2022-10-28 16:50:35 - main - . tau_parameters__p2
graphnet: INFO 2022-10-28 16:50:35 - main - . tau_parametersp3
graphnet: INFO 2022-10-28 16:50:35 - main - . tau_parametersp4
graphnet: INFO 2022-10-28 16:50:35 - main - . tau_parameters__p5
graphnet: INFO 2022-10-28 16:50:35 - main - . tau_parameterstau_frac
graphnet: INFO 2022-10-28 16:50:35 - main - . temperature
graphnet: INFO 2022-10-28 16:50:35 - main - . transit_timeintercept
graphnet: INFO 2022-10-28 16:50:35 - main - . transittimeslope
graphnet: INFO 2022-10-28 16:50:35 - main - . indexom
graphnet: INFO 2022-10-28 16:50:35 - main - . indexpmt
graphnet: INFO 2022-10-28 16:50:35 - main - . indexstring
graphnet: INFO 2022-10-28 16:50:35 - main - . indexlist
graphnet: INFO 2022-10-28 16:50:35 - main - . event_no
graphnet: INFO 2022-10-28 16:50:35 - main - Available columns in truth
graphnet: INFO 2022-10-28 16:50:35 - main - . energy
graphnet: INFO 2022-10-28 16:50:35 - main - . position_x
graphnet: INFO 2022-10-28 16:50:35 - main - . position_y
graphnet: INFO 2022-10-28 16:50:35 - main - . position_z
graphnet: INFO 2022-10-28 16:50:35 - main - . azimuth
graphnet: INFO 2022-10-28 16:50:35 - main - . zenith
graphnet: INFO 2022-10-28 16:50:35 - main - . pid
graphnet: INFO 2022-10-28 16:50:35 - main - . event_time
graphnet: INFO 2022-10-28 16:50:35 - main - . sim_type
graphnet: INFO 2022-10-28 16:50:35 - main - . interaction_type
graphnet: INFO 2022-10-28 16:50:35 - main - . elasticity
graphnet: INFO 2022-10-28 16:50:35 - main - . RunID
graphnet: INFO 2022-10-28 16:50:35 - main - . SubrunID
graphnet: INFO 2022-10-28 16:50:35 - main - . EventID
graphnet: INFO 2022-10-28 16:50:35 - main - . SubEventID
graphnet: INFO 2022-10-28 16:50:35 - main - . dbang_decay_length
graphnet: INFO 2022-10-28 16:50:35 - main - . track_length
graphnet: INFO 2022-10-28 16:50:35 - main - . stopped_muon
graphnet: INFO 2022-10-28 16:50:35 - main - . energy_track
graphnet: INFO 2022-10-28 16:50:35 - main - . inelasticity
graphnet: INFO 2022-10-28 16:50:35 - main - . DeepCoreFilter_13
graphnet: INFO 2022-10-28 16:50:35 - main - . CascadeFilter_13
graphnet: INFO 2022-10-28 16:50:35 - main - . MuonFilter_13
graphnet: INFO 2022-10-28 16:50:35 - main - . OnlineL2Filter_17
graphnet: INFO 2022-10-28 16:50:35 - main - . L3_oscNext_bool
graphnet: INFO 2022-10-28 16:50:35 - main - . L4_oscNext_bool
graphnet: INFO 2022-10-28 16:50:35 - main - . L5_oscNext_bool
graphnet: INFO 2022-10-28 16:50:35 - main - . L6_oscNext_bool
graphnet: INFO 2022-10-28 16:50:35 - main - . L7_oscNext_bool
graphnet: INFO 2022-10-28 16:50:35 - main - . event_no
graphnet: WARNING 2022-10-28 16:50:36 - ParquetDataset._remove_missing_columns - Removing the following (missing) truth variables: interaction_time
0%| | 0/1 [00:06<?, ? batches/s]
Traceback (most recent call last):
File "/cvmfs/icecube.opensciencegrid.org/py3-v4.1.0/RHEL_7_x86_64/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"main", mod_spec)
File "/cvmfs/icecube.opensciencegrid.org/py3-v4.1.0/RHEL_7_x86_64/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/misc/home/aske/.vscode-server/extensions/ms-python.python-2022.16.1/pythonFiles/lib/python/debugpy/adapter/../../debugpy/launcher/../../debugpy/main.py", line 39, in
(https://github.com/scikit-hep/awkward-1.0/blob/1.10.1/src/libawkward/array/RecordArray.cpp#L792)
From what I can see, we are asking for a number that should lie in [0, n_events].
Yes, if no selection is applied. So we probably need self._indices to hold values in [0, n_events[ that may not be "dense," rather than a list of event_nos.
I checked that I was able to reproduce your error by setting selection to something non-sequential, like selection=[1,2,4,8,...], and the PR in #332 removes the resulting error.
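The distinction discussed here, between positions in [0, n_events[ and event numbers, can be sketched in plain Python. This is a minimal illustration and not graphnet's implementation: `ToyDataset`, `row_position`, and `event_no_at` are hypothetical names, and the sketch assumes that `selection` holds row positions rather than event numbers.

```python
from typing import List, Optional


class ToyDataset:
    """Hypothetical stand-in for a file-backed dataset (not graphnet code)."""

    def __init__(
        self, event_nos: List[int], selection: Optional[List[int]] = None
    ) -> None:
        self._event_nos = event_nos  # event number stored in each row
        n_events = len(event_nos)
        if selection is None:
            # No selection: every row is used, so the positions are dense.
            self._indices = list(range(n_events))
        else:
            # Assumption: a selection lists row positions, not event numbers.
            bad = [i for i in selection if not 0 <= i < n_events]
            if bad:
                raise IndexError(f"Positions outside [0, {n_events}[: {bad}")
            self._indices = list(selection)

    def __len__(self) -> int:
        return len(self._indices)

    def row_position(self, index: int) -> int:
        """Map a dataset index in [0, len(self)[ to a row position."""
        return self._indices[index]

    def event_no_at(self, index: int) -> int:
        """Return the event number of the row a dataset index points to."""
        return self._event_nos[self.row_position(index)]


# Ten stored events whose event numbers are deliberately not 0..9.
stored_event_nos = [100 + 7 * i for i in range(10)]
dataset = ToyDataset(stored_event_nos, selection=[1, 2, 4, 8])

positions = [dataset.row_position(i) for i in range(len(dataset))]
selected_event_nos = [dataset.event_no_at(i) for i in range(len(dataset))]
```

With a non-sequential selection the dataset index still runs densely over [0, len(dataset)[, while the row positions it maps to are sparse; event numbers only appear as stored values and are never used for indexing.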
The bug happens when trying to read or write a parquet file. It seems that parquet_dataset._query_table expects an event number, which it then turns into a sequential index by looking it up in the index. However, when I run it, the index variable already seems to be a sequential index (not an event number). The change below seems to work for me, but I do not know about possible knock-on effects; I imagine this function is called in several places.
def _query_table(
    self,
    table: str,
    columns: Union[List[str], str],
    index: int,
    selection: Optional[str] = None,
) -> List[Tuple[Any]]:
    # Check(s)
Expected behavior: sequential_index = self._indices[index] was expected to take an index (possibly an event number) and turn it into a sequential index number.
Actual behavior: it receives what I deem to be a sequential index number and returns the error "$index number$ not in index".
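One way to picture this mismatch: a lookup by value and a lookup by position expect different kinds of input, and passing a sequential index to a by-value lookup fails whenever that number is not itself a stored event number. The sketch below uses plain Python lists only; `event_nos`, `position_of_event`, and `position_of_index` are hypothetical names, and the example does not reproduce graphnet's code or its exact error message.

```python
from typing import List

# Hypothetical event numbers, one per stored row.
event_nos: List[int] = [1007, 1012, 1030, 1041]


def position_of_event(event_no: int) -> int:
    """By-value lookup: event number -> row position."""
    return event_nos.index(event_no)  # raises ValueError if absent


def position_of_index(indices: List[int], index: int) -> int:
    """By-position lookup: dataset index -> row position."""
    return indices[index]


# Without a selection, the dataset indices are simply 0..n_events-1.
indices = list(range(len(event_nos)))

ok_by_value = position_of_event(1030)  # a real event number
ok_by_position = position_of_index(indices, 2)  # a sequential index

# Passing a sequential index where an event number is expected fails,
# because 2 is not one of the stored event numbers.
try:
    position_of_event(2)
    error_message = ""
except ValueError as err:
    error_message = str(err)
```

Both successful calls resolve to the same row, but only because each receives the kind of number it expects; the failing call shows why the caller and `_query_table` need to agree on whether `index` is an event number or a sequential position.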