European-XFEL / EXtra-data

Access saved EuXFEL data
https://extra-data.rtfd.io
BSD 3-Clause "New" or "Revised" License
7 stars 14 forks source link

Add optional index group filter to SourceData.one_key() #526

Closed philsmt closed 5 months ago

philsmt commented 5 months ago

During the offline data reduction trials, calls to SourceData._get_index_group_sample() ended up being a significant performance problem, as the current implementation triggers a call to the fairly expensive SourceData.keys().

Initially this was fixed by overwriting this method only within exdf-tools, this MR means to upstream these changes now.

Since the improved implementation is based on SourceData.one_key(), I decided to augment this method by an optional keyword argument index_group and entirely replace the usage of _get_index_group_sample().

Something I noticed in the implementation is that FileAccess.get_one_key() (and in extension SourceData.one_key()) may technically return None if a source has no keys. While the selection mechanism prevents no keys to be selected, it does not seem to prevent files with key-less sources. I would not expect the current DAQ or calibration to create such files, but it's not impossible for someone external to prepare them. For now I decided to not change any behaviour, but preserve the one from _get_index_group_sample() and raise an exception if asking for a non-existing index group.

philsmt commented 5 months ago

I separated the changes to FileAccess and SourceData in separate commits for easier review.

takluyver commented 5 months ago

Other than that one comment, LGTM.

I agree that it makes sense to throw an error for a non-existent index group. :+1:

philsmt commented 5 months ago

I added two additional minor changes for FileAccess:

Will merge later today unless something comes up.