cctbx / cctbx_project

Computational Crystallography Toolbox
https://cci.lbl.gov/docs/cctbx

Xfel match directories #888

Closed Baharis closed 1 year ago

Baharis commented 1 year ago

By default, cctbx.xfel.merge requires that input expts and refls reside in the same directory. Whether the input is specified using directory paths or file globs, the current file_lister matches expts and refls as follows:

for refl_path in input_paths:
    if refl_path.endswith(reflection_suffix):
        # Swap the reflection suffix for the experiment suffix; the directory stays the same
        expt_path = refl_path[:-len(reflection_suffix)] + experiment_suffix
        yield expt_path, refl_path

In my particular case, I need cctbx.xfel.merge to read input expts and refls from two different directories. To this end, I added a phil parameter, input.match_directories, and implemented a second file_lister matching method. It does not require the files to be in the same directory, but it does require the filenames to be unique across all input directories:

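import os

expt_paths, refl_paths = {}, {}
for path in input_paths:
    name = os.path.basename(path)
    # Key each file by its name with the suffix stripped, ignoring its directory
    if name.endswith(experiment_suffix):
        expt_paths[name[:-len(experiment_suffix)]] = path
    elif name.endswith(reflection_suffix):
        refl_paths[name[:-len(reflection_suffix)]] = path
# Yield only the pairs whose names match
for stem in sorted(expt_paths.keys() & refl_paths.keys()):
    yield expt_paths[stem], refl_paths[stem]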
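
The uniqueness requirement can be made explicit with a check. Below is a minimal, self-contained sketch of name-based matching, not the actual file_lister code. The helper name `match_by_name` and the example suffixes are hypothetical and chosen only for illustration.

```python
import os


def match_by_name(input_paths, experiment_suffix, reflection_suffix):
    """Pair expt and refl paths by file name, ignoring their directories.

    Hypothetical helper for illustration; not part of cctbx.
    """
    expts, refls = {}, {}
    for path in input_paths:
        name = os.path.basename(path)
        for suffix, table in ((experiment_suffix, expts),
                              (reflection_suffix, refls)):
            if name.endswith(suffix):
                stem = name[:-len(suffix)]
                # The same file name in two directories makes the match ambiguous
                if stem in table:
                    raise ValueError("Duplicate file name: %s" % name)
                table[stem] = path
                break
    # Keep only stems that have both an expt and a refl
    return [(expts[s], refls[s]) for s in sorted(expts.keys() & refls.keys())]
```

With inputs such as `dir_a/run1_integrated.expt` and `dir_b/run1_integrated.refl`, the two files are paired even though they live in different directories, while an expt without a matching refl is silently skipped.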

I have already implemented this feature here for my own use, but I would like to ask if this functionality is potentially useful for anyone other than me. Or should we refrain from merging this change to master since the case is too specific?

Baharis commented 1 year ago

@irisdyoung, @dwpaley, @nksauter, @phyy-nx, @vganapati A small PR suggestion following up on today's discussion.

Baharis commented 1 year ago

As suggested by @phyy-nx, whether to read from the same or different directories should be decided automatically rather than via a dedicated phil parameter. I am closing this PR and will create a new branch and PR to propose that functionality.
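
One way such automatic selection could work is to prefer the same-directory partner and fall back to name matching only when no such partner is among the inputs. The sketch below is an assumption about a possible approach, not the code of the follow-up PR; the function name `pair_inputs` is hypothetical.

```python
import os


def pair_inputs(input_paths, experiment_suffix, reflection_suffix):
    """Pair each refl with an expt: same directory first, then by file name.

    Hypothetical sketch for illustration; not part of cctbx.
    """
    paths = set(input_paths)
    # Index expts by bare file name for the cross-directory fallback
    expt_by_name = {}
    for p in paths:
        if p.endswith(experiment_suffix):
            expt_by_name.setdefault(os.path.basename(p), []).append(p)
    pairs = []
    for refl in sorted(p for p in paths if p.endswith(reflection_suffix)):
        sibling = refl[:-len(reflection_suffix)] + experiment_suffix
        if sibling in paths:
            # Default behaviour: expt sits next to the refl
            pairs.append((sibling, refl))
            continue
        candidates = expt_by_name.get(os.path.basename(sibling), [])
        if len(candidates) > 1:
            raise ValueError("Ambiguous expt for %s: %s" % (refl, candidates))
        if candidates:
            pairs.append((candidates[0], refl))
    return pairs
```

Under this scheme, inputs that already follow the same-directory layout behave exactly as before, and the uniqueness requirement only applies to files that need the fallback.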