Open hfxcarl opened 1 year ago
I think the bug can be addressed by changing the requirement for pybids from "pybids < 0.16.1" to "pybids < 0.16.1,> 0.13.2" (basically skipping the problematic version).
Perhaps it would be better to have pybids only index the specific subjects/data that is called with "--participant-label"?
Is this something leveraged in other BIDS Apps? If so, it seems like a good thing to add.
I'm not aware of other apps that index a single participant. It adds a big delay to processing for large datasets.
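In Python, you can do it by exclusion: you have to exclude all the other subjects with the ignore kwarg.
import re
from bids import BIDSLayout
# Matches (and therefore ignores) every path that is not under sub-mysubject
ignore_pattern = re.compile(r'^(?!.*\/sub-mysubject($|\/))')
layout = BIDSLayout(bids_root, ignore=[ignore_pattern])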
That's probably doable within the BIDSLayout initialization step (or even building a regex that covers multiple participants when requested), though I think we should wait to implement it until we have the nipreps-style config object implemented.
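The multi-participant regex mentioned above could be sketched as follows. This is only an illustration: `build_ignore_pattern` is a hypothetical helper, the participant labels and paths are made up, and it assumes the pattern is applied to each path with `re.search`, as in the single-subject snippet earlier in the thread.

```python
import re


def build_ignore_pattern(participant_labels):
    """Return a regex matching every path NOT inside a requested sub-<label> folder.

    Hypothetical helper for illustration; not part of pybids or qsiprep.
    """
    # Escape the labels so any character with regex meaning is taken literally.
    wanted = "|".join(re.escape(label) for label in participant_labels)
    # Negative lookahead: the pattern matches (so the path would be ignored)
    # unless the path contains /sub-<label> followed by "/" or end of string.
    return re.compile(r"^(?!.*/sub-(?:" + wanted + r")(?:$|/))")


pattern = build_ignore_pattern(["01", "02"])

# Made-up example paths, written relative to the dataset root.
paths = [
    "/sub-01",
    "/sub-01/dwi/sub-01_dwi.nii.gz",
    "/sub-02/anat",
    "/sub-011/dwi",
    "/sub-03",
]
ignored = [p for p in paths if pattern.search(p)]
kept = [p for p in paths if not pattern.search(p)]
```

Requiring "/" or end of string after the label keeps `sub-011` from slipping through when only `01` is requested. Note that a pattern like this also matches root-level files, so whether that is acceptable would need checking against how the layout is built.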
Just going to paste in a regex Copilot wrote that should be good for explicitly ignoring any folders except for a list of requested subjects:
re.compile(r'sub-(?!' + '|'.join(excluded_folders) + r')\w+')
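To see what this pattern does in practice, here is a small check using only the standard `re` module. It assumes the pattern is applied with `re.search`, and that the list joined into the lookahead holds the subjects to keep (despite the name `excluded_folders`). The labels, paths, and the `strict` variant are illustrative assumptions, not something from qsiprep or pybids.

```python
import re

requested = ["01", "02"]  # hypothetical participant labels to keep

# The pattern as posted, with the joined list holding the subjects to keep.
posted = re.compile(r"sub-(?!" + "|".join(requested) + r")\w+")

# Variant: a label only counts as requested when it is not followed by
# another alphanumeric character, so "01" does not also cover "011".
strict = re.compile(
    r"sub-(?!(?:" + "|".join(map(re.escape, requested)) + r")(?![A-Za-z0-9]))\w+"
)

# Made-up example paths.
paths = [
    "sub-01/dwi/sub-01_dwi.nii.gz",
    "sub-011/dwi",
    "sub-03/anat",
]
posted_ignored = [p for p in paths if posted.search(p)]
strict_ignored = [p for p in paths if strict.search(p)]
```

With the posted pattern, `sub-011` is not ignored when `01` is requested, because the lookahead only checks the start of the label. The stricter variant ignores it.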
Pybids crashes immediately when called in qsiprep-0.16.x. Even opening a shell in qsiprep-0.16.1 using Singularity on a Linux workstation and calling pybids there causes an immediate crash.
$ singularity shell /opt/SingularityImgs/qsiprep-0.16.1.sif
I see the pybids version included is 0.13.2 (2021-08-20), whereas current pybids is at 0.15.5 (2022-11-08). Perhaps it's time to update pybids.
I first noticed the error when trying to run qsiprep with "--bids-database-dir" pointing to a pre-indexed bids-raw layout_index.sqlite (generated using pybids_v0.15.1 from mriqc-22.0.1.sif) as input to qsiprep-0.16.1. I prefer the control of running individual subjects with fixed resources on my Linux 64-core workstation, and staggering individual runs in parallel to maximize resources while also running other container pipelines (fmriprep, mriqc, etc). The reason I want to run with the "--bids-database-dir" option is that otherwise running one subject from a bids-raw folder with >100 subjects triggers pybids to index the entire bids-raw folder each time, even though only one specific subject is being processed. Indexing a bids-raw input folder with ~200 datasets takes 30-40 mins, so each single-subject run spends 30-40 mins unnecessarily re-indexing the entire bids-raw folder! Obviously this is a second issue, related to Issue #217. Perhaps it would be better to have pybids only index the specific subjects/data that is called with "--participant-label"?
Please let me know how I can help solve either issue. Thanks, Carl