yeatmanlab / pyAFQ

Automated Fiber Quantification ... in Python
http://yeatmanlab.github.io/pyAFQ/
BSD 2-Clause "Simplified" License

Better errors when BIDS format improperly specified #303

Closed 36000 closed 2 years ago

36000 commented 3 years ago

An index error is thrown here when BIDS is improperly specified such that no files are found: https://github.com/yeatmanlab/pyAFQ/blob/da585b2af6d2416d3b88789e4c921ac791c7a58a/AFQ/api.py#L323

Instead, we should catch that error, then call a function that progressively builds the restrictions in this bids_layout.get statement until we see which restriction makes it so no files are found.
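
A minimal sketch of that progressive check, assuming only that the layout object exposes a get(**filters) method returning a list. The names find_failing_filter and FakeLayout are hypothetical and exist only for illustration. FakeLayout is a stand-in so the example runs without a real dataset, not the actual layout class.

```python
def find_failing_filter(layout, filters, **options):
    """Return the name of the first filter that leaves no matching files.

    Filters are applied cumulatively, in the order given. Returns None if
    files are still found after every filter has been applied.
    `options` (e.g. return_type) are passed unchanged to every call.
    """
    applied = {}
    for name, value in filters.items():
        applied[name] = value
        if not layout.get(**applied, **options):
            return name
    return None


class FakeLayout:
    """Hypothetical stand-in for a layout object, for demonstration only."""

    def __init__(self, files):
        self.files = files  # one dict of entity name -> value per file

    def get(self, return_type="object", **filters):
        # Keep the files whose entities match every requested filter.
        return [f for f in self.files
                if all(f.get(key) == value for key, value in filters.items())]


layout = FakeLayout([
    {"subject": "01", "session": "01", "extension": "nii.gz", "suffix": "T1w"},
])
filters = {"subject": "01", "session": "01",
           "extension": "nii.gz", "suffix": "dwi"}
print(find_failing_filter(layout, filters, return_type="filename"))  # suffix
```

Because the filters are applied in order, the reported name depends on the order in which they are listed. Putting the broadest filters (subject, session) first tends to give the most useful message.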

arokem commented 3 years ago

Could you please say a bit more about this? Do you have an example of what that looks like?

36000 commented 3 years ago

For example, if I am missing a dwi file or have a missing folder, the expression:

bids_layout.get(subject=subject, session=session,
                extension='nii.gz', suffix='dwi',
                return_type='filename',
                scope=dmriprep)

will return [] without raising an error, because nothing matched those filters. The empty list then causes this error when we try to index or use it:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
~/miniconda3/envs/afq/lib/python3.7/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2645             try:
-> 2646                 return self._engine.get_loc(key)
   2647             except KeyError:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'gtab'

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
~/miniconda3/envs/afq/lib/python3.7/site-packages/pandas/core/internals/managers.py in set(self, item, value)
   1070         try:
-> 1071             loc = self.items.get_loc(item)
   1072         except KeyError:

~/miniconda3/envs/afq/lib/python3.7/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2647             except KeyError:
-> 2648                 return self._engine.get_loc(self._maybe_cast_indexer(key))
   2649         indexer = self.get_indexer([key], method=method, tolerance=tolerance)

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'gtab'

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
<ipython-input-27-2874795aa489> in <module>
----> 1 afq_hcp_retest(attach_keys(['169343'])[0])

<ipython-input-25-93a7983c30d8> in afq_hcp_retest(args)
     35         seg_suffix='aparc+aseg',
     36         #tracking_params = {"odf_model": "DKI"},
---> 37         virtual_frame_buffer=True)
     38         #scalars=["dti_fa", "dti_md", "dki_fa", "dki_md"])
     39 

~/pyAFQ/AFQ/api.py in __init__(self, bids_path, dmriprep, segmentation, seg_suffix, b0_threshold, bundle_names, dask_it, force_recompute, reg_template, scalars, wm_criterion, use_prealign, virtual_frame_buffer, viz_library, tracking_params, segmentation_params, clean_params)
    381             self.data_frame = ddf.from_pandas(self.data_frame,
    382                                               npartitions=len(sub_list))
--> 383         self.set_gtab(b0_threshold)
    384         self.set_dwi_affine()
    385         self.set_dwi_img()

~/pyAFQ/AFQ/api.py in set_gtab(self, b0_threshold)
   1248             lambda x: dpg.gradient_table(x['bval_file'], x['bvec_file'],
   1249                                          b0_threshold=b0_threshold),
-> 1250             axis=1)
   1251 
   1252     def get_gtab(self):

~/miniconda3/envs/afq/lib/python3.7/site-packages/pandas/core/frame.py in __setitem__(self, key, value)
   2936         else:
   2937             # set column
-> 2938             self._set_item(key, value)
   2939 
   2940     def _setitem_slice(self, key, value):

~/miniconda3/envs/afq/lib/python3.7/site-packages/pandas/core/frame.py in _set_item(self, key, value)
   2999         self._ensure_valid_index(value)
   3000         value = self._sanitize_column(key, value)
-> 3001         NDFrame._set_item(self, key, value)
   3002 
   3003         # check if we are modifying a copy

~/miniconda3/envs/afq/lib/python3.7/site-packages/pandas/core/generic.py in _set_item(self, key, value)
   3622 
   3623     def _set_item(self, key, value) -> None:
-> 3624         self._data.set(key, value)
   3625         self._clear_item_cache()
   3626 

~/miniconda3/envs/afq/lib/python3.7/site-packages/pandas/core/internals/managers.py in set(self, item, value)
   1072         except KeyError:
   1073             # This item wasn't present, just insert at end
-> 1074             self.insert(len(self.items), item, value)
   1075             return
   1076 

~/miniconda3/envs/afq/lib/python3.7/site-packages/pandas/core/internals/managers.py in insert(self, loc, item, value, allow_duplicates)
   1179         new_axis = self.items.insert(loc, item)
   1180 
-> 1181         block = make_block(values=value, ndim=self.ndim, placement=slice(loc, loc + 1))
   1182 
   1183         for blkno, count in _fast_count_smallints(self._blknos[loc:]):

~/miniconda3/envs/afq/lib/python3.7/site-packages/pandas/core/internals/blocks.py in make_block(values, placement, klass, ndim, dtype)
   3045         values = DatetimeArray._simple_new(values, dtype=dtype)
   3046 
-> 3047     return klass(values, ndim=ndim, placement=placement)
   3048 
   3049 

~/miniconda3/envs/afq/lib/python3.7/site-packages/pandas/core/internals/blocks.py in __init__(self, values, placement, ndim)
    123         if self._validate_ndim and self.ndim and len(self.mgr_locs) != len(self.values):
    124             raise ValueError(
--> 125                 f"Wrong number of items passed {len(self.values)}, "
    126                 f"placement implies {len(self.mgr_locs)}"
    127             )

ValueError: Wrong number of items passed 6, placement implies 1

We should try to give more information on which filter in bids_layout.get() caused the list to be empty. I wonder if bids_layout.get() has a mode that reports this info, but what I normally do on the command line to diagnose the problem is run this series of expressions until I get []:

bids_layout.get(subject=subject)
bids_layout.get(subject=subject, session=session)
bids_layout.get(subject=subject, session=session,
                extension='nii.gz')
# etc....

36000 commented 3 years ago

Another issue: if I think I am providing a seg file but none is found, the error only appears when I call _wm_mask and the criterion is an array instead of a threshold:

wm_mask = dti_fa > self.wm_criterion
ValueError: operands could not be broadcast together with shapes (145,174,145) (10,)

We should throw an error earlier if you have provided a wm_criterion that expects a seg file but none was found. We could also print to INFO which files we find.
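
One way such an early check might look, as a sketch only. The function name check_wm_criterion and the seg_files argument are hypothetical, and the sketch assumes that a numeric wm_criterion is a threshold while anything else (such as a list of labels) needs a segmentation file.

```python
import logging
from numbers import Real

logger = logging.getLogger(__name__)


def check_wm_criterion(wm_criterion, seg_files):
    """Fail early if wm_criterion needs a segmentation file but none exists.

    Assumption: a single number is a threshold on FA, and any other value
    (for example a list of labels) requires a segmentation file.
    """
    # Report which segmentation files were found, at INFO level.
    logger.info("Segmentation files found: %s", seg_files)

    needs_seg = not isinstance(wm_criterion, Real)
    if needs_seg and not seg_files:
        raise FileNotFoundError(
            "wm_criterion is not a threshold, so a segmentation file is "
            "required, but no segmentation file was found")
```

Running this check right after the files are located would surface the problem before any mask is computed, rather than as a broadcasting error later on.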

36000 commented 3 years ago

For the initial issue, it might be nice to have a wrapper around the bids_layout.get function that simply checks whether it returns an empty list and, if it does, checks which filter is responsible and prints that information (i.e., "No files found with suffix dwi"), optionally throwing a file-not-found error. I can write a PR which does this.
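
A rough sketch of what such a wrapper could look like, not the eventual PR. The name get_or_report and the raise_on_empty flag are hypothetical, and the only assumption about the layout is that it has a get(**filters) method returning a list.

```python
import logging

logger = logging.getLogger(__name__)


def get_or_report(layout, raise_on_empty=False, **filters):
    """Call layout.get(**filters) and explain an empty result.

    If nothing is found, the filters are re-applied one at a time, in the
    order given, to identify the first one that leaves no files.
    """
    found = layout.get(**filters)
    if found:
        return found

    # Nothing matched: find the first filter that empties the result.
    message = "No files found"
    applied = {}
    for name, value in filters.items():
        applied[name] = value
        if not layout.get(**applied):
            message = f"No files found with {name} {value}"
            break

    if raise_on_empty:
        raise FileNotFoundError(message)
    logger.warning(message)
    return found
```

With raise_on_empty left at False, the wrapper only logs a warning and returns the empty list, so existing callers keep their current behavior. Setting it to True turns the silent empty result into an immediate, descriptive error.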