PennLINC / fw-heudiconv

Heuristic-based Data Curation on Flywheel
BSD 3-Clause "New" or "Revised" License

convert.py regex module error in identifying pattern for sessions #40

Closed by bbiney73 5 years ago

bbiney73 commented 5 years ago

We're Python newbies and we keep getting the following error message. Note that we updated to heudiconv development version 19 (in light of a previous issue). There is only one session label, which is a timestamp of when the DICOM file was scanned.

$ fw-heudiconv-curate --project TNI --heuristic /Users/imaginglab/Desktop/heuristic.py --subject TNI.SC.1.015 --session 20170615-151309 

/Users/imaginglab/miniconda3/envs/flywheel/lib/python3.7/site-packages/fw_heudiconv/query.py:4: UserWarning: The DICOM readers are highly experimental, unstable, and only work for Siemens time-series at the moment

Please use with caution.  We would be grateful for your help in improving them

  from nibabel.nicom.dicomwrappers import wrapper_from_data

INFO: Querying Flywheel server...

INFO: Loading heuristic file...

INFO: Applying heuristic to query results...

INFO: Applying changes to files...

WARNING: Trouble updating intentions for this session 20170615-151309

'NoneType' object has no attribute 'group'

Traceback (most recent call last):

  File "/Users/imaginglab/miniconda3/envs/flywheel/lib/python3.7/site-packages/fw_heudiconv/convert.py", line 231, in confirm_intentions

    ses_labs = [re.search(r"ses-[a-zA-z0-9]+(?=_)", x).group() for x in full_filenames if x is not None]

  File "/Users/imaginglab/miniconda3/envs/flywheel/lib/python3.7/site-packages/fw_heudiconv/convert.py", line 231, in <listcomp>

    ses_labs = [re.search(r"ses-[a-zA-z0-9]+(?=_)", x).group() for x in full_filenames if x is not None]

AttributeError: 'NoneType' object has no attribute 'group'

This error message is really hard to parse given that none of us fully understands regular expressions. We've been searching, and I've been looking at the code in convert.py, but...

My guess is that line 231 searches each string in full_filenames for "ses-" followed by any run of letters or numbers. Again, I don't have a full understanding of regular expressions.

I printed full_filenames to make sure that the pattern actually exists. Here are some of the contents of full_filenames:

['dwi/sub-TNI.SC.1.015_ses-20170615-151309_dwi.nii.gz', anat/sub-TNI.SC.1.015_ses-20170615-151309....]

It seems like the pattern "ses-(some range of digits)" exists in these strings, so I'm not certain why re.search() is returning None.
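The behaviour described above can be reproduced outside of fw-heudiconv. The following is a minimal sketch that uses the pattern exactly as it appears in the traceback; the helper name `session_label` and the second, hyphen-free filename are hypothetical and only serve as a comparison. The character class in the pattern does not contain `-`, and the lookahead `(?=_)` requires the matched run to be followed directly by an underscore, so a label such as `20170615-151309` can never satisfy both conditions.

```python
import re

# Pattern copied verbatim from the traceback (convert.py, line 231).
SES_PATTERN = r"ses-[a-zA-z0-9]+(?=_)"

# Filename taken from the printed full_filenames list.
hyphenated = "dwi/sub-TNI.SC.1.015_ses-20170615-151309_dwi.nii.gz"
# Hypothetical filename with the hyphen removed from the session label.
alphanumeric = "dwi/sub-TNI.SC.1.015_ses-20170615151309_dwi.nii.gz"


def session_label(filename):
    """Return the matched 'ses-...' text, or None when nothing matches."""
    match = re.search(SES_PATTERN, filename)
    return match.group() if match else None


print(session_label(hyphenated))    # None
print(session_label(alphanumeric))  # ses-20170615151309
```

Calling `.group()` directly on the first result is what raises `AttributeError: 'NoneType' object has no attribute 'group'`.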

Here is our heuristic.py code.

import os

def create_key(template, outtype=('nii.gz',), annotation_classes=None):
    if template is None or not template:
        raise ValueError('Template must be a valid format string')
    return template, outtype, annotation_classes

t1w = create_key('sub-{subject}/{session}/anat/sub-{subject}_{session}_T1w')
dwi_258dir = create_key('sub-{subject}/{session}/dwi/sub-{subject}_{session}_dwi')
rest = create_key('sub-{subject}/{session}/func/sub-{subject}_{session}_task-rest_bold')
def infotodict(seqinfo):
    """Heuristic evaluator for determining which runs belong where
        allowed template fields - follow python string module:
        item: index within category
        subject: participant id
        seqitem: run number during scanning
        subindex: sub index within group
    """

    last_run = len(seqinfo)

    info = {
        t1w: [],
        dwi_258dir: [],
        rest: [],
    }

    for s in seqinfo:
        # protocol is lowercased here, so every substring compared against it must be lowercase too
        protocol = s.protocol_name.lower()

        # Baseline anatomicals
        if "t1w_mpr" in protocol:
            info[t1w].append(s.series_id)

        # DWI scans
        elif "dsi_1.8mm_257dir_b5000_mb3" in protocol:
            info[dwi_258dir].append(s.series_id)

        # Resting task scans
        elif "restingbold_mb6_1200" in protocol:
            info[rest].append(s.series_id)

    return info

So we really don't have the domain knowledge to understand what the error message we previously listed means. And we're not sure what to modify or what to try as next steps.
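One possible next step is to find out which filenames the pattern rejects. This is a rough sketch, not part of fw-heudiconv: the function `unmatched_filenames` is a hypothetical helper, the pattern is copied from the traceback, and the second filename in the sample list is made up for contrast.

```python
import re

# Pattern copied verbatim from the traceback (convert.py, line 231).
SES_PATTERN = re.compile(r"ses-[a-zA-z0-9]+(?=_)")


def unmatched_filenames(full_filenames):
    """Return the filenames for which the session pattern finds nothing."""
    return [
        name for name in full_filenames
        if name is not None and SES_PATTERN.search(name) is None
    ]


full_filenames = [
    "dwi/sub-TNI.SC.1.015_ses-20170615-151309_dwi.nii.gz",  # from the issue
    None,                                                    # skipped, as in convert.py
    "anat/sub-01_ses-baseline_T1w.nii.gz",                   # hypothetical, matches
]

problem_files = unmatched_filenames(full_filenames)
print(problem_files)
```

Any filename printed here would trigger the `'NoneType' object has no attribute 'group'` error in the original list comprehension.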

bbiney73 commented 5 years ago

The issue was with how the sessions were labelled when the data was uploaded into Flywheel. The moral of the story is that a session must be denoted by "ses-{alphanumericstring}"; our timestamp label, 20170615-151309, contains a hyphen, so it did not match.
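For anyone hitting the same problem, here is a small sketch of how a session label could be checked or cleaned before upload. Both functions are hypothetical helpers, not part of fw-heudiconv or the Flywheel SDK, and they assume that "alphanumeric" means ASCII letters and digits only.

```python
import re


def is_alphanumeric_label(label):
    """True when the label consists only of ASCII letters and digits."""
    return re.fullmatch(r"[a-zA-Z0-9]+", label) is not None


def sanitize_session_label(label):
    """Drop every character that is not an ASCII letter or digit."""
    return re.sub(r"[^a-zA-Z0-9]", "", label)


label = "20170615-151309"
print(is_alphanumeric_label(label))    # False
print(sanitize_session_label(label))   # 20170615151309
```

Whether to strip the hyphen or relabel the session some other way is a judgment call; the only requirement taken from this thread is that the part after "ses-" be alphanumeric.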