bids-apps / HCPPipelines

A BIDS App for minimal preprocessing using the HCP Pipelines
33 stars 30 forks source link

BIDS-Model Event files to FSL EV's #36

Open kastman opened 6 years ago

kastman commented 6 years ago

Hi @chrisfilo and @effigies ,

I'm starting to work on some new tasks for HCP and thinking about how to get the fMRIVolume and fMRISurface HCP stages processed using the BIDS-apps container.

The main sticking-point that I see is converting the BIDS event files (long-form run/duration/onset) to FSL EV files (vectors per condition) for HCP. I played around with fitlins and the BIDS model extension a few months ago, but didn't get very far before being pulled away to other things.

Do you think the model spec is mature enough to add as an extension to the run.py adapter you have in this container, e.g. including a bids-model json file, converting .tsv -> ev, and using those ev's in the HCP fMRI stages? Or does this sound like more work than it's worth and something that other people wouldn't necessarily use, and I should just focus on shoe-horning in precomputed ev's to a standard HCPPIPE (non-bids) container?

I suspect it would be straight-forward if fitlins has code I could piggy-back on to easily create the EVs (as opposed to estimating the whole thing), but not trivial if it doesn't. I poked around and didn't see anything, but maybe there's another package that you know of that would do this? Or maybe, that would actually be the more useful (to the community)?

Anyway, I'm probably going to take a crack at this and just wanted to open the issue early. Any suggestions along the way would be appreciated. Thanks!

chrisgorgo commented 6 years ago

Definitely doable, but not on top of my priorities. Happy to review a PR if you end up implementing this.

effigies commented 6 years ago

So FitLins is largely wrapping work that's currently in PyBIDS.

If you're comfortable reading nipype interfaces, here is the interface that loads a BIDS model and creates structures that can be used outside.

https://github.com/poldracklab/fitlins/blob/ee05910bddc59449a0db1ae6f4605be026d8077c/fitlins/interfaces/bids.py#L134-L361

For now, it would probably make the most sense to treat that as a walk-through of PyBIDS's Analysis object, and do any conversion to FSL's EV format by hand.

If you're interested in trying to adapt nipype.interfaces.fsl.Level1Design to accept the outputs of that interface to produce EV/FSF files, that would provide extremely useful feedback on my design, and we could work to resolve any impedance mismatch. (A long-term goal of mine is to have the BIDS model interfaces produce specifications that can be easily used with Nipype's FSL and SPM interfaces, as well as the nistats interfaces I'm currently working on.)

kastman commented 6 years ago

Thanks both; I'll take a shot at it. Looks like fitlins.interfaces.LoadBidsModel is pretty close to what I'd need, and all that I'd really need to build is the bridge interface from that to nipype.algorithms.modelgen.SpecifyModel which can go to either SPM or FSL.

If I make progress or have questions I'll ping again. Thanks,

kastman commented 6 years ago

Also, do you have any tips on developing the run.py inside the container? Are you rebuilding the container for every change of run.py? Or is there an easy way to mount the host's run.py instead of copying it and rebuilding?

effigies commented 6 years ago

So LoadBIDSModel is meant specifically to replace SpecifyModel, because all of the inputs to SpecifyModel should be defined by the BIDS model JSON file and the event files. However, it's not a 1-1 mapping on the old form, so I wasn't able to get the outputs the same as SpecifyModel, which is why we'll need to adapt Level1Design.

And assuming you're using Docker for your container, you can just do something like:

docker run --rm -it -v $PWD/run.py:/path/in/container/run.py image/name:tag <args>
kastman commented 6 years ago

I've gotten fmriVolume and fmriSurface processing to work successfully, and am now about to start the real modeling.

Before going too far, I just wanted to check that adding both nipype and fitlins as dependencies within the bids/hcppipelines container would be alright. Seems silly to basically re-create the work of Level1Design inside this run.py wrapper, but I'm hesitant to make the image any larger. Would it be ok to add them?

effigies commented 6 years ago

I think that's reasonable. I would pin the specific fitlins, pybids and grabbit versions you're currently using, though, as there are going to be some fairly major changes coming up, as the BIDS draft specs start to solidify.

Nipype should be more stable, but probably not a bad idea to pin in your container, anyway.

kastman commented 6 years ago

Hi @effigies -- I've had a chance to look at this and am looking for input on a few different design options for converting LoadBIDSModel session_info to the fomrat that nipype expects, e.g. in fsl.Level1Design:

  1. Add nipype_design outputs to LoadBIDSModel in fitlins. In this case, I would add new outputs directly to the interface in the fitlins, so both fitlins and nipype style session_info's are exposed within the same interface. This seems intuitive and would probably be the least amount of work, but is a bit confusing (e.g. as you know, fitlins uses the term session_info for its own format which is totally different than the session_info nipype interfaces are expecting, so there would be explanation / commenting that would be required (explicitly instruct users: "If using nipype interfaces, uses nipype_session_info and don't use session_info). Pluses to this would be that it any further work for features or bug fixes you do in fitlins would be carried along as well.

The other options involve work within nipype and not fitlins:

  1. Add an XOR fitlins_session_info input to Level1Design that takes LoadBIDSModel-style session_info instead of modelgen.SpecifyModel session_info, or add contingent checks to the existing session_info input that can take either type of session info. Advantages are that future work in LoadBIDSModel will be used; disadvantages are that we're now adding code specific to fitlin's format to nipype itself and that the interface could potentially get confused and busy by doing different things.

  2. Add an intermediate interface (to either fitlins or nipype) in between LoadBIDSModel and Level1Design to translate the two session_info styles.

  3. Make a totally independent nipype interface to load BIDS model info in the standard nipype-style session_info. Advantage are that anyone can take advantage of BIDSModel code without requiring fitlins, but this would now fork similar code doing the same thing in two related projects.

Hope all of this makes sense, but I'm having a tough time deciding which of those options would be the cleanest and easiest. I haven't looked at nistats but being interoperable with that would also be a consideration as well. Any suggestions / leaning toward one option vs. another? Thanks,

effigies commented 6 years ago

The output of LoadBIDSModel can be adjusted. The main constraint is what information I have available.

I have some documentation I've been writing, but haven't yet finished and committed:

Outputs
-------
session_info : list of dictionaries
   At the first level, a dictionary per-run containing the following keys:
       'events'    : HDF5 file containing sparse representation of events
                  (onset, duration, amplitude)
       'confounds' : HDF5 file containing dense representation of confound
                     regressors
       'repetition_time'   : float (in seconds)

contrast_info : list of lists of files
    A list of files for each level of analysis (length depends on model)
    An HDF5 file per analysis unit (e.g. run, subject, etc.) containing a pandas matrix
    with a row for each contrast output.
    The columns correspond to input files, and the entries are weights.
    A final, additional column is added indicating whether the row is a T or F
    contrast.

contrast_indices : list of lists of lists of dictionaries
    A list of lists of dictionaries for each level of analysis (length depends on model)
    A list of dictionaries per analysis unit (e.g. run, subject, etc.)
    Each dictionary contains a set of entities describing

I'll need to look again at how far this is from the SpecifyModel outputs/Level1Design inputs. My long-term intent is to update Level1Design to take the outputs of LoadBIDSModel, but to the extent that we can adjust LoadBIDSModel outputs, I'm equally happy to do that.

The main thing that would be hard to change is the lengths and depth of the lists/lists of lists. For that, Select() interfaces will be useful.

effigies commented 6 years ago

Sorry, just to be clear, I'm posting this documentation in case you haven't dug through the outputs in detail yet. I don't think I'll be able to today, but tomorrow I'll have a look at getting closer back to the SpecifyModel outputs, and see where we absolutely have to diverge. But I wanted to share the information in case you want to look at it sooner.

kastman commented 6 years ago

Thanks for posting that doc; I've glanced at the outputs but hadn't even put together something as detailed as this.

SpecifyModel does everything natively using lists and dicts for conditions, and files for confounds, so there's no sparse df for the events and a list of filenames pointing to dense movement regressors etc., so there's definitely quite a divergence. I like using the hdf5 df's, but I don't think switching to the sparse list of dicts would lose anything in the hdf's right now. However, I haven't thought through all the possible models and whether they can all be represented using the nipype format.

One question on LoadBIDSModel.load_level1 was that each contrast seems to be built per paradigm (condition / event type); not sure if that was intended. Additionally, I had to do some weird handling with the layout.get_space method, which magically appears when a preproc_dir is provided but I can't find it in the pybids master at all, which is klugey and probably has a better solution.

No hurry for today; I wouldn't necessarily have you re-write the LoadBIDSModel outputs. Sounds like maybe exposing a special version for the nipype native interfaces may be the best. I'll look out for any thoughts if you get a chance tomorrow. Thanks!

kastman commented 6 years ago

Last thought - I like the idea of adjusting Level1Design to take the fitlins outputs, but that also means adjusting every package's equivalent of Level1Design (SPM, AFNI...). Maybe be simpler to adjust the fitlins outputs if it's possible, but I think that will depend a lot on edge cases.

effigies commented 6 years ago

Okay, so I've been digging into fsl.Level1Design, and here's what I have for session_info:

[
    {
        'scans': File,  # Image file
        'hpf': Float,   # High-pass filter cutoff (in seconds)
        'cond': [
            {
                'name': Str,
                'onset': Float,
                'amplitudes': Either(Float, List(Float)),
                'duration': Either(Float, List(Float)),
            },  
            ...
            ]
        'regress': [
            {
                'name': Str,
                'val': List(Float)  # May be broader...
            },  
            ...
            ]
    },      
    ...
]

We pass image files separately from session info, which shouldn't be a huge refactor. I don't think pybids currently implements reading HPF cutoffs, but I'd consider that somewhat peripheral (also, I don't think it needs to be defined on a per-run basis). Conditions and regressors correspond to my events and confounds data frames, and should be trivial adaptations, at that.

So if I can get a dictionary-based representation of conditions and regressors that can be easily converted to/from Pandas DataFrames, I think a refactor would be a pretty easy lift.

I still need to look at contrasts. We should be able to similarly adapt the Pandas object for T contrasts. I think this is going to need to wait on 0.7 to be done correctly for F contrasts.

kastman commented 6 years ago

That’s about what I’ve got as well - just unsure about the best way to add this in. Can you think of any reason to store the events and confounds as frames instead of lists besides ease of indexing? Agreed that the conversion should be pretty trivial — I’m happy to add a to_nipype_list to those frames if you think that you really want to switch over the fitlins outputs.

effigies commented 6 years ago

Ah, sorry, I missed your response. No, the only good reason to keep things as DataFrames is that that's what nistats expects. So being able to convert them losslessly (in a numpy.allclose sense) back from whatever intermediate format nipype uses is my only criterion. If we can use a Pandas method, all the better, but if not, no big deal.