bids-apps / MRtrix3_connectome

Generate subject connectomes from raw BIDS data & perform inter-subject connection density normalisation, using the MRtrix3 software package.
http://www.mrtrix.org/
Apache License 2.0

Runtime Error #15

Closed ddshin closed 7 years ago

ddshin commented 7 years ago

I converted a single band DTI set and a pair of TOPUP scans into BIDS data structure using dcm2nii (passes BIDS validation - see screenshot below).

[Screenshot: BIDS validator output showing the dataset passes validation]

The same set can be processed successfully by the ndmg DTI app.

During the docker run, I am running into runtime errors (see below).

[Screenshot: runtime error output from the docker run]

Any idea how this can be avoided? Happy to share the data for testing/debugging if needed. Thanks.

Lestropie commented 7 years ago

Hi David,

Thanks for the report.

One aspect of writing this BIDS app that has proven quite difficult is performing EPI distortion correction without the possibility of user intervention. This requires determining the phase-encoding contrast within the DWI acquisition, and setting up the FSL eddy / topup calls, based on the contents of the BIDS dataset alone, accounting for the wide range of possible acquisition schemes.

In this case, one of the "patches" I've had to put in the script is the manual addition of a "diffusion gradient table" (corresponding to b=0) to the images in the fmap/ directory as they are imported; unfortunately I neglected to take into account the possibility that there may be multiple volumes in each of those images. So I'm confident that I can see the problem from the log you've provided alone, and it shouldn't be too difficult for me to fix.
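The mismatch Rob describes can be sketched in a few lines: a diffusion gradient table needs one row per image volume, so attaching a single b=0 row to an fmap/ image breaks as soon as that image contains more than one volume. (A hypothetical illustration of the idea, not the actual script code.)

```python
def bzero_gradient_table(num_volumes: int) -> str:
    """Build a b=0 diffusion gradient table with one row per image volume.

    Each row is "x y z b"; for fieldmap b=0 volumes, direction and b-value
    are all zero. The row count must match the number of volumes in the
    image, otherwise MRtrix3 reports:
    "Number of lines in gradient table (N) does not match input image".
    """
    return "".join("0 0 0 0\n" for _ in range(num_volumes))
```

With a three-volume fieldmap image, the table needs three rows, not the single row the original patch produced.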

I have a bunch of other changes that I was about to merge & tag, so I'll fix this and push it out hopefully soon.

Cheers Rob

ddshin commented 7 years ago

Thanks, Rob!

Would the release also address the second error, i.e. "Number of lines in gradient table (1) does not match input image; check your data"?

I get this error both with and without the field map data.

Very much looking forward to the newest release!

Lestropie commented 7 years ago

Based on looking at the code alone, the "Number of lines in gradient table (1) does not match input image; check your data" error should only occur in the case where images are provided in the fmap/ directory that contain more than one volume.

If you attempt to run the script with only one image in the dwi/ directory, and no images in the fmap/ directory, then you should get a different error: "Inadequate data for pre-processing of subject 'sub-01': No phase-encoding contrast in input DWIs or fmap/ directory". Because the script relies on the use of ACT, and ACT requires correction of EPI inhomogeneity distortions, the script will not proceed if it cannot find sufficient information with which to perform this correction.

By the way: If you simply modify the images in your fmap/ directory so that each only contains one volume, the script will hopefully proceed as normal.
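For anyone wanting to check or apply this workaround by hand: the volume count lives in the dim[] field of the NIfTI-1 header. Below is a stdlib-only sketch, assuming an uncompressed, little-endian .nii file (a .nii.gz would need decompressing first, e.g. with gzip); the helper names are hypothetical.

```python
import struct

def count_volumes(path):
    """Read dim[] from a NIfTI-1 header; dim[4] is the volume count."""
    with open(path, "rb") as f:
        header = f.read(348)  # NIfTI-1 header is 348 bytes
    dim = struct.unpack_from("<8h", header, 40)  # dim[0] = number of dims
    return dim[4] if dim[0] >= 4 else 1

def keep_first_volume(src, dst):
    """Copy a 4D NIfTI-1 file, keeping only the first volume."""
    with open(src, "rb") as f:
        data = bytearray(f.read())
    dim = list(struct.unpack_from("<8h", data, 40))
    bitpix = struct.unpack_from("<h", data, 72)[0]         # bits per voxel
    vox_offset = int(struct.unpack_from("<f", data, 108)[0])
    vol_bytes = dim[1] * dim[2] * dim[3] * bitpix // 8     # one volume
    dim[0], dim[4] = 3, 1                                  # now a 3D image
    struct.pack_into("<8h", data, 40, *dim)
    with open(dst, "wb") as f:
        f.write(data[: vox_offset + vol_bytes])
```

In practice a dedicated tool is safer; MRtrix3's mrconvert, for instance, can extract a single volume along the fourth axis (something like `mrconvert in.nii -coord 3 0 out.nii`).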

ddshin commented 7 years ago

Thanks for the suggestion, Rob. I created a new BIDS data set after keeping only the first repetition of volumes from both sets of fieldmaps (TOPUP forward and reverse). After validating the set (see below), I passed it to the app.

[Screenshot: BIDS validator output for the modified dataset]

As you guessed, I am no longer getting the errors I cited above. However, I am running into another runtime error (see below).

[Screenshot: subsequent runtime error output]

FYI, I ran the app on my Mac using 8GB of RAM. I reran with 12GB of RAM but got the same error.

Lestropie commented 7 years ago

@ddshin: There's a new tag (0.2.0) which should fix the issue with multiple-volume images in the fmap/ directory. Still waiting for it to be pulled to Docker Hub, but the changes are there in the repo if that's where you're getting the code from.

Your current error appears to be something to do with executing within the container environment; it's not anything specific to this particular script, over and above the fact that dwipreproc takes some time to run and is not generating any output, and therefore may be triggering some form of time-out. Unfortunately I'm no more help than Google is on this topic. The only thing I found was a suggestion to increase the CPU and RAM hardware allocated to the container.

ddshin commented 7 years ago

@Lestropie - Thanks. I pulled the tagged version (0.2.0) and deployed the container on a Google Cloud virtual machine instance with 8 CPUs, 30 GB of memory and 50 GB of persistent disk space. The process went much farther than before, but it eventually returned another error (see below).

[Screenshots: error output, 2017-09-08]

How can we avoid this error? Also, is it normal that the output folder is empty at this point in the process?

Thank you.

Lestropie commented 7 years ago

OK, so there's a bit going on in there:

My detection of "low-resolution" T1s is a little too strict (trying to detect instances where users have erroneously resampled their T1s to match their DWIs, which introduces other issues), so I'll back that off. This requires a change within MRtrix3, but I'll need to send an update to MRtrix3_connectome in order to redirect it to the updated code.

As to the crash itself, it's a slightly unusual one. The initial warning message, "mrcalc: [WARNING] header transformations of input images do not match", is a very unusual one to encounter at this point, since all of the input images in that particular command are essentially derived from the same source. So my suspicion is that this is a red herring, caused by the same underlying issue as the actual subsequent error: "mrcalc: [SYSTEM FATAL CODE: SIGBUS (7)] Bus error: Accessing invalid address (out of storage space?)".

This suggests that the destination drive - in this case the container root directory - does not have enough storage space for all of the temporary images that are generated during the course of the execution of this script (and the container script as a whole). While I could probably be more aggressive with my deletion of temporary files within the 5ttgen fsl script specifically, in order to free storage space as the script runs, the MRtrix3_connectome script is already designed to delete any generated files as soon as they are no longer required. Therefore there's a good chance that making 5ttgen fsl use slightly less room will only result in exactly the same error occurring again further along in the script.

What I would suggest trying here is the following:

docker run -it --rm -v /home/David/data/170808_bidsdti_onerep:/bids_dataset -v /home/David/data/outputs:/outputs --entrypoint=/bin/bash bids/mrtrix3_connectome:0.2.0

Then from within the container:

./run.py /bids_dataset /outputs participant --participant_label 01 --parcellation fs_2005

By running it in this way, you will still be logged in to the container instance when the script fails, and therefore you will be able to poke and prod around; e.g. to check disk usage or the like.
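Following up on the suggestion to poke around after a failure: free space can be checked from the shell (e.g. df -h), or with a small Python helper run inside the container. This is a hypothetical sketch; the /outputs mount point is taken from the docker command above, and any path that doesn't exist is simply skipped.

```python
import shutil

def report_free_space(paths=("/", "/outputs")):
    """Print total/free space for each path, to spot which filesystem
    (e.g. the container root holding the scratch files) is filling up."""
    for p in paths:
        try:
            usage = shutil.disk_usage(p)
        except FileNotFoundError:
            continue  # mount point absent in this environment
        print(f"{p}: {usage.free / 2**30:.1f} GiB free "
              f"of {usage.total / 2**30:.1f} GiB")
```

Running this periodically while dwipreproc or 5ttgen is working would show whether the root filesystem is the one being exhausted, which is exactly the failure mode suspected above.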

Lestropie commented 7 years ago

Also, is it normal that the output folder is empty at this point in the process?

Yes, output files are only copied / converted to the output directory if processing for the subject is completed.

ddshin commented 7 years ago

The crash was in fact caused by a lack of disk space in the container root directory. I had created a large space on the mounted disk (where the input data set was, and where the output folder should go) but not on the root directory that the Docker image was pulled to. Having addressed the space issue, I was able to let the process go through to the end, or almost to the end (see below). Unfortunately, I don't see the output folder in the specified path.

[Screenshot: terminal output, 2017-09-11]

ddshin commented 7 years ago

Hi Rob - never mind. It looks like I screwed up with the docker run command, i.e. I didn't specify the output path correctly. I am rerunning the process again now. I will let this run overnight and let you know once the process is completed. I think I am getting close. Thanks for troubleshooting with me thus far.

Lestropie commented 7 years ago

No, I suspect you're going to hit the same issue with the if not app.args.cleanup line. That'll be due to some changes I made in 0.2.0, that weren't tested because FreeSurfer takes too long to run as part of a CircleCI test. I'll have to fix that and do a 0.2.1 update.

Something I've been working on recently is automated lint testing of the Python scripts in MRtrix3. This is precisely the sort of error that keeps cropping up again and again, and that I'm hoping such testing will catch...

ddshin commented 7 years ago

Okay, please let me know when I can pull the 0.2.1 release. Happy to give this another go afterward.

I am using 8 high performance cores to run the app. Do I need to invoke the -n option (e.g. -n 8) to ensure that the app will use all the available CPUs or is this done automatically?

Lestropie commented 7 years ago

In the absence of an explicit command-line option, MRtrix3 apps will use as many threads as the system reports as being available. So multi-threading is done at that level. If you specify multiple subjects, these will be run sequentially only, one after the other; multi-threading across subjects should instead be handled via e.g. a job scheduler.

One of the changes I need to make in this context is updating FreeSurfer to version 6 and activating multi-threading of the recon-all -all call; currently that step will only run single-threaded.
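Since subjects run strictly sequentially within one invocation, across-subject parallelism has to be orchestrated outside the container. On a single large machine, a process pool over per-subject docker runs is one option; this is a hypothetical wrapper (image tag, mount points and options borrowed from the docker command earlier in the thread), and a proper cluster would use a job scheduler instead.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

IMAGE = "bids/mrtrix3_connectome:0.2.0"

def build_cmd(label, bids_dir, out_dir):
    """Assemble one docker invocation for a single participant."""
    return ["docker", "run", "--rm",
            "-v", f"{bids_dir}:/bids_dataset",
            "-v", f"{out_dir}:/outputs",
            IMAGE,
            "/bids_dataset", "/outputs", "participant",
            "--participant_label", label,
            "--parcellation", "fs_2005"]

def run_subjects(labels, bids_dir, out_dir, max_parallel=2):
    """Run one container per subject, a few at a time.

    Each container will itself use every CPU it sees, so keep
    max_parallel low to avoid oversubscribing the machine."""
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        futures = [pool.submit(subprocess.run,
                               build_cmd(label, bids_dir, out_dir))
                   for label in labels]
        return [f.result().returncode for f in futures]
```

For example, run_subjects(["01", "02", "03"], "/home/David/data/bids", "/home/David/data/outputs") would process three subjects with at most two containers alive at once.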

ddshin commented 7 years ago

I see. Thanks for clarifying that.

FreeSurfer multi-threading support would be great as it will dramatically lessen the processing time, which right now is quite long.

My plan is to introduce the BIDS infrastructure and apps to all our users (MRtrix3_connectome being one of the core apps, since DTI is something almost everybody acquires), then to build the IT infrastructure to support this. So anything that saves time per Docker instance will lead to big time/cost savings.

In that context, another feature I would love to see is an HTML report in the output folder, similar to what FMRIPREP and MRIQC generate. Maybe MRtrix3_connectome already does this and I just haven't seen the output folder yet.

chrisgorgo commented 7 years ago

Many BIDS Apps (including MRtrix3_connectome) can be run for free in the cloud using the http://OpenNeuro.org platform. It might be an interesting option for your users.

ddshin commented 7 years ago

@Lestropie - When can we expect to see the 0.2.1 update released? Thanks.

ddshin commented 7 years ago

@chrisfilo - What happens to the data sets uploaded to OpenNeuro once the data processing is completed? Are they automatically shared publicly via the OpenNeuro data repo?

Is it possible to upload raw data via a CLI rather than point/click GUI on the browser?

chrisgorgo commented 7 years ago


> What happens to the data sets uploaded to OpenNeuro once the data processing is completed? Are they automatically shared publicly via the OpenNeuro data repo?

Yes, eventually. They are not made public immediately, but after 18 months (1.5 years).

> Is it possible to upload raw data via a CLI rather than a point-and-click GUI in the browser?

Currently this is not supported, but the GUI upload supports resuming uploads over flaky connections.


Lestropie commented 7 years ago

@ddshin: Working on it. I will need to run it through a FreeSurfer pipeline test, since I can't include that as part of the automated testing. I'll let you know when it appears on Docker Hub; should be early next week.