ME-ICA / tedana

TE-dependent analysis of multi-echo fMRI
https://tedana.readthedocs.io
GNU Lesser General Public License v2.1

How long for Tedana to complete? #254

Closed Indusjazz closed 4 years ago

tsalo commented 5 years ago

Could you clarify what you mean by that? Are you referring more to the road map for the project (i.e., when will the tedana package be stable and relatively "complete") or to the typical duration of the workflow (i.e., how long does it take for the tedana workflow to finish on a typical dataset)?

Indusjazz commented 5 years ago

Hi, sorry for my belated response. I wanted to run it again to be sure. I was referring to the workflow. I can see that it takes a humongous amount of memory on my machine, pretty much all of it in fact. I can’t do anything else while tedana is running. Last time I tried, it even crashed. I was wondering whether this was normal or not. I’m planning on running tedana on a more powerful machine, but it still is impressive.

tsalo commented 5 years ago

The current version of tedana should only use one core (and so shouldn't overload your computer), but I don't think that's been released on PyPI yet. What version are you using?

Indusjazz commented 5 years ago

Using tedana 0.0.6 on Python 3.7.1. If this helps, I receive a bunch of warnings when I launch tedana, though they don’t seem to be the cause of the problem since tedana gets to run…

Here’s what I get on my terminal:

Running tedana
Module duecredit not successfully imported due to "No module named 'duecredit'". Package functionality unaffected.
/miniconda3/lib/python3.7/importlib/_bootstrap.py:219: RuntimeWarning: numpy.ufunc size changed, may indicate binary incompatibility. Expected 216, got 192
  return f(*args, **kwds)
/miniconda3/lib/python3.7/importlib/_bootstrap.py:219: ImportWarning: can't resolve package from spec or package, falling back on name and path
  return f(*args, **kwds)
/miniconda3/lib/python3.7/importlib/_bootstrap.py:219: RuntimeWarning: numpy.ufunc size changed, may indicate binary incompatibility. Expected 192 from C header, got 216 from PyObject
  return f(*args, **kwds)
/miniconda3/lib/python3.7/site-packages/pywt/_utils.py:6: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working
  from collections import Iterable
INFO:tedana.workflows.tedana:Using output directory: /Users/nib19005/Desktop/2019/TMS/BIDS/TMS_BIDS3/Bids/01/MotorLoc_AP
INFO:tedana.workflows.tedana:Loading input data: ['MotorLoc_AP_nonlin_TE1.nii.gz', 'MotorLoc_AP_nonlin_TE2.nii.gz', 'MotorLoc_AP_nonlin_TE3.nii.gz']
INFO:tedana.workflows.tedana:Computing adaptive mask
INFO:tedana.workflows.tedana:Computing T2* map
INFO:tedana.combine:Optimally combining data with voxel-wise T2 estimates
INFO:tedana.decomposition.eigendecomp:Computing PCA of optimally combined multi-echo data
INFO:tedana.decomposition.eigendecomp:Making initial component selection guess from PCA results
INFO:tedana.model.fit:Fitting TE- and S0-dependent models to components
tedana.sh: line 5: 77080 Killed: 9 tedana -d MotorLoc_AP_nonlin_TE1.nii.gz MotorLoc_AP_nonlin_TE2.nii.gz MotorLoc_AP_nonlin_TE3.nii.gz -e 0.016 0.03779 0.05958

jbteves commented 5 years ago

@Indusjazz we're sorry that you're having this problem. What OS are you using? Can you verify through top or htop how many cores and how much memory tedana is using?
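For anyone who would rather get a number than watch top or htop, here is a minimal sketch that runs a command and reports the peak memory used by the child process. It relies only on the Python standard library (`subprocess` and `resource`, so it works on Linux and macOS but not Windows). The helper name `peak_child_rss_bytes` is made up for this example and is not part of tedana.

```python
import resource
import subprocess
import sys


def peak_child_rss_bytes(cmd):
    """Run cmd to completion, then return the peak resident set size in bytes.

    RUSAGE_CHILDREN covers every child this Python process has waited for,
    so the value is the largest peak among them, not only the last command.
    """
    subprocess.run(cmd, check=False)
    peak = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    # ru_maxrss is reported in bytes on macOS and in kilobytes on Linux.
    return peak if sys.platform == "darwin" else peak * 1024


if __name__ == "__main__":
    # Stand-in command; replace it with the tedana call you want to measure.
    used = peak_child_rss_bytes([sys.executable, "-c", "print('hello')"])
    print(f"peak memory: {used / 2**20:.1f} MiB")
```

Because the figure is only available once the command exits, this complements top or htop rather than replacing them: it will not show which stage of the workflow caused the peak.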

Indusjazz commented 5 years ago

I’m working on macOS Mojave with one 2.93 GHz Intel core. That may be a contributor. Right now tedana isn’t using a lot of memory (0.9%). However, during the PCA phase it can go up to 150% memory. I should add that right now, ten hours after having started tedana on my datasets, the program is still at the “Fitting TE- and S0-dependent models to components” step. The terminal says ICA has failed to converge and that it is making a “second component selection guess from ICA results”.

I understand my machine isn’t the most powerful ever made, far from it, but I’m still stumped by the level of slowness and struggle compared to other pipelines I’ve run on the same machine (e.g. preFreeSurfer, FreeSurfer and PostFreesurfer as well as fMRIVolume pipelines in HCP). Also ran ME-ICA on the same datasets without any issues. These are all completed within the usual time and I can still work on other programs in parallel. In contrast, when tedana runs I can barely move my windows around…

jbteves commented 5 years ago

When you say 150% of memory, what do you mean? And how much memory does your machine have in total? I’m not sure why it’s a problem but it may help us diagnose what’s happening.

Indusjazz commented 5 years ago

Sorry I was unclear/inaccurate. Went too fast in reading your earlier questions and now I think I can see that the problem indeed lies with the memory.

Top doesn’t mention tedana for some reason. What I can see from my activity monitor (see snapshot attached) is that tedana (listed as python on my memory) chews up all the memory available.

If it is expected for tedana then I simply have no choice but to use a better machine. Just want to confirm that this is normal… I’ve never seen this with any of the earlier pipelines I’ve used (including ME-ICA)

jbteves commented 5 years ago

Unfortunately your snapshot isn’t attached. How many GB of RAM do you have and how large is your data set in MB? Are you running tedana from the command line via ‘tedana.py -e echoes -d data’ or are you running it inside the Python interpreter as a function?

Indusjazz commented 5 years ago

Ok. Basically I have 16GB memory and the activity monitor says tedana takes about all of it, so computer just can’t keep up. I have three echo files of about 250-280 MB each. I run tedana as a function inside python (via a bash script).

jbteves commented 5 years ago

Hm. Does the same thing happen if you run normal setup per the installation guidelines and then run it on command line? I can’t think of a reason why that would make a difference but I don’t think it should eat all 16 GB, and I haven’t seen that much get eaten on similarly sized data.

Indusjazz commented 5 years ago

Did what you suggested. I hope you can see the attachment this time. Memory requirement goes insane at times, far beyond what the computer can handle. Is there anything in the pre-processing stages that could make my datasets unmanageable?

jbteves commented 5 years ago

I believe that since you're replying by e-mail the image won't ever attach. You should log in to GitHub and attach it in a reply to this comment, which you should be able to find from here: https://github.com/ME-ICA/tedana/issues/254#issuecomment-485439931

Slice timing and motion correction should be done, as described here. If you're having trouble attaching your screenshot check that out here.

Indusjazz commented 5 years ago

[Attached screenshot: tedana]

Indusjazz commented 5 years ago

I had to stop it. The whole thing was going haywire.

The data are pre-processed. I'll try running it on another machine, as I can't really think of any other reason.

jbteves commented 5 years ago

How many echoes and how many voxels along each axis?

Indusjazz commented 5 years ago

Three echo files, voxels: 91 x 108 x 91.

jbteves commented 5 years ago

And sorry, how many time points?

Indusjazz commented 5 years ago

188

jbteves commented 5 years ago

If you're using 0.0.6, then the non-EPI parts of the image may not be getting masked out, which can cost a lot of memory. Additionally, 91x108x91 looks like MNI space; did you normalize this data? If you did, you should try just slice timing and motion correction in native space, then attempt to run tedana. If problems persist, you might try cloning the current master branch of the repository and running that version instead. Please let me know if I can help with any of this.
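To put rough numbers on why masking matters, here is a back-of-the-envelope sketch using the dimensions reported above (91 x 108 x 91 voxels, 188 time points, three echoes). It assumes the data are held as 64-bit floats and counts a single in-memory copy. Both are assumptions, and the workflow may hold several intermediate arrays at once. The 25% brain fraction is a made-up figure for illustration, not a measurement.

```python
from math import prod


def data_bytes(n_voxels, n_timepoints, n_echoes, bytes_per_value=8):
    """Bytes needed for one copy of a voxels x time x echoes array."""
    return n_voxels * n_timepoints * n_echoes * bytes_per_value


# Dimensions reported in this thread.
n_voxels_full = prod((91, 108, 91))
full = data_bytes(n_voxels_full, n_timepoints=188, n_echoes=3)

# Hypothetical: suppose a mask keeps a quarter of the voxels.
brain_fraction = 0.25
masked = data_bytes(round(n_voxels_full * brain_fraction), 188, 3)

print(f"unmasked: {full / 2**30:.2f} GiB")
print(f"masked:   {masked / 2**30:.2f} GiB")
```

Under these assumptions, a single unmasked copy already comes to about 3.76 GiB, so a handful of intermediate arrays could plausibly exhaust 16 GB.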

Indusjazz commented 5 years ago

Ok. Thanks for the advice. I’m now trying it on another, more powerful machine and it seems to be working much faster. Just out of curiosity: how long would it usually take for tedana to run on this kind of dataset?

I’ll keep you posted. Thanks again! Nicolas

cjl2007 commented 5 years ago

Just wanted to chime in and say I initially had a similar experience to @Indusjazz. I now run tedana on a cluster, where I usually have to allocate ~50 GB of RAM to run a 5-echo dataset with 640 TRs and 2.4 mm isotropic voxels. In terms of how long it takes to run and why it takes that long, the experts here know much better than me, but I have found it depends greatly on how long it takes the ICA to converge (usually several hours). Also, which PCA dimensionality reduction method you select (MLE, kundu, kundu-stabilize) impacts run time. I have found that kundu-stabilize appears to do a good job picking out meaningful noise components and runs in a fraction of the time.

jbteves commented 5 years ago

Thanks for your input @cjl2007. I have to say, we can reduce the memory requirements as much as possible but that is an extremely large data set! If you or @Indusjazz are interested, you can try the latest version of this repo and see if that helps. It uses nilearn to automatically create an EPI mask, which should drastically reduce memory requirements since many fewer voxels must be tracked. It would be great if this makes tedana run better on smaller machines. You would basically just follow the directions given here to set up developer python; you wouldn't need to fork to your profile.

cjl2007 commented 5 years ago

Yes, I suspect my dataset represents a kind of "worst case scenario" in terms of memory requirements. I just figured I would share my experience with a larger dataset as a reference point, in case it was useful for others.

jbteves commented 5 years ago

I see. @cjl2007 and @Indusjazz you can also try supplying your own EPI mask with the --mask option (see here) and see if that helps.

Yes, it's helpful to know, thank you!
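As a rough illustration of the suggestion above, the sketch below computes an EPI mask with nilearn's `compute_epi_mask` and builds a tedana call that passes it via `--mask`. The `-d`, `-e` and `--mask` options are the ones mentioned in this thread. The helper names, the file names and the choice to compute the mask from the first echo are assumptions made for the example. nilearn is imported inside the function so that the command-building part works without it.

```python
def make_epi_mask(epi_path, mask_path):
    """Compute a mask from an EPI image with nilearn and save it to disk."""
    # Third-party dependency, imported lazily so the rest runs without it.
    from nilearn.masking import compute_epi_mask

    mask_img = compute_epi_mask(epi_path)
    mask_img.to_filename(mask_path)
    return mask_path


def tedana_command(echo_files, echo_times, mask_path=None):
    """Build the tedana argument list, adding --mask only when one is given."""
    cmd = ["tedana", "-d", *echo_files, "-e", *[str(t) for t in echo_times]]
    if mask_path is not None:
        cmd += ["--mask", mask_path]
    return cmd


if __name__ == "__main__":
    # Hypothetical file names; echo times are passed through unchanged.
    echoes = ["echo1.nii.gz", "echo2.nii.gz", "echo3.nii.gz"]
    cmd = tedana_command(echoes, [0.016, 0.03779, 0.05958], "epi_mask.nii.gz")
    print(" ".join(cmd))
    # To actually run it, create the mask first and hand cmd to subprocess:
    #   make_epi_mask(echoes[0], "epi_mask.nii.gz")
    #   subprocess.run(cmd, check=True)
```

Whether a mask computed from the first echo suits a given dataset is a judgment call; it is worth checking the mask visually before relying on it.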

Indusjazz commented 5 years ago

Thank you @cjl2007 and @jbteves for these helpful comments. I'm glad to hear I'm not the only one who has had this experience. tedana ran to completion on my other machine, so clearly memory is a major requirement. I should add that component estimation also takes a lot of time and CPU, though not so much memory, so this pipeline is pretty high-octane overall. @cjl2007 I was planning on running this on the cluster ultimately, but wanted to run it on a regular machine first. I think supplying an EPI mask to reduce the number of voxels should speed things up dramatically (and it makes more sense than running without a mask). Will try the new version soon!

jbteves commented 5 years ago

@Indusjazz @cjl2007 I think the latest release (0.0.7) should be available, and should alleviate these problems. If you're interested, you could try updating to this version and letting us know if this helps runtime for your data and machines.

Indusjazz commented 5 years ago

Thank you! Will try this asap and let you know how it works. N.

tsalo commented 4 years ago

@Indusjazz any news?

Indusjazz commented 4 years ago

Hi Taylor,

Thanks for following up, I was meaning to update you on this soon. As far as I can see, everything works A-OK. It still takes a bit of time and memory, but nothing compared with the previous version. Thanks for optimizing this!

I’ll let you know if I run into any further quirks/issues.

-Nicolas

On Jul 15, 2019, at 1:42 PM, Taylor Salo <notifications@github.com> wrote:

@Indusjazz any news?


emdupre commented 4 years ago

Thanks, @Indusjazz ! Please do open a new issue if you run into any other problems :sparkles: