poldracklab / fitlins

Fit Linear Models to BIDS Datasets
https://fitlins.readthedocs.io
Apache License 2.0

Very high memory usage during startup #241

Open effigies opened 4 years ago

effigies commented 4 years ago

Using a 21-subject dataset, 8 runs per subject, and constructing workflows to only run on 4 subjects, I get memory usage of ~14GiB before the workflow even starts running.
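For anyone trying to reproduce the number, here is a minimal sketch of how peak memory during a setup step could be measured with the standard-library `tracemalloc` module. `measure_peak` and `build_workflow_stub` are hypothetical names for illustration; the stub merely stands in for whatever constructs the workflow and is not FitLins code.

```python
import tracemalloc


def measure_peak(setup, *args, **kwargs):
    """Run `setup` and return (result, peak traced bytes during the call)."""
    tracemalloc.start()
    try:
        result = setup(*args, **kwargs)
        # get_traced_memory() returns (current, peak) in bytes
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


def build_workflow_stub(n_items):
    # Hypothetical stand-in for workflow construction: allocate some buffers.
    return [bytearray(1024) for _ in range(n_items)]


result, peak = measure_peak(build_workflow_stub, 1000)
print(f"peak traced memory: {peak / 2**20:.2f} MiB")
```

Note that `tracemalloc` only sees allocations made through Python's memory allocators, so it can undercount relative to the process's resident memory as reported by the OS.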

adelavega commented 4 years ago

yikes, what's the culprit? pybids?

effigies commented 4 years ago

Not sure yet. Mostly working on CIFTI-2 support, so making notes of issues to come back to.

I am using PyBIDS master, and we did revert one of the speedups because it allowed databases to randomly get into inconsistent states. My impression is that the cost there is CPU, not memory, but if it's keeping a ton of refs that it should be dropping while the CPU churns away, that could cause some problems.
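One generic way to test the "refs that should be dropped" theory is to hold a weak reference to an object and see whether it survives a garbage collection. The sketch below uses only the standard library; `Record`, `cache`, and `is_still_alive` are hypothetical names for illustration and do not correspond to anything in PyBIDS.

```python
import gc
import weakref


class Record:
    """Hypothetical stand-in for an object created during indexing."""

    def __init__(self, payload):
        self.payload = payload


def is_still_alive(ref):
    """Force a collection, then report whether the referent still exists."""
    gc.collect()
    return ref() is not None


cache = []  # plays the role of a container that forgets to drop references
obj = Record(bytearray(1024))
ref = weakref.ref(obj)
cache.append(obj)
del obj

held = is_still_alive(ref)          # the cache still holds a strong reference
cache.clear()
released = not is_still_alive(ref)  # nothing holds it any more

print(f"held while cached: {held}, released after clear: {released}")
```

If an object expected to be gone is still alive, `gc.get_referrers(ref())` can help track down what is holding it.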

tyarkoni commented 4 years ago

Could it be a result of https://github.com/bids-standard/pybids/pull/637? Seems unlikely (I don't see anywhere pybids could be holding anything open), but the other PRs seem even less likely to have introduced a change in memory consumption, and I imagine if this was an old issue, we would have encountered it before.

You could also try disabling metadata indexing just to rule out that it's coming from anything related to that.

effigies commented 4 years ago

It seems unlikely, as I don't have any GIFTI files fetched in the dataset. CIFTI and NIfTI only read the headers on load, so it should be pretty cheap.