mne-tools / mne-bids-pipeline

Automatically process entire electrophysiological datasets using MNE-Python.
https://mne.tools/mne-bids-pipeline/
BSD 3-Clause "New" or "Revised" License

MAINT: Use a large resource class to speed up build #741

Status: Closed (larsoner closed 1 year ago)

larsoner commented 1 year ago

Interested in trying a large resource class, since the ERPCORE* jobs (e.g., ERN) appear to be CPU-bound with 2 CPUs. First, I'm pushing an empty commit just to make sure CI is green and to replicate the ~19 minute timing. Then we'll try a large resource class to see if it helps!

hoechenberger commented 1 year ago

I'd like to ensure we keep the same amount of memory, though. These resource limits have helped me discover inefficient memory utilization in the past.

larsoner commented 1 year ago

That would be a bit harder. We could run with ulimit maybe. Let's see if the speed gains are there first. Then we can decide if the tradeoff of increased memory limit is worth it.

Actually, thinking about it, the jobs that pass n_jobs through to dask have a fixed memory limit of 2 GB in their config, so it should be safe enough in those jobs at least -- and those are the ERP CORE ones that will probably benefit most!
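For reference, the ulimit idea could be approximated from Python on Linux roughly as below. This is only a sketch under stated assumptions: `run_with_memory_cap` is a hypothetical helper, not something that exists in the pipeline or the CI config, and it caps the address space of a child process via `resource.RLIMIT_AS`, which is similar in spirit to `ulimit -v` but not identical to a container memory limit.

```python
import resource
import subprocess
import sys


def run_with_memory_cap(cmd, max_bytes):
    """Run cmd in a child process whose address space is capped at max_bytes.

    Hypothetical helper for illustration only (Linux/POSIX).
    """

    def _limit():
        # Executed in the child just before exec; comparable to `ulimit -v`.
        resource.setrlimit(resource.RLIMIT_AS, (max_bytes, max_bytes))

    return subprocess.run(cmd, preexec_fn=_limit, capture_output=True, text=True)


if __name__ == "__main__":
    one_gib = 1024 ** 3
    # Illustrative workload: try to allocate 2 GiB under a 1 GiB cap.
    result = run_with_memory_cap(
        [sys.executable, "-c", "x = bytearray(2 * 1024 ** 3)"], one_gib
    )
    print("return code:", result.returncode)
    print(result.stderr.strip().splitlines()[-1] if result.stderr else "")
```

A step that exceeds the cap should fail with a `MemoryError` instead of silently using the extra memory of a larger resource class, which is the kind of signal the smaller class gives today.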

larsoner commented 1 year ago

Okay that did take it from 35 minutes down to 30, which is nice. Looks like all the ERP_CORE got faster, and it should be memory-safe to increase the resource class here since (for parallel at least) we use a memory limit on the dask workers:

[Two screenshots (2023-06-15) of CI timings: left, the no-op commit; right, resource-class: large]

There wasn't much effect for the others (e.g., ds000248) though, so I'll revert those then merge.

larsoner commented 1 year ago

... actually, 1810 went from 12 to 8 min, which seems worth it, too. So the net effect is that just one of the tests (1810) no longer runs with as tight a memory limit, which is an acceptable tradeoff.