nel-lab / mesmerize-core

High level pandas-based API for batch analysis of Calcium Imaging data using CaImAn

Run items using SLURM; save batches with platform-independent paths #298

ethanbb closed 1 month ago

ethanbb commented 3 months ago

This implements the "slurm" backend via _run_slurm. Supports passing a partition or list of partitions on which to run the jobs; more options (such as memory allocation) could be added if we think they're important.
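As a rough illustration of the partition option described above, here is a minimal sketch of how a job submission command could be assembled. It is not the actual `_run_slurm` implementation: `build_sbatch_command` and its parameters are hypothetical names, and the only things assumed about SLURM are the standard `sbatch` flags `--partition` (which accepts a comma-separated list), `--cpus-per-task`, and `--wrap`.

```python
from typing import List, Optional, Sequence, Union


def build_sbatch_command(
    run_command: str,
    partition: Optional[Union[str, Sequence[str]]] = None,
    cpus_per_task: int = 1,
) -> List[str]:
    """Assemble an sbatch argument list for one batch item (hypothetical helper)."""
    cmd = ["sbatch", f"--cpus-per-task={cpus_per_task}"]

    if partition is not None:
        # sbatch takes several candidate partitions as one comma-separated value
        if not isinstance(partition, str):
            partition = ",".join(partition)
        cmd.append(f"--partition={partition}")

    # --wrap lets sbatch run a plain command string without a separate script file
    cmd.append(f"--wrap={run_command}")
    return cmd


if __name__ == "__main__":
    # Only prints the command; on a cluster it could be passed to subprocess.run
    print(build_sbatch_command("python run_item.py", partition=["short", "long"]))
```

Returning an argument list rather than a shell string avoids quoting problems if the list is later handed to `subprocess.run`. Further options such as memory allocation could be added as extra parameters in the same way.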

To control the number of CPUs/processes per job, I'm currently using the MESMERIZE_N_PROCESSES environment variable, which matches the behavior of the "subprocess" backend. However, it might be better to divide this number by the number of jobs running in parallel, to avoid needlessly over-parallelizing each job. I already do this in my own code when setting the environment variable and it works fine, but it may make sense to automate it.
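The division mentioned above could look something like the following sketch. The helper names `processes_per_job` and `job_environment` are hypothetical and not part of mesmerize-core; only the `MESMERIZE_N_PROCESSES` variable comes from the PR description, and integer division with a floor of one process is an assumption about how the split would be done.

```python
import os
from typing import Dict


def processes_per_job(total_processes: int, n_parallel_jobs: int) -> int:
    """Split a total process budget evenly across jobs that run at the same time."""
    if n_parallel_jobs < 1:
        raise ValueError("n_parallel_jobs must be at least 1")
    # Integer division, but never give a job fewer than one process
    return max(1, total_processes // n_parallel_jobs)


def job_environment(total_processes: int, n_parallel_jobs: int) -> Dict[str, str]:
    """Copy the current environment and set the per-job process count."""
    env = dict(os.environ)
    env["MESMERIZE_N_PROCESSES"] = str(
        processes_per_job(total_processes, n_parallel_jobs)
    )
    return env


if __name__ == "__main__":
    # e.g. 16 processes shared by 4 concurrent jobs
    print(job_environment(16, 4)["MESMERIZE_N_PROCESSES"])
```

The resulting dictionary could be passed as the `env` argument when launching each job, so the caller's own environment is left unchanged.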

This also adds a dependency on filelock to lock the batch file when updating it to avoid race conditions. I used the SoftFileLock because the regular FileLock wasn't working for me on an NTFS remote drive (from Linux). As I understand it, this basically just creates a lock file when acquiring and deletes it when releasing, which also wouldn't be hard to implement ourselves if you prefer not to add a dependency.
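For reference, a do-it-yourself version of the "create a lock file when acquiring, delete it when releasing" idea could be sketched roughly as below. This is not filelock's code and not what the PR uses; `soft_lock` and its parameters are hypothetical, and the sketch assumes that exclusive file creation (`os.O_CREAT | os.O_EXCL`) behaves atomically on the target filesystem, which should be verified for remote drives.

```python
import contextlib
import os
import time


@contextlib.contextmanager
def soft_lock(lock_path, timeout=10.0, poll_interval=0.05):
    """Hold a lock by exclusively creating lock_path; remove it on release."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_EXCL makes creation fail if the lock file already exists
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if time.monotonic() >= deadline:
                raise TimeoutError(f"could not acquire lock file {lock_path}")
            time.sleep(poll_interval)
    try:
        yield
    finally:
        # Release: close our handle and delete the lock file
        os.close(fd)
        os.remove(lock_path)


if __name__ == "__main__":
    import tempfile

    lock_file = os.path.join(tempfile.mkdtemp(), "batch.pickle.lock")
    with soft_lock(lock_file):
        print("lock held:", os.path.exists(lock_file))
    print("lock released:", not os.path.exists(lock_file))
```

One known weakness of this approach is that a process that crashes while holding the lock leaves the lock file behind, so other processes time out until someone deletes it by hand.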

ethanbb commented 3 months ago

Changed to draft because I'm actually still refactoring things as part of the cross-platform saving fix

ethanbb commented 2 months ago

tox-dev/filelock#331 is currently a blocker... could probably sidestep but I'm hoping they fix it quickly (for now)

ethanbb commented 2 months ago

This now also fixes #209 - sorry for the lack of clean separation between them. Locking the batch file while writing was involved in both sets of changes.

ethanbb commented 2 months ago

I made some significant changes to put the saving functionality in the caiman dataframe extension and streamline locking the file where necessary - please take another look when you can!

ethanbb commented 2 months ago

Hmm checks seem to be failing at the install step - not sure if it's due to these changes or something else is going on.

kushalkolar commented 2 months ago

I have a few comments, sorry, I've been busy with a big fastplotlib release. You can always ping me with @ if I forget

ethanbb commented 2 months ago

No worries, I've been working on other things as well.

ethanbb commented 2 months ago

@kushalkolar done making changes for now; see remaining threads for questions I still have