aces / cbrain

CBRAIN is a flexible Ruby on Rails framework for accessing and processing of large data on high-performance computing infrastructures.
GNU General Public License v3.0

Optimize local DataProvider bindmounts for apptainer tasks #1394

Open prioux opened 2 months ago

prioux commented 2 months ago

When tasks are run in apptainer, the container's launch command is extended with one -B option for every local data provider found on the target system, whether or not the task uses it. This means we sometimes mount folders that are not needed for the job at all.

An example of that, as seen in the .science.TOOL script, was this long line:

   -B /mnt/nfs/mainstore/MainStore
   -B /mnt/nfs/mainstore/NeuroHubStore
   -B /mnt/nfs/neurohub/CONP-Datasets/BigBrain_3DClassifiedVolumes
   -B /mnt/nfs/neurohub/CONP-Datasets/BigBrain_3DSurfaces
   -B /mnt/nfs/neurohub/CONP-Datasets/BigBrain_Flat
   -B /mnt/nfs/neurohub/CONP-Datasets/preventad-open
   -B /mnt/nfs/neurohub/CONP-Datasets/preventad-open-bids/BIDS_dataset
   -B /mnt/nfs/neurohub/CONP-Datasets/visual-working-memory
   -B /mnt/nfs/neurohub/UKBB-Civet-Datasets
   -B /mnt/nfs/sftp/proftpd-1
   -B /mnt/nfs/sftp/proftpd-2

(reformatted on multiple lines for clarity).

A better mechanism would be to only mount a folder for a DP if there is at least one input file for the task on that particular DP.
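
As a rough illustration of that selection, here is a minimal sketch in plain Ruby. It is not CBRAIN code: the `LocalDP` and `InputFile` structs and the `bind_options_for` helper are hypothetical stand-ins, and it assumes each input file knows the ID of the DP it lives on.

```ruby
require 'shellwords'

# Hypothetical stand-ins for the real models; names are illustrative only.
LocalDP   = Struct.new(:id, :mount_path)
InputFile = Struct.new(:name, :data_provider_id)

# Returns one "-B path" option per local DP that holds at least one
# of the task's input files; DPs with no input files are skipped.
def bind_options_for(local_dps, input_files)
  needed_ids = input_files.map(&:data_provider_id).uniq
  local_dps
    .select { |dp| needed_ids.include?(dp.id) }
    .map    { |dp| "-B #{Shellwords.escape(dp.mount_path)}" }
end

# Example: two local DPs, but the task only reads from the first one.
example_dps = [
  LocalDP.new(1, "/mnt/nfs/mainstore/MainStore"),
  LocalDP.new(2, "/mnt/nfs/sftp/proftpd-1"),
]
example_files = [InputFile.new("scan.mnc", 1)]
puts bind_options_for(example_dps, example_files).join(" ")
```

With the example data, only the MainStore path ends up in the launch command; the proftpd-1 mount is dropped because no input file lives there.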

We could also consider mounting these local DPs in "read-only" mode. Generally, tasks aren't supposed to modify any of their input files. There is already a flag at the ToolConfig level that indicates this, so the value of that flag could be used to determine whether we mount them read-only.
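
For the read-only part, apptainer's bind option accepts the form src:dest:opts, where opts can be ro. A minimal sketch of building such an option follows. The `bind_option` helper is hypothetical, and the `read_only` argument stands in for the ToolConfig flag (its actual attribute name is not assumed here).

```ruby
require 'shellwords'

# Hypothetical helper: builds a single -B option for one local DP path.
# When read_only is true, the path is bound at the same location inside
# the container with the "ro" option; otherwise a plain bind is used.
def bind_option(path, read_only)
  escaped = Shellwords.escape(path)
  read_only ? "-B #{escaped}:#{escaped}:ro" : "-B #{escaped}"
end

puts bind_option("/mnt/nfs/mainstore/MainStore", true)
puts bind_option("/mnt/nfs/mainstore/MainStore", false)
```

Keeping the destination identical to the source means paths seen by the task inside the container stay the same as with the current plain -B mounts.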