dchaley / deepcell-imaging

Tools & guidance to scale DeepCell imaging on Google Cloud Batch

Increase boot disk size based on input size #305

Closed dchaley closed 2 weeks ago

dchaley commented 1 month ago

Each step uses local storage to hold its copy of the inputs and to write outputs before uploading them to the cloud. That storage is the boot disk. If the image is too big for the default boot disk size (10 GB, I think?), users have to adjust the size by hand in the JSON.

Let's update the job runner to increase the boot disk size based on the number of input pixels. We still need to figure out by how much. I think we have 4 float64s (32 bytes) per pixel for predictions, and less for the other steps.
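A rough sketch of what that sizing could look like, using only the 4 float64s per pixel figure above. The function name, the `overhead_factor` headroom multiplier, and the `base_gb` allowance for the OS and container image are all assumptions for illustration, not measured values or existing code in the job runner.

```python
import math

BYTES_PER_FLOAT64 = 8
FLOATS_PER_PIXEL = 4  # from the estimate above: 4 float64s per pixel for predictions


def estimate_disk_gb(width_px, height_px, overhead_factor=2.0, base_gb=10):
    """Rough disk size estimate in whole GB (1 GB = 10**9 bytes).

    overhead_factor and base_gb are hypothetical knobs: headroom for
    input copies and other temporary files, plus a fixed allowance for
    the OS and container image.
    """
    prediction_bytes = width_px * height_px * FLOATS_PER_PIXEL * BYTES_PER_FLOAT64
    return base_gb + math.ceil(prediction_bytes * overhead_factor / 10**9)


if __name__ == "__main__":
    # A 10,000 x 10,000 image: 10**8 pixels * 32 bytes = 3.2 GB of predictions.
    print(estimate_disk_gb(10_000, 10_000))
```

The multiplier would need tuning against real runs, since the other steps' footprints are not known yet.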

dchaley commented 4 weeks ago

Docs: https://cloud.google.com/batch/docs/reference/rest/v1/projects.locations.jobs#computeresource

dchaley commented 2 weeks ago

PR #326 set the wrong parameter: a per-task disk size. With a value of 10 GB and 10 tasks, it would provision at least 100 GB.

We actually want to add a single disk of a given size, no matter how many tasks there are. This is ok because we clean up the temporary files along the way.

Instead, we should add an AttachedDisk to the InstancePolicy. Then we need a Volume that refers to the disk by its device name. Finally, we need to tell the application to use that volume's mount directory for temporary files by setting the TMPDIR environment variable.
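A minimal sketch of those three pieces applied to a job definition held as a Python dict, assuming the job JSON follows the Batch REST v1 layout (`allocationPolicy.instances[].policy.disks`, `taskGroups[].taskSpec.volumes`, `taskSpec.environment.variables`). The helper name, device name, mount path, and disk type are placeholders, not what the job runner actually uses.

```python
import json


def add_scratch_disk(job, size_gb, device_name="scratch",
                     mount_path="/mnt/disks/scratch", disk_type="pd-balanced"):
    """Attach one new disk per VM, mount it, and point TMPDIR at it.

    Hypothetical helper: all default values are illustrative assumptions.
    """
    # 1. AttachedDisk on the InstancePolicy: one disk, however many tasks run.
    for instance in job["allocationPolicy"]["instances"]:
        instance.setdefault("policy", {}).setdefault("disks", []).append(
            {"newDisk": {"type": disk_type, "sizeGb": size_gb},
             "deviceName": device_name}
        )
    for group in job["taskGroups"]:
        spec = group["taskSpec"]
        # 2. Volume that refers to the disk by its device name.
        spec.setdefault("volumes", []).append(
            {"deviceName": device_name, "mountPath": mount_path}
        )
        # 3. TMPDIR so the application writes temporary files to the volume.
        spec.setdefault("environment", {}).setdefault("variables", {})["TMPDIR"] = mount_path
    return job


if __name__ == "__main__":
    job = {
        "allocationPolicy": {"instances": [{"policy": {"machineType": "n1-standard-8"}}]},
        "taskGroups": [{"taskSpec": {"runnables": []}, "taskCount": 10}],
    }
    print(json.dumps(add_scratch_disk(job, 100), indent=2))
```

Python's `tempfile` module honors TMPDIR, so code that creates temporary files through it would pick up the mounted volume without further changes.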

dchaley commented 2 weeks ago

Let's close out the DeepCell Segmentation portion of this. For QuPath measurements, see: https://github.com/dchaley/qupath-project-initializer/issues/40