Currently, a preempted run has to restart from scratch. Womp, womp. This is particularly problematic for files that take hours to process: it took 12.5 hours to ingest and measure segmentation for a 140-megapixel image.
Most of the time seems to be spent loading and decompressing the input TIFF over and over again. Perhaps fixing that will make this whole issue moot.
If not, we need resumable runs: skip all cells (regions of interest) that have already been measured. This implies checkpointing progress to the cloud, because today we write to temporary instance storage, which does not survive a preemption.
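
A minimal sketch of what skipping already-measured cells could look like, under some loud assumptions: the checkpoint is a local JSON-lines file standing in for whatever cloud object we would actually write to, and `run_resumable`, `load_measured_ids`, `measure`, and the `cell_id` field are hypothetical names, not anything in the current code. The idea is to append one record per measured cell and flush it right away, so that a restart only has to read the file back to know what to skip.

```python
import json
import os
from typing import Any, Callable, Hashable, Iterable, Set


def load_measured_ids(checkpoint_path: str) -> Set[Hashable]:
    """Return the IDs of cells already recorded in the checkpoint file."""
    done: Set[Hashable] = set()
    if not os.path.exists(checkpoint_path):
        return done
    with open(checkpoint_path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            try:
                done.add(json.loads(line)["cell_id"])
            except (json.JSONDecodeError, KeyError, TypeError):
                # A preemption mid-write can leave a truncated record;
                # ignore it so that cell is simply measured again.
                continue
    return done


def _ends_with_newline(path: str) -> bool:
    """True if the file is missing, empty, or ends in a newline."""
    if not os.path.exists(path) or os.path.getsize(path) == 0:
        return True
    with open(path, "rb") as f:
        f.seek(-1, os.SEEK_END)
        return f.read(1) == b"\n"


def run_resumable(
    cell_ids: Iterable[Hashable],
    measure: Callable[[Hashable], Any],
    checkpoint_path: str,
) -> int:
    """Measure every cell not yet in the checkpoint; return how many were measured."""
    done = load_measured_ids(checkpoint_path)
    needs_newline = not _ends_with_newline(checkpoint_path)
    measured = 0
    with open(checkpoint_path, "a") as f:
        if needs_newline:
            # Terminate a truncated record so new records start on their own line.
            f.write("\n")
        for cell_id in cell_ids:
            if cell_id in done:
                continue  # already measured in an earlier (preempted) run
            result = measure(cell_id)  # must return something JSON-serializable
            f.write(json.dumps({"cell_id": cell_id, "result": result}) + "\n")
            f.flush()
            os.fsync(f.fileno())  # make the record durable before moving on
            done.add(cell_id)
            measured += 1
    return measured
```

Calling `run_resumable(cell_ids, measure, "checkpoint.jsonl")` again after a preemption would then only measure the cells that are missing from the file. For a real run the append-and-fsync step would need to be replaced with an upload to durable storage, probably batched, since syncing after every cell may be too slow.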