dchaley / deepcell-imaging

Tools & guidance to scale DeepCell imaging on Google Cloud Batch

Accept indexed task args #299

Closed dchaley closed 2 months ago

dchaley commented 2 months ago

To support running on multiple inputs in one job, the task runners need to understand $BATCH_TASK_INDEX: each task uses it to determine which entry in a list of inputs it should operate on.
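As a rough illustration of the idea (not the code in this PR), a runner could pick its input like this. The function names and the fallback to index 0 when the variable is unset are assumptions made for the sketch; only the BATCH_TASK_INDEX variable itself comes from Cloud Batch.

```python
import os


def get_task_index(env=None):
    """Return this task's index from BATCH_TASK_INDEX.

    Assumption for this sketch: fall back to 0 when the variable is unset,
    so the script still works when run outside of Cloud Batch.
    """
    env = os.environ if env is None else env
    return int(env.get("BATCH_TASK_INDEX", "0"))


def select_input(inputs, env=None):
    """Pick the entry of `inputs` that belongs to the current task."""
    index = get_task_index(env)
    if not 0 <= index < len(inputs):
        raise IndexError(
            f"task index {index} is out of range for {len(inputs)} inputs"
        )
    return inputs[index]
```

Passing `env` explicitly is only there to make the helper easy to test; in a real task the default `os.environ` would be used.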

The bulk of the work is updating how the scripts take their command-line arguments: either from a file, or as individual arguments as usual.
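A minimal sketch of the "file or individual arguments" pattern, using only argparse and json. The flag names (`--args_file`, `--input_path`, `--output_path`) and the JSON file format are hypothetical and chosen for illustration; the actual scripts may differ.

```python
import argparse
import json


def parse_args(argv):
    """Return the script parameters as a dict.

    Parameters come either from a JSON file (--args_file) or from
    individual flags, but not both. All flag names are hypothetical.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument("--args_file", help="JSON file holding the parameters")
    parser.add_argument("--input_path", help="Input path (individual-args mode)")
    parser.add_argument("--output_path", help="Output path (individual-args mode)")
    ns = parser.parse_args(argv)

    if ns.args_file:
        if ns.input_path or ns.output_path:
            parser.error("--args_file cannot be combined with individual arguments")
        with open(ns.args_file) as f:
            return json.load(f)

    if not (ns.input_path and ns.output_path):
        parser.error("--input_path and --output_path are required without --args_file")
    return {"input_path": ns.input_path, "output_path": ns.output_path}
```

Both modes return the same shape, so the rest of the script does not need to know where the parameters came from.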

This introduces pydantic as a way to define the script parameters, i.e. each script's API. The idea is to reduce duplicated data, such as which parameters are required or what their defaults are. (Some more work is needed here, like generating the arguments from the type definition and moving the help text into the type too.)
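To show what defining parameters as a pydantic type can look like, here is a small sketch. The model name and its fields are hypothetical, not the ones in this PR; the point is that required-ness, defaults, and help text live in one place.

```python
from typing import Optional

from pydantic import BaseModel, Field, ValidationError


class ExampleTaskArgs(BaseModel):
    """Hypothetical parameters for one task runner."""

    # Required: no default, so omitting it raises a ValidationError.
    input_path: str = Field(..., description="Where to read the input from")
    output_path: str = Field(..., description="Where to write the result")
    # Optional parameters carry their defaults in the type definition.
    benchmark_path: Optional[str] = Field(None, description="Optional benchmark output")
    verbose: bool = Field(False, description="Enable verbose logging")


def load_args(raw):
    """Validate a dict of raw parameters (e.g. parsed from a file or argparse)."""
    return ExampleTaskArgs(**raw)


def is_valid(raw):
    """Return True if `raw` satisfies the model, False otherwise."""
    try:
        load_args(raw)
    except ValidationError:
        return False
    return True
```

For example, `load_args({"input_path": "in.npz", "output_path": "out.npz"})` fills in `verbose=False` and `benchmark_path=None`, while a dict missing `output_path` is rejected.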

This adds a helper to fetch the task list.

Relates to #292

Most of this was paired with @WeihaoGe1009

REVIEW NOTE: consider ignoring whitespace to see that the args are (mostly) just indented.