hartwigmedical / hmftools

Various algorithms for analysing genomics data
GNU General Public License v3.0
189 stars 58 forks source link

prototype of adding a level of argument grouping to ConfigBuilder #474

Closed kzuberihmf closed 11 months ago

kzuberihmf commented 11 months ago

Not for merging

This is a small prototype of adding argument grouping into our CLI support, intended for design review. A simple implementation turned out to be straightforward to add.

For ACTIN-273, we want to disallow invalid combinations of arguments to orange, which is complicated by the different modes of operation (targeted vs whole-genome etc). Many CLI parsers support sub-commands, this example adds a very simple subcommand argument grouping mechanism to ConfigBuilder.

The unit test gives an example. Subcommands are optional - current ConfigBuilder users not affected. When subcommands are configured, each gets its own set of argument specifications, and global level arguments will be shared across all sub-commands. When configured, the first argument to the application must be a valid subcommand. Multiple sub-commands in a single invocation are not allowed.

Something like the following:

# run myapp in cmd1 mode that requires -arg1 and -arg2. 
$ myapp cmd1 -arg1 val1 -arg2 val2 -global_arg global_val
# run myapp with cmd2 that requires -arg3.
$ myapp cmd2 -arg3 val1 -global_arg global_val

Does this seem like enough flexibility to cover the needs of ACTIN-273, while pushing argument consistency verification into ConfigBuilder and not having a bunch of custom verification in the application code?

kzuberihmf commented 11 months ago

An alternative to adding subcommands to the parser would be to have multiple entry points to orange, each with a set of CLI args tailored to a specific type of operation. This would have the advantage of not requiring a more complex argument parser. Thanks to @jbartletthmf for the suggestion.

kduyvesteyn commented 11 months ago

Somewhat related to @jbartletthmf suggestion, maybe a "lighter" version would be to have separate blocks of config that internally enforce consistency. RNAConfig is an example of this. This class leads to the situation where Orange either has no RNAConfig at all, or a completely filled in.

Maybe similar to this, we could add "WGSConfig" (to hold all configs that are run in WGS but not targeted) and "RefConfig" (to hold all configs to be present when a ref sample is present

kzuberihmf commented 11 months ago

Thanks for the discussion, closing this PR.