AFM-SPM / TopoStats

An AFM image analysis program to batch process data and obtain statistics from images
https://afm-spm.github.io/TopoStats/
GNU Lesser General Public License v3.0

Remodel entry points and Command Line Interface to use "Swiss Army knife" approach #517

Open ns-rse opened 1 year ago

ns-rse commented 1 year ago

I think it would be useful to revise the Command Line Interface (CLI) entry points to TopoStats. Currently there are two, run_topostats and toposum, but to make the CLI extensible I feel we should adopt what is termed the "Swiss Army knife" approach to Command Line Interfaces.

This is what programmes such as git, pre-commit and many others use: a single command for invocation, followed by a sub-command. Using pre-commit as an example (as it's written in Python and provides a good pattern to emulate)...

The main pre-commit command has the following help...

❱ pre-commit --help
usage: pre-commit [-h] [-V]
                  {autoupdate,clean,gc,init-templatedir,install,install-hooks,migrate-config,run,sample-config,try-repo,uninstall,validate-config,validate-manifest,help,hook-impl}
                  ...

positional arguments:
  {autoupdate,clean,gc,init-templatedir,install,install-hooks,migrate-config,run,sample-config,try-repo,uninstall,validate-config,validate-manifest,help,hook-impl}
    autoupdate          Auto-update pre-commit config to the latest repos' versions.
    clean               Clean out pre-commit files.
    gc                  Clean unused cached repos.
    init-templatedir    Install hook script in a directory intended for use with `git config init.templateDir`.
    install             Install the pre-commit script.
    install-hooks       Install hook environments for all environments in the config file. You may find `pre-commit install --install-hooks` more useful.
    migrate-config      Migrate list configuration to new map configuration.
    run                 Run hooks.
    sample-config       Produce a sample .pre-commit-config.yaml file
    try-repo            Try the hooks in a repository, useful for developing new hooks.
    uninstall           Uninstall the pre-commit script.
    validate-config     Validate .pre-commit-config.yaml files
    validate-manifest   Validate .pre-commit-hooks.yaml files
    help                Show help for a specific command.

options:
  -h, --help            show this help message and exit
  -V, --version         show program's version number and exit

That is, the positional arguments are sub-commands. Each sub-command then has its own options; for example, run has the following...

❱ pre-commit run --help
usage: pre-commit run [-h] [--color {auto,always,never}] [-c CONFIG] [--verbose] [--all-files | --files [FILES ...]] [--show-diff-on-failure]
                      [--hook-stage {commit,merge-commit,prepare-commit-msg,commit-msg,post-commit,manual,post-checkout,push,post-merge,post-rewrite}]
                      [--remote-branch REMOTE_BRANCH] [--local-branch LOCAL_BRANCH] [--from-ref FROM_REF] [--to-ref TO_REF] [--commit-msg-filename COMMIT_MSG_FILENAME]
                      [--prepare-commit-message-source PREPARE_COMMIT_MESSAGE_SOURCE] [--commit-object-name COMMIT_OBJECT_NAME] [--remote-name REMOTE_NAME]
                      [--remote-url REMOTE_URL] [--checkout-type CHECKOUT_TYPE] [--is-squash-merge IS_SQUASH_MERGE] [--rewrite-command REWRITE_COMMAND]
                      [hook]

positional arguments:
  hook                  A single hook-id to run

options:
  -h, --help            show this help message and exit
  --color {auto,always,never}
                        Whether to use color in output. Defaults to `auto`.
  -c CONFIG, --config CONFIG
                        Path to alternate config file
  --verbose, -v
  --all-files, -a       Run on all the files in the repo.
  --files [FILES ...]   Specific filenames to run hooks on.
  --show-diff-on-failure
                        When hooks fail, run `git diff` directly afterward.
  --hook-stage {commit,merge-commit,prepare-commit-msg,commit-msg,post-commit,manual,post-checkout,push,post-merge,post-rewrite}
                        The stage during which the hook is fired. One of commit, merge-commit, prepare-commit-msg, commit-msg, post-commit, manual, post-checkout, push, post-
                        merge, post-rewrite
  --remote-branch REMOTE_BRANCH
                        Remote branch ref used by `git push`.
  --local-branch LOCAL_BRANCH
                        Local branch ref used by `git push`.
  --from-ref FROM_REF, --source FROM_REF, -s FROM_REF
                        (for usage with `--to-ref`) -- this option represents the original ref in a `from_ref...to_ref` diff expression. For `pre-push` hooks, this represents the
                        branch you are pushing to. For `post-checkout` hooks, this represents the branch that was previously checked out.
  --to-ref TO_REF, --origin TO_REF, -o TO_REF
                        (for usage with `--from-ref`) -- this option represents the destination ref in a `from_ref...to_ref` diff expression. For `pre-push` hooks, this
                        represents the branch being pushed. For `post-checkout` hooks, this represents the branch that is now checked out.
  --commit-msg-filename COMMIT_MSG_FILENAME
                        Filename to check when running during `commit-msg`
  --prepare-commit-message-source PREPARE_COMMIT_MESSAGE_SOURCE
                        Source of the commit message (typically the second argument to .git/hooks/prepare-commit-msg)
  --commit-object-name COMMIT_OBJECT_NAME
                        Commit object name (typically the third argument to .git/hooks/prepare-commit-msg)
  --remote-name REMOTE_NAME
                        Remote name used by `git push`.
  --remote-url REMOTE_URL
                        Remote url used by `git push`.
  --checkout-type CHECKOUT_TYPE
                        Indicates whether the checkout was a branch checkout (changing branches, flag=1) or a file checkout (retrieving a file from the index, flag=0).
  --is-squash-merge IS_SQUASH_MERGE
                        During a post-merge hook, indicates whether the merge was a squash merge
  --rewrite-command REWRITE_COMMAND
                        During a post-rewrite hook, specifies the command that invoked the rewrite

I envisage replacing existing commands with the following (see also table below for further thoughts/details)...

| Current | Proposed |
|---|---|
| run_topostats | topostats process |
| toposum | topostats summarise |
| run_topostats --create-config-file | topostats config |

Implementation

Following the example of pre-commit this would entail introducing a topostats/main.py module to provide an entry point of topostats.main:main.

topostats/main.py then imports the commands from sub-modules under topostats/commands/*.py, one for each command.

The arguments for each sub-command are defined within main.py, in what may be an Abstract Factory design pattern (not quite sure on this front yet!).
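As a rough illustration of the layout described above, here is a minimal sketch using argparse sub-parsers from the standard library. The handler functions, option names and help strings are hypothetical placeholders rather than the actual TopoStats code, and in the proposed layout the handlers would be imported from topostats/commands/*.py instead of being defined inline.

```python
import argparse


# Hypothetical handlers; in the proposed layout each would live in its own
# module under topostats/commands/ and be imported into main.py.
def process(args: argparse.Namespace) -> str:
    return f"process: config_file={args.config_file}"


def summarise(args: argparse.Namespace) -> str:
    return f"summarise: config_file={args.config_file}"


def config(args: argparse.Namespace) -> str:
    return f"config: file={args.file}"


def create_parser() -> argparse.ArgumentParser:
    """Build the top-level parser with one sub-parser per sub-command."""
    parser = argparse.ArgumentParser(prog="topostats")
    subparsers = parser.add_subparsers(dest="command", required=True)

    process_parser = subparsers.add_parser("process", help="Run the full pipeline.")
    process_parser.add_argument("-c", "--config-file", default=None)
    process_parser.set_defaults(func=process)

    summarise_parser = subparsers.add_parser("summarise", help="Generate summary plots.")
    summarise_parser.add_argument("-c", "--config-file", default=None)
    summarise_parser.set_defaults(func=summarise)

    config_parser = subparsers.add_parser("config", help="Write a sample configuration.")
    config_parser.add_argument("-f", "--file", default="sample_config.yaml")
    config_parser.set_defaults(func=config)

    return parser


def main(argv=None) -> int:
    """Entry point; would be registered as topostats.main:main."""
    args = create_parser().parse_args(argv)
    # Dispatch to whichever handler the chosen sub-command registered.
    print(args.func(args))
    return 0
```

Calling main(["process", "--config-file", "my_config.yaml"]) dispatches to process(), while topostats --help and topostats process --help would each print their own usage, mirroring the pre-commit behaviour shown earlier.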

Additional Changes

All documentation (README.md, docs/usage.md etc.) would also require updating to reflect these changes.

Modules to Add

The groundwork for this has been laid thanks to @SylviaWhittle's work in #540. We now need to further modularise the CLI so that there is an individual step corresponding to each class, as classes are how we delineate the processing steps in the code. Each step saves its results for use by subsequent steps, courtesy of #613, which introduced io.save_topostats_file() to save the current state to an HDF5 file.
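To make the idea of steps handing saved state to one another more concrete, here is a minimal sketch. It is not the TopoStats implementation: save_state() and load_state() are hypothetical stand-ins for io.save_topostats_file() that write JSON rather than HDF5, the stage names are taken from the sub-commands proposed in this issue, and the strictly linear stage order is a simplifying assumption.

```python
import json
import tempfile
from pathlib import Path

# Assumed linear stage order (a simplification for this sketch).
STAGES = ["filter", "grains", "grain_stats", "dnatracing", "curvature"]


def save_state(state: dict, path: Path) -> None:
    """Hypothetical stand-in for io.save_topostats_file(); JSON, not HDF5."""
    path.write_text(json.dumps(state))


def load_state(path: Path) -> dict:
    """Load previously saved state, or start fresh if nothing has been saved."""
    return json.loads(path.read_text()) if path.exists() else {"completed": []}


def run_stage(stage: str, path: Path) -> dict:
    """Run one stage, refusing to start if an earlier stage has not been saved."""
    state = load_state(path)
    missing = [s for s in STAGES[: STAGES.index(stage)] if s not in state["completed"]]
    if missing:
        raise RuntimeError(f"Cannot run '{stage}' before {missing}")
    # Real processing would happen here; the sketch only records completion.
    if stage not in state["completed"]:
        state["completed"].append(stage)
    save_state(state, path)
    return state


with tempfile.TemporaryDirectory() as tmp:
    state_file = Path(tmp) / "image_state.json"
    run_stage("filter", state_file)
    print(run_stage("grains", state_file)["completed"])  # ['filter', 'grains']
```

Running a later stage such as dnatracing against a fresh state file raises a RuntimeError, which is one way the CLI could tell users that an earlier step needs running first.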

ns-rse commented 1 year ago

Documenting possible structure/options

| Command | Option(s) | Description |
|---|---|---|
| config | --copy | Make a straight copy of topostats/default_config.yaml; this will include the field descriptors and satisfy #536. |
| | --create | Loads topostats/default_config.yaml and updates any options with those specified on the command line, e.g. --output ~/somewhere/else would update the output value. This would lose the field descriptors requested in #253. |
| | --plotting-dictionary | Generate a sample plotting dictionary from topostats/plotting_dictionary.yaml. |
| | --file | File to write output to (default would be sample_config.yaml for default_config.yaml variants, or plotting_config.yaml if --plotting-dictionary is requested). |
| process | <config_options> | Run topostats, modifying the topostats/default_config.yaml with any specified command line options. |
| filter | <config_options> | Run just the filtering stage of processing. |
| grains | <config_options> | Run grain detection on filtered NumPy arrays. |
| grain_stats | <config_options> | Run grain statistics calculations on grain detected NumPy arrays. |
| dnatracing | <config_options> | Run tracing on detected grains. |
| curvature | <config_options> | Calculate curvature from traced NumPy arrays. |
| summarise | <config_options> | Run summary plot generation along with specific options. |

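The --create behaviour described in the table (load the defaults, then overwrite only what was given on the command line) could be sketched as follows. The dictionary keys and the --cores option are hypothetical placeholders; a real implementation would load topostats/default_config.yaml rather than use an inline dictionary.

```python
import argparse

# Hypothetical subset of defaults standing in for topostats/default_config.yaml.
DEFAULT_CONFIG = {"base_dir": "./", "output": "./output", "cores": 2}


def update_config(config: dict, args: argparse.Namespace) -> dict:
    """Return a copy of config, overriding only options set on the command line."""
    overrides = {
        key: value
        for key, value in vars(args).items()
        if key in config and value is not None
    }
    return {**config, **overrides}


parser = argparse.ArgumentParser(prog="topostats config")
# Defaults of None let us tell "not given" apart from an explicit value.
parser.add_argument("--output", default=None)
parser.add_argument("--cores", type=int, default=None)

args = parser.parse_args(["--output", "~/somewhere/else"])
updated = update_config(DEFAULT_CONFIG, args)
print(updated)  # {'base_dir': './', 'output': '~/somewhere/else', 'cores': 2}
```

Because the merge works on the parsed values rather than the YAML text, any comments in the defaults file would not survive, which matches the note above that --create would lose the field descriptors.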
ns-rse commented 1 year ago

Related issues

ns-rse commented 1 year ago

Remember to ensure basename is derived for grainstats_df as well as tracing_stats_df.

ns-rse commented 11 months ago

Re-opening to undertake further modularisation as per table.

Individual issues have been created to address each step in the processing; this issue will serve as an Epic and be closed when each is completed.