NRLMMD-GEOIPS / geoips

Main Geolocated Information Processing System code base with basic functionality enabled.
https://nrlmmd-geoips.github.io/geoips/

CLI Stage Two #455

Closed evrose54 closed 3 weeks ago

evrose54 commented 3 months ago

Requested Update

Description

With the new PR #444, which creates the initial framework and functionality for the GeoIPS CLI, we are in a position to keep developing the CLI so it can perform more complex processes. Currently, the CLI supports the geoips [config, get, list, run, test, validate] commands; however, we'd like to modify these to expand their use cases. This issue proposes refactoring geoips run and geoips list so that each supports a wider range of actions.

Background and Motivation

We envision the geoips run command calling GeoIPS' run_procflow command directly, which executes process workflows with the provided arguments. Currently, geoips run is called via geoips run <script_name> <-p> <package_name>. While this works, it is not how we want the command to behave going forward. This issue requests that we modify geoips run to call run_procflow instead.

We don't want to replace the functionality of geoips list; rather, we want to expand it so that users can select what type of output is sent to the terminal. Depending on the subcommand stemming from geoips list, you currently get a generated list of data with static column headers. We should modify the geoips list command to support additional flags such as --column [header1, header2, ...] so that users can specify exactly which columns they'd like output. We'd also like to support flags such as --short or --long, each of which corresponds to a predefined list of headers to output.
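A minimal sketch of how such flags could be wired up with argparse, for discussion only. The header names in SHORT_HEADERS and LONG_HEADERS, the helper names, and the choice to make the three flags mutually exclusive are all assumptions for illustration; none of this is taken from the GeoIPS code base.

```python
import argparse

# Hypothetical header sets; the real column names would depend on each
# "geoips list" subcommand.
SHORT_HEADERS = ["package", "name"]
LONG_HEADERS = ["package", "name", "family", "description", "path"]


def build_list_parser():
    """Build an illustrative parser with --column, --short and --long."""
    parser = argparse.ArgumentParser(prog="geoips list")
    # Assumption: only one of the three flags may be given at a time.
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--column", nargs="+", choices=LONG_HEADERS, default=None)
    group.add_argument("--short", action="store_true")
    group.add_argument("--long", action="store_true")
    return parser


def select_headers(args):
    """Return the headers to print, defaulting to the long set."""
    if args.column:
        return list(args.column)
    if args.short:
        return list(SHORT_HEADERS)
    return list(LONG_HEADERS)


def format_rows(rows, headers):
    """Render a list of dicts as tab-separated text restricted to headers."""
    lines = ["\t".join(headers)]
    for row in rows:
        lines.append("\t".join(str(row.get(h, "")) for h in headers))
    return "\n".join(lines)
```

With this layout, something like geoips list --column name package would print only those two columns in that order, while --short and --long would select one of the predefined header lists.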

Code to demonstrate issue

Checklist for Completion

evrose54 commented 3 months ago

For geoips run, we additionally need to consider arguments that are used by other procflow plugins such as data_fusion and config_based. While single_source and config_based consider the same arguments, this might not always be the case. The signature of main in geoips/commandline/run_procflow.py allows a specific get_commandline_args_func to be passed in. If this function is specified, we add arguments specific to the current procflow to an argparse.ArgumentParser object. Since we don't know which procflow is being run until we've parsed the ArgumentParser's Namespace, we have a few options.

1:

2:

One more issue after this PR to finish up updates as needed
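For reference on the procflow-specific arguments discussed in this comment, one common argparse pattern for "we don't know which arguments apply until we've parsed something" is two-pass parsing with parse_known_args. The sketch below is not necessarily either of the options considered in this thread. The --procflow flag as used here, the --fusion_option flag, and every function name are hypothetical stand-ins rather than the real GeoIPS or data_fusion arguments.

```python
import argparse


def add_common_args(parser):
    """Hypothetical stand-in for the arguments shared by all procflows."""
    parser.add_argument("filenames", nargs="*")
    parser.add_argument("--procflow", default="single_source")


def add_fusion_args(parser):
    """Hypothetical stand-in for arguments only one procflow understands."""
    parser.add_argument("--fusion_option", default=None)


# Illustrative mapping from procflow name to its extra-argument function.
EXTRA_ARG_FUNCS = {"data_fusion": add_fusion_args}


def parse_run_args(argv):
    """Parse argv in two passes so procflow-specific arguments can be added."""
    # Pass 1: look only at --procflow and ignore everything else.
    pre_parser = argparse.ArgumentParser(add_help=False, allow_abbrev=False)
    pre_parser.add_argument("--procflow", default="single_source")
    known, _unknown = pre_parser.parse_known_args(argv)

    # Pass 2: build the full parser now that the procflow is known.
    parser = argparse.ArgumentParser(prog="geoips run")
    add_common_args(parser)
    extra = EXTRA_ARG_FUNCS.get(known.procflow)
    if extra is not None:
        extra(parser)
    return parser.parse_args(argv)
```

The trade-off is that arguments belonging to a different procflow are rejected as unrecognized, which may or may not be the behavior wanted here.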

evrose54 commented 3 months ago

Currently, I have GeoipsRun (the command class) implemented as follows, which works the same way run_procflow was previously called. Please let me know if you have an opinion on which option you would prefer (right now, the checks for duplicate arguments are done natively by argparse).

"""GeoIPS CLI "run" command.

Runs the appropriate script based on the args provided.
"""

from geoips.commandline.args import add_args
from geoips.commandline.run_procflow import main
from geoips.commandline.geoips_command import GeoipsExecutableCommand
from data_fusion.commandline.args import add_args as data_fusion_add_args

class GeoipsRun(GeoipsExecutableCommand):
    """Run Sub-Command for running process-workflows (procflows)."""

    subcommand_name = "run"
    subcommand_classes = []

    def add_arguments(self):
        """Add arguments to the run-subparser for the Run Command."""
        add_args(parser=self.subcommand_parser)
        data_fusion_add_args(parser=self.subcommand_parser)

    def __call__(self, args):
        """Run the provided GeoIPS command.

        Parameters
        ----------
        args : argparse.Namespace
            The argument namespace to parse through.
        """
        # Hand the already-parsed namespace straight to run_procflow's main.
        main(ARGS=args)
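To illustrate what "checks are done natively by argparse for duplicate arguments" means in practice: when two add-args functions register the same option string on one parser, argparse raises argparse.ArgumentError by default, and passing conflict_handler="resolve" lets the later definition replace the earlier one. The sketch below uses hypothetical add_args_a / add_args_b functions and made-up option names, not the real geoips or data_fusion add_args.

```python
import argparse


def add_args_a(parser):
    """Hypothetical first set of arguments."""
    parser.add_argument("--output", default=None)


def add_args_b(parser):
    """Hypothetical second set that reuses the same option string."""
    parser.add_argument("--output", default=None)
    parser.add_argument("--extra", default=None)


def try_combine(conflict_handler="error"):
    """Add both argument sets to one parser and report what happens."""
    parser = argparse.ArgumentParser(conflict_handler=conflict_handler)
    add_args_a(parser)
    try:
        add_args_b(parser)
    except argparse.ArgumentError as err:
        # Default behavior: a duplicate option string is rejected.
        return f"conflict: {err}"
    return "ok"
```

Calling try_combine() reports a conflict on --output, while try_combine("resolve") succeeds. Whether silently overriding is acceptable depends on whether the two packages really mean the same thing by a shared flag.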