NRLMMD-GEOIPS / geoips

Main Geolocated Information Processing System code base with basic functionality enabled.
https://nrlmmd-geoips.github.io/geoips/
Other
14 stars 11 forks source link

CLI Class Organization #457

Open biosafetylvl5 opened 6 months ago

biosafetylvl5 commented 6 months ago

Description

Using standard UML notation, eg: -.-.- > (dashed arrow) means "implements" ----> (solid arrow) means "inherits from"

This is the current CLI Class organization: image

This is my proposal: image

Potential Benefits of Changing Structure

Using inheritance to track the structure

This would get rid of things like this:

class GeoipsList(GeoipsCommand):

"""GeoipsList Sub-Command for listing packages/scripts/interfaces/plugins."""

subcommand_name = "list"

subcommand_classes = [

GeoipsListSingleInterface,

GeoipsListInterfaces,

GeoipsListPackages,

GeoipsListPlugins,

GeoipsListScripts,

GeoipsListTestDatasets,

GeoipsListUnitTests,

]

which are "magic lists" used to track code. Instead inheritance could be used to auto-track things.

eg the above could become:

subcommand_name = "list"
subcommand_classes = self.__subclasses__()

This would make code simpler, maintaining the codebase easier, and contributions easier for newbies.

Less code duplication

For groups of commands like List, a singular print to console function could be used to decrease code duplication.

eg. in GeoipsListUnitTests the print-to-console functionality is in __call__ and is:

       headers = ["GeoIPS Package", "Unit Test Directory", "Unit Test Name"]
        print("-" * len("Available Unit Tests"))
        print("Available Unit Tests")
        print("-" * len("Available Unit Tests"))
        print(
            tabulate(
                unit_test_info,
                headers=headers,
                tablefmt="rounded_grid",
                # maxcolwidths=self.terminal_width // len(headers),
            )

Similarly, in GeoipsListTestDatasets the print-to-console functionality is also in __call__ and is:

        headers = ["Data Host", "Dataset Name"]
        print("-" * len("Available Test Datasets"))
        print("Available Test Datasets")
        print("-" * len("Available Test Datasets"))
        print(
            tabulate(
                dataset_info,
                headers=headers,
                tablefmt="rounded_grid",
                maxcolwidths=self.terminal_width // len(headers),
            )

This code could be refactored into GeoipsList, inherited and called like this:

self.print_to_console(headers, dataset_info)

This would be advantageous because:

  1. It decrease code duplication, and bring all the associated benefits
  2. Makes it easy to change the printing functionality in the future

Code structure should mirror functionality

This is an ideological take. Having code structure mirror functionality conveys the following benefits:

  1. Easier debugging
  2. Easier/cleaner code (and resulting documentation)
  3. Maintainability - "Code that is structured logically requires less effort to maintain. When the structure mirrors functionality, it's easier to identify where changes need to be made, anticipate the impact of those changes, and implement them without introducing errors."

This would make geoips list (and similar cmds) proper executable commands

Currently, these print the help for the commands. This functionality could be maintained, but this re-org allows for future capability to be added if need be.

Potential Drawbacks of Changing Structure

Methods would be inherited

If GeoipsListTestDatasets inherits from GeoipsList, it will inherent all the methods, and all the methods that are not name mangled would be callable.

@jsolbrig is concerned about this. See https://stackoverflow.com/a/7456865 for thoughts on the issue, and benefits/drawbacks to privatization via mangling.

evrose54 commented 3 months ago

This should be doable. We can use <class>.__subclasses__() to get a list of children classes that inherit from <class>, Ie. (GeoipsList).

We'd probably end up merging GeoipsCommand and GeoipsExecutableCommand into one consolidated GeoipsCommand class. This will require some restructuring for non-executable classes which inherit from GeoipsCommand (GeoipsList), but shouldn't be too much of a hassle.

We should also consider moving some functionality from GeoipsExecutableCommand to GeoipsList and GeoipsGet if applicable.