additional CLI wrapper (on top of HTTP/flask app)

keighrim commented 3 months ago

New Feature Summary

Running a CLAMS as a stateless HTTP app makes app invocation and data transfer clean and standardized, and most importantly, language-agnostic interface. However it comes with cost of computational overhead of running server processes and handling TCP communication.

At the moment, all CLAMS apps developed in the team is written under Python, with the support of SDK, so to provide an additional interface to users maybe for quicker test/debug, we'd like to include a wrapper for unix-shell command line interface, just like we are providing flask wrapper to turn the app code into a HTTP web app.

Basic idea;

When a developer uses clams develop to start a new app, it generates
- app.py: main implementation of the App class
- metadata.py: defines app metadata and runtime parameters
- (new) cli.py: a completely unmanned interpreter of metadata.py + app.py to provide a standard POSIX CLI
users can use app.py to spin up a webapp and call the app with URL plus query string for runtime parameters, or directly call cli.py with shell arguments for runtime parameters
input MMIF and output MMIF should just use stdin and stdout respectively.

Some random notes;

CLI implementation is not language-agnostic, hence cli.py can be a recommended implementation, but shouldn't be a requirement for CLAMS app specification.
as a python program (probably based on arg_parser module), there will be some basic expectation for some flags (--help, --version, etc) but since we don't put any constraints for app developers to pick a parameter name (as long as it is URL-compatible), those basic flags should be disabled (an app can take version= runtime parameter, for example).
as an equivalent to GET request, users can just run metadata.py to get JSON-formatted app metadata.

Will add more as we discuss more details.

Alternatives

No response

Additional context

No response

MrSqually commented 2 months ago

A basic implementation (with map support) is available on this branch of SWT. Next steps are pulling out the local imports (it currently gets the metadata by importing SWT's metadata.py). The last big challenge I can see for this is invoking the app constructor in a way that's agnostic to the app details (for instance, SWT uses a "preconf" file to do some initial configuration). creating an interface to facilitate this is doable, but we should discuss implementation before I attempt it.

keighrim commented 2 months ago

Both points are very valid. Regarding the first issue (imports in metadata.py), I had to change the app registration GHA to install the full requirements.txt to handle import yaml line in the SWT's metadata.py. (https://github.com/clamsproject/apps/commit/e8d7be3fbfe381f5fb6a32ad83fa5ce658c1d170) For cli.py though, it's not a bad idea to install requirements.txt anyway...?

The second question raises more bigger question, I think. Should we disallow developers to change the signature of App.__init__() method? I don't think that's a good idea. Without prohibiting customized constructors, we can apply a single-like pattern in the app template. For example; from https://github.com/clamsproject/clams-python/blob/c96c55ed24fbc9d1284724cf5db38a058e0004ec/clams/develop/templates/app/app.py.template#L42-L63

to


# this signature never changes
def get_app():
    # but developers can change constructor signature and customize this line
    return $APP_CLASS_NAME(init, params, from, global, variables)

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--port", action="store", default="5000", help="set port to listen")
    parser.add_argument("--production", action="store_true", help="run gunicorn server")
    # add more arguments as needed
    # parser.add_argument(more_arg...)

    parsed_args = parser.parse_args()

    ## SET SOME global variable to use in the get_app()
    # create the app instance
    app = get_app()

    http_app = Restifier(app, port=int(parsed_args.port))
    # for running the application in production mode
    if parsed_args.production:
        http_app.serve_production()
    # development mode
    else:
        app.logger.setLevel(logging.DEBUG)
        http_app.run()

then in cli.py

# do prep 
app = app.get_app()
# continue with app instance

This means all existing apps need to be modified to adopt this pattern, but they need to be done so anyway to add cli.py. However, I'm more worried about the part where global variables are involved, which usually results in some unexpected sideeffects. Any thoughts?

MrSqually commented 2 months ago

regarding the first issue - just noticed that the AppMetadata is defined in the app.__init__ as a set of parameters with defaults, but without instantiating the values to be used - this means I can just access app.metadata after instantiation and pass them to the parser this way. I'm pretty sure this solves the coupling between metadata.py and cli.py that I was struggling with.

for the second issue - I think that using a factory method makes sense there, if you don't see any other issues with it I think that's probably the best move going forward. The global variable issue is interesting, could you give an example of an app that has "unexpected side effects"? I'm much more inclined to say "disallow problematic globals for most contexts" instead of "disallow modification of App.__init__ signature" but I'm not sure what kinds of problems global vars are causing in practice so I don't know if that's reasonable (one problem I can see is that the if __name__ == "__main__" is usually in global scope).

keighrim commented 2 months ago

The SWT app is already using a global variable to store the default config file name anyway. Probably I was worrying too much about the global variables in my previous comment. Using the singleton-like pattern actually adds additional abstract on the app class name, so the cli.py doesn't even have to know the class name to initialize. So I guess that's a benefit.

ps. I tried to understand the experimental cli.py code in the swt repo (https://github.com/clamsproject/app-swt-detection/blob/test_cli_interface/cli.py) and it doesn't run at all. @MrSqually Can you take a look at it?

MrSqually commented 2 months ago

So, the actual "run the app" portion of the code is commented out because I was print-debugging the argument parser (see main). Is that the issue?

keighrim commented 3 weeks ago

closed via #217

clamsproject / clams-python