ersilia-os / ersilia

The Ersilia Model Hub, a repository of AI/ML models for infectious and neglected disease research.
https://ersilia.io
GNU General Public License v3.0

🐕 Batch: Clean up Ersilia CLI #1262

Open DhanshreeA opened 2 months ago

DhanshreeA commented 2 months ago

Summary

The Ersilia CLI currently has several commands, some of which are unused, redundant (i.e., their functionality is a subset of another command), or outdated. This issue tracks each individual command and the roadmap for it.

| Command | Intended Function | Working Status | Roadmap |
| --- | --- | --- | --- |
| api | Runs a given API; currently only works as run. Cannot run tracking because tracking is currently only implemented within code executed during the run command. | Outdated | Useful when we have the functionality to have more APIs in ersilia, specifically when we want models to be trainable. Presently it can be de-registered from the CLI. |
| auth | Logs a user into GitHub to switch between developer and user mode. A wrapper around the GitHub Auth API. | WIP | To be used when Ersilia implements user management. |
| card | Gets card information for a given model. | Functional | Redundant; should be merged with catalog as a flag. |
| catalog | Prints Ersilia's Model Catalog. | Functional | Incorporate card as a flag. |
| clear | Removes eos and bentoml folders, and conda environments. Not complete, since it doesn't remove images and containers. | Outdated | 1. Rename to uninstall. 2. Add functionality to remove ersilia itself. 3. This can potentially use the delete --all functionality within Ersilia to also remove model images. |
| close | Closes a served model. | Functional | NA |
| current | Returns the currently served model's id. Subset functionality of 'info'. | Functional, outdated | To be removed. |
| delete | Deletes a given model. | Functional | We need an --all flag to remove all models. |
| example | Generates example input for a served model. We added a predefined flag which first looks for the presence of an example input file. | Functional | Add a flag to sample model ids at random as well, merging the 'sample' command. |
| fetch | Fetches a model. | Functional | NA |
| info | Prints information for the current model. | Functional | |
| inspect | Inspects a model repository for completeness with respect to dependencies and files, and checks for extraneous files. It also determines model performance by running it against an increasing number of inputs. | MVP | 1. This needs to be tested with the new eos-template. 2. It has overlapping functionality with the test command and can therefore potentially be merged with test. |
| run | Runs a model. | Functional | NA |
| sample | Generates both a sample of inputs and models. | Functional | Should be merged with example. |
| serve | Serves an ersilia model. | Functional | The --track flag is MVP. It needs to be implemented with models run through conda. |
| test | Tests an ersilia model. | MVP | 1. Needs more testing. 2. It needs to be validated against models built with the new template. |
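
To make the "merge as a flag" items in the roadmap concrete, here is a minimal sketch of how a standalone `card` command could become a `--card` option on `catalog`, and how `delete` could accept `--all`. It uses Python's standard `argparse` purely for illustration; the parser name `ersilia-sketch`, the function names, and the returned strings are all hypothetical and are not taken from Ersilia's code base, whose actual CLI wiring may look quite different.

```python
import argparse
from typing import List


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser illustrating commands folded into flags."""
    parser = argparse.ArgumentParser(prog="ersilia-sketch")
    sub = parser.add_subparsers(dest="command", required=True)

    # 'catalog' absorbs the former standalone 'card' command as an option.
    catalog = sub.add_parser("catalog", help="Print the model catalog")
    catalog.add_argument("--card", metavar="MODEL_ID", default=None,
                         help="Print the card of one model instead")

    # 'delete' accepts either a single model id or --all.
    delete = sub.add_parser("delete", help="Delete one model or all models")
    delete.add_argument("model", nargs="?", default=None)
    delete.add_argument("--all", action="store_true", dest="delete_all")
    return parser


def describe(argv: List[str]) -> str:
    """Return a description of what the parsed command line would do."""
    args = build_parser().parse_args(argv)
    if args.command == "catalog":
        if args.card is not None:
            return f"show card for {args.card}"
        return "list catalog"
    # Remaining case: the 'delete' subcommand.
    if args.delete_all and args.model is not None:
        raise SystemExit("pass either a model id or --all, not both")
    if args.delete_all:
        return "delete all models"
    if args.model is None:
        raise SystemExit("delete needs a model id or --all")
    return f"delete {args.model}"
```

Under this layout, `describe(["catalog", "--card", "eos3b5e"])` takes the card branch, while `describe(["catalog"])` falls back to listing the catalog, so the old `card` entry point is no longer needed.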

There is also, apparently, a developer command called deploy that is not registered in the CLI. There appears to be no roadmap for using it, especially since it is tied to Heroku (which we are not using in our infra at present). @miquelduranfrigola thoughts?

GemmaTuron commented 2 weeks ago

Hi @DhanshreeA Can you tick from the list what has been completed and update the issue?

DhanshreeA commented 2 weeks ago

Hi @GemmaTuron: This is the present status of the CLI commands. While writing and testing this again, I realized there are still some loose ends to tie up on this front. I'll create issues for them.

| Command | Flags/Options | Description |
| --- | --- | --- |
| auth | | Wrapper around GitHub login. Using this assumes that the user is an ersilia contributor. |
| catalog | -f, --file_name | Write the catalog to a file. |
| catalog | --browser | Opens the link to the Airtable-maintained catalog; same as when both --hub and --browser are set. |
| catalog | --more/--less | Print more or less information about the catalog. When less information is requested, only the identifier is printed. These flags work with both local and hub settings; however, --more is very slow with the hub catalog. |
| catalog | --card | Prints the card for a given model id. The model does not need to be available locally. |
| catalog | --as-table | Prints the catalog as an ASCII table. Works with both local and hub flags. |
| catalog | -l, --local/--hub | Prints the catalog of models available locally on the user's system, or of models available in the hub. |
| close | | Closes the model running in the shell from which this command is executed. Does not close models running in other shells. |
| delete | | Deletes the model specified by its eos id. |
| delete | --all | Deletes all the models available on the user's system. |
| example | -n, --n_samples | Generates the specified number of input examples for a given model. The model can be specified via its eos id or is inferred from the current session if a model is already served. |
| example | -f, --file_name | Specify the file to write generated examples to. |
| example | -s, --simple/ -c, --complete | Specify whether to generate only SMILES or also other fields such as InChIKeys. |
| example | -p, --predefined/ -r, --random | Specify whether to sample from a predefined set of examples in the model's repository or from maintained inputs. |
| fetch | -r, --repo_path | Fetch a locally available model by providing its repository path. |
| fetch | -m, --mode | Packing mode, usually one of conda or docker. Not used or actively maintained. |
| fetch | --dockerize/ --not-dockerize | Whether to dockerize a model or not. Not used or actively maintained. |
| fetch | --overwrite/ --reuse | Whether to overwrite or reuse a model's conda environment. This is helpful when the model is fetched from source. |
| fetch | --from_github | Whether to fetch a model from its GitHub repository. This fetches the model from source. |
| fetch | --from_dockerhub | Whether to fetch a model using its Docker image maintained in Ersilia's Container Registry. |
| fetch | --from_s3 | Whether to fetch a model from its source stored in Ersilia's S3 buckets. |
| fetch | --from_hosted | Whether to fetch a model available as a web service and hosted at the given URL. The user provides this URL. |
| fetch | --with_bentoml | This is a contributor flag, and should be used while fetching a model from source. It decides whether to use the legacy BentoML bundling strategy for a given model. |
| fetch | --with_fastapi | This is a contributor flag, and should be used while fetching a model from source. It decides whether to use the upgraded FastAPI bundling strategy for a given model. |
| info | | Dumps the information, usually a model card or metadata, for a served model. The command should be run in the same shell in which the model is served. |
| inspect | | Inspects a model specified by its eos id. |
| run | | Runs the model served in the session in which this command is run. |
| run | -i, --input | Specify the input for the model. This can be free-text input or the path to a data file. |
| run | -o, --output | Specify the output file to store the model predictions. |
| run | -b, --batch_size | Specify the batch size for the input data. This is the batch size with which queries are made to the model server. The default batch size is 100. |
| run | --standard | Whether a given run is standard or not. For a standard run, Ersilia does not perform sanity checks on the input, thereby making it faster. |
| serve | --output-source | Whether to get outputs from the locally served model, or from a cloud-based precalculation store. |
| serve | --lake/ --no-lake | Whether to use a precalculation lake or not. This is deprecated and no longer actively maintained. |
| serve | --docker/ --no-docker | Whether to serve a model using Docker or not. This doesn't really work. |
| serve | -p, --port | Specify the port on which to serve a model. |
| serve | -t, --track | Specify whether to enable monitoring for the served model. All model runs will be tracked as long as the model is served. These results will be stored in Ersilia's S3 if the user has the correct AWS permissions. |
| test | | Tests a given model and obtains its performance metrics. |
| test | -o, --output | Serialize the results of the test to the given output file. |
| uninstall | | Uninstalls all ersilia-related artifacts and models, as well as ersilia itself. |
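
As a rough illustration of how the commands listed here fit together in one session, the sketch below assembles the command lines for a fetch, serve, run, close sequence. It only builds the argument lists and does not execute them, since `run` and `close` act on the model served in the current shell. The helper name `session_commands` and the output file name `output.csv` are assumptions made for this example; the flags `-i`, `-o`, and `-b` are the ones listed for `run`.

```python
import shlex
from typing import List, Optional


def session_commands(model_id: str, model_input: str,
                     output: Optional[str] = None,
                     batch_size: Optional[int] = None) -> List[List[str]]:
    """Build (but do not run) the command lines for one model session."""
    run = ["ersilia", "run", "-i", model_input]
    if output is not None:
        run += ["-o", output]            # file to store the predictions
    if batch_size is not None:
        run += ["-b", str(batch_size)]   # batch size for server queries
    return [
        ["ersilia", "fetch", model_id],
        ["ersilia", "serve", model_id],
        run,
        ["ersilia", "close"],
    ]


if __name__ == "__main__":
    # 'output.csv' is a made-up file name used only for illustration.
    for argv in session_commands("eos3b5e", "CCCOCCC", output="output.csv"):
        print(shlex.join(argv))
```

The printed lines are meant to be typed into a single shell, in order, so that `run` and `close` see the model started by `serve`.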

miquelduranfrigola commented 1 week ago

Thanks @DhanshreeA this is useful.

GemmaTuron commented 3 days ago

@DhanshreeA for the card command now, how does it work if I want to pass a model id?

ersilia catalog --card eosxxxx

DhanshreeA commented 1 day ago

@GemmaTuron exactly. For example, this is what I get with ersilia catalog --card eos3b5e (a model that is currently not available on my system):

{
    "Identifier": "eos3b5e",
    "Input": [
        "Compound"
    ],
    "Mode": "Pretrained",
    "GitHub": "https://github.com/ersilia-os/eos3b5e",
    "Publication": "https://www.rdkit.org/docs/RDKit_Book.html",
    "Source Code": "https://github.com/rdkit/rdkit",
    "License": "BSD-3.0",
    "Output": [
        "Other value"
    ],
    "Description": "The model is simply an implementation of the function Descriptors.MolWt of the chemoinformatics package RDKIT. It takes as input a small molecule (SMILES) and calculates its molecular weight in g/mol.\n",
    "Status": "Ready",
    "Slug": "molecular-weight",
    "Title": "Molecular weight",
    "Tag": [
        "Molecular weight"
    ],
    "Input Shape": "Single",
    "Interpretation": "Calculated molecular weight (g/mol)",
    "Task": [
        "Regression"
    ],
    "Contributor": "miquelduranfrigola",
    "Output Shape": "Single",
    "Output Type": [
        "Float"
    ],
    "DockerHub": "https://hub.docker.com/r/ersiliaos/eos3b5e",
    "Docker Architecture": [
        "AMD64",
        "ARM64"
    ],
    "S3": "https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos3b5e.zip",
    "Runtime": [
        "CPU"
    ],
    "Deployment": "Local",
    "Repository": {
        "label": "GitHub",
        "url": "https://github.com/ersilia-os/eos3b5e"
    },
    "Code": "$ ersilia serve molecular-weight\n$ ersilia api -i 'CCCOCCC'\n$ ersilia close",
    "Calculation": "https://github.com/miquelduranfrigola",
    "Incorporation Date": "2021-09-13",
    "Incorporation Quarter": "Q3",
    "Incorporation Year": "2021"
}
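
Since the card is plain JSON, it can be checked programmatically. The sketch below parses a trimmed copy of the card above and lists the ersilia subcommands used in its "Code" snippet, then compares them with the commands in the status table earlier in this thread. The helper name `card_commands` and the `LISTED_COMMANDS` set are assumptions for illustration; a command missing from that table has not necessarily been removed from the CLI.

```python
import json
from typing import Dict, List

# Trimmed copy of the card shown above (only the fields used here).
CARD_TEXT = r"""
{
    "Identifier": "eos3b5e",
    "Slug": "molecular-weight",
    "Status": "Ready",
    "Code": "$ ersilia serve molecular-weight\n$ ersilia api -i 'CCCOCCC'\n$ ersilia close"
}
"""

# Commands that appear in the status table earlier in this thread.
LISTED_COMMANDS = {
    "auth", "catalog", "close", "delete", "example", "fetch",
    "info", "inspect", "run", "serve", "test", "uninstall",
}


def card_commands(card: Dict) -> List[str]:
    """Return the ersilia subcommands used in the card's "Code" snippet."""
    commands = []
    for line in card.get("Code", "").splitlines():
        tokens = line.lstrip("$ ").split()   # drop the shell prompt
        if len(tokens) >= 2 and tokens[0] == "ersilia":
            commands.append(tokens[1])
    return commands


card = json.loads(CARD_TEXT)
used = card_commands(card)
unlisted = [name for name in used if name not in LISTED_COMMANDS]
print(card["Identifier"], used, unlisted)
```

For this card, the snippet in "Code" still calls `api`, which the first comment marks as outdated, so card metadata may need a pass once the CLI cleanup settles.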