ersilia-os / ersilia

The Ersilia Model Hub, a repository of AI/ML models for infectious and neglected disease research.
https://ersilia.io
GNU General Public License v3.0
224 stars 147 forks

Make fetch fail if the standard run doesn't work #1386

Closed DhanshreeA closed 3 days ago

DhanshreeA commented 3 days ago

Logs below:

00:16:09 | DEBUG    | Activation done
00:16:09 | DEBUG    | Packing command successfully run inside eos8ub5 conda environment
00:16:09 | DEBUG    | Creating model symlink bundle > dest
00:16:09 | DEBUG    | Creating symlink from /home/dee/eos/repository/eos8ub5/20241119-9eadd098-3169-4cac-8e85-697055a510a0/model
00:16:09 | DEBUG    | Creating symlink to /home/dee/eos/dest/eos8ub5/model
00:16:09 | INFO     | Could not create symbolic link from /home/dee/eos/dest/eos8ub5/data.h5 to /home/dee/eos/isaura/lake/eos8ub5_public.h5
00:16:09 | DEBUG    | Symlinks created
00:16:09 | DEBUG    | Getting model card of eos8ub5
00:16:09 | DEBUG    | Trying to get metadata from: /home/dee/eos/dest/eos8ub5
00:16:09 | DEBUG    | Card saved at /home/dee/eos/dest/eos8ub5/card.json
00:16:09 | DEBUG    | Saving slug chemical-space-projections-coconut
00:16:09 | DEBUG    | Checking that autoservice works
00:16:09 | DEBUG    | Setting BentoML AutoService for eos8ub5
00:16:09 | DEBUG    | No service class provided, deciding automatically
00:16:09 | DEBUG    | No service class file exists in /home/dee/eos/repository/eos8ub5/20241119-9eadd098-3169-4cac-8e85-697055a510a0/service_class.txt
00:16:09 | DEBUG    | Pack method is: fastapi
00:16:09 | DEBUG    | Pack method is: fastapi
00:16:09 | DEBUG    | Setting virtual environment at /home/dee/eos/dest/eos8ub5
00:16:09 | DEBUG    | Pack method is: fastapi
00:16:10 | DEBUG    | Pack method is: fastapi
00:16:10 | DEBUG    | Service class: conda
00:16:10 | DEBUG    | Getting APIs from list file
00:16:10 | DEBUG    | Getting APIs from FastAPI
00:16:10 | DEBUG    | Sniffing model
00:16:10 | DEBUG    | Getting model size
00:16:10 | DEBUG    | Model size is 4736.359976768494 MB
00:16:10 | DEBUG    | Fetching eos8ub5 done in time: 0:00:28.725089s
00:16:10 | INFO     | Fetching eos8ub5 done successfully: 0:00:28.725089
00:16:10 | DEBUG    | Running standard CSV example
00:16:10 | DEBUG    | /home/dee/eos/dest/eos8ub5/example_standard_input.csv
00:16:10 | DEBUG    | /home/dee/eos/dest/eos8ub5/example_standard_output.csv
00:16:10 | DEBUG    | Usage: ersilia [OPTIONS] COMMAND [ARGS]...

  🦠 Welcome to Ersilia! 💊

Options:
  --version      Show the version and exit.
  -v, --verbose  Show logging on terminal when running commands.
  -s, --silent   Do not echo any progress message.
  --help         Show this message and exit.

Commands:
  auth       Log in to ersilia to enter contributor mode.
  catalog    List a catalog of models
  close      Close model
  delete     Delete model from local computer
  example    Generate input examples for the model of interest
  fetch      Fetch model from Ersilia Model Hub
  info       Get model information
  inspect    Inspect model
  run        Run a served model
  sample     Sample inputs and model identifiers
  serve      Serve model
  test       Test a model
  uninstall  Uninstall ersilia

00:16:10 | DEBUG    | No need to use Conda!
🚀 Serving model eos8ub5: chemical-space-projections-coconut

   URL: http://0.0.0.0:55307
   PID: 59444
   SRV: conda
   Output source: local-only

👉 To run model:
   - run

💁 Information:
   - info
00:16:15 | ERROR    | Ersilia exception class:
StandardModelExampleError

Detailed error:
Standard model run from CSV was not possible for model eos8ub5
Output file /home/dee/eos/dest/eos8ub5/example_standard_output.csv was not created successfully

Hints:
If you fetch this model from Docker Hub, or you are running it through URL, this is the first time run is executed in your local computer. Reach out to Ersilia to get specific help.

🚨🚨🚨 Something went wrong with Ersilia 🚨🚨🚨

Error message:

Ersilia exception class:
StandardModelExampleError

Detailed error:
Standard model run from CSV was not possible for model eos8ub5
Output file /home/dee/eos/dest/eos8ub5/example_standard_output.csv was not created successfully

Hints:
If you fetch this model from Docker Hub, or you are running it through URL, this is the first time run is executed in your local computer. Reach out to Ersilia to get specific help.

If this error message is not helpful, open an issue at:
 - https://github.com/ersilia-os/ersilia
Or feel free to reach out to us at:
 - hello[at]ersilia.io

If you haven't, try to run your command in verbose mode (-v in the CLI)
 - You will find the console log file in: /home/dee/eos/current.log
00:16:15 | DEBUG    | Standard model example failed, deleting artifacts
Do you want to delete the model artifacts? [Y/n]
00:16:20 | INFO     | Starting delete of model eos8ub5
00:16:20 | INFO     | Removing EOS folder /home/dee/eos/dest/eos8ub5
00:16:21 | INFO     | Removing bundle folder /home/dee/eos/repository/eos8ub5
00:16:21 | DEBUG    | Folder removed
00:16:21 | DEBUG    | Attempting Bento delete
00:16:21 | DEBUG    | Attempting temporary folder delete
00:16:21 | DEBUG    | Attempting lake delete (local)
00:16:21 | DEBUG    | Deleting /home/dee/eos/isaura/lake/eos8ub5_local.h5
00:16:21 | DEBUG    | Attempting lake delete (public)
00:16:21 | DEBUG    | Deleting /home/dee/eos/isaura/lake/eos8ub5_public.h5
00:16:22 | INFO     | Removing docker images and stopping containers related to eos8ub5
Total reclaimed space: 0B
00:16:22 | DEBUG    | Running docker images > /tmp/ersilia-jasxcrf1/docker-images.txt
00:16:22 | DEBUG    | Model entry eos8ub5 was not available in the fetched models registry
00:16:22 | SUCCESS  | Model eos8ub5 deleted successfully
👎 Model eos8ub5 failed to fetch!

This PR attempts to fix an issue where Ersilia behaves as if a model was fetched successfully even when the standard example run for that model failed. This has downstream effects: the ersilia catalog command lists the model among the locally available models, and because the model artifacts remain on the system, the model can also be served. With this PR, if the standard run fails, we delete all model artifacts, i.e. the model dest and the model bundle, so that the local state reflects the fact that fetching failed.
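
The behavior described above can be illustrated with a minimal sketch. This is not the code from the PR: the function name verify_or_clean_up and the run_standard_example callback are hypothetical, and the only detail taken from the logs is the output file name example_standard_output.csv. The sketch treats the fetch as failed when the standard run raises an error or leaves no non-empty output file, and in that case removes both the dest and the bundle folders.

```python
import shutil
from pathlib import Path
from typing import Callable


def verify_or_clean_up(
    dest_dir: Path,
    bundle_dir: Path,
    run_standard_example: Callable[[Path], None],
) -> bool:
    """Return True if the standard example produced output.

    Hypothetical helper, not part of the Ersilia API. On failure, the
    dest and bundle folders are deleted so the model no longer looks
    like it was fetched.
    """
    output_csv = dest_dir / "example_standard_output.csv"
    try:
        # The callback stands in for the standard CSV example run.
        run_standard_example(output_csv)
        succeeded = output_csv.is_file() and output_csv.stat().st_size > 0
    except Exception:
        # Any error during the run counts as a failed fetch.
        succeeded = False

    if not succeeded:
        for folder in (dest_dir, bundle_dir):
            # ignore_errors=True keeps cleanup going if a folder is already gone.
            shutil.rmtree(folder, ignore_errors=True)
    return succeeded
```

The logs show additional steps that this sketch leaves out, such as the confirmation prompt, the lake file deletion, and the Docker image cleanup.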

miquelduranfrigola commented 3 days ago

I did not check the PR in detail, but the functionality makes a lot of sense. Thanks!