ersilia-os / ersilia

The Ersilia Model Hub, a repository of AI/ML models for infectious and neglected disease research.
https://ersilia.io
GNU General Public License v3.0

[Epic] Scheduled Prediction Runs #281

Closed D1M1TR10S closed 1 year ago

D1M1TR10S commented 2 years ago

Summary ⭐️

Run scheduled predictions for all models in Ersilia's database. Analyzing patterns across prediction runs helps Ersilia's research. Build an Actions workflow that triggers prediction runs for all models in the database. Make sure it's performant and scalable, and unlikely to incur heavy consumption costs on AWS.

Objective 🎯

We want to answer the question: how long does it take each model to run a prediction for each molecule (starting with the most common molecules)? This gives us metrics on prediction times and other data points related to model performance.
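A minimal sketch of the measurement described above, assuming a hypothetical `predict(model_id, molecule)` callable that stands in for whatever actually runs a model. The function names, record fields, and placeholder model IDs are illustrative and not part of Ersilia's API.

```python
import time
from typing import Callable, Dict, Iterable, List


def time_predictions(
    predict: Callable[[str, str], object],
    model_ids: Iterable[str],
    molecules: Iterable[str],
) -> List[Dict[str, object]]:
    """Time one prediction for every (model, molecule) pair."""
    molecule_list = list(molecules)  # allow reuse across models
    records: List[Dict[str, object]] = []
    for model_id in model_ids:
        for molecule in molecule_list:
            start = time.perf_counter()
            predict(model_id, molecule)  # hypothetical model call
            elapsed = time.perf_counter() - start
            records.append(
                {"model_id": model_id, "molecule": molecule, "seconds": elapsed}
            )
    return records


def fake_predict(model_id: str, molecule: str) -> float:
    """Placeholder standing in for a real prediction call."""
    return float(len(molecule))


# Placeholder model IDs and SMILES strings, for illustration only.
rows = time_predictions(fake_predict, ["model-a", "model-b"], ["CCO", "c1ccccc1"])
for row in rows:
    print(row["model_id"], row["molecule"], f"{row['seconds']:.6f}s")
```

Each record holds one timing, so per-model or per-molecule averages can be computed afterwards without rerunning anything.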

Supporting Details ✏️

Dependencies 🖇

✅ Tasks

M1: Gather requirements/prototyping/spikes (2 weeks)

Team 🤝

DRI: @honeyankit Backup: @megamanics

Timeline 🗓

Documentation 📓

miquelduranfrigola commented 1 year ago

Hi @honeyankit, I have implemented a sample command. Responding now to #302 to document it!

D1M1TR10S commented 1 year ago

There's a bug in the Scheduled Prediction Runs workflow. This usually happens when there's a YAML syntax issue.

D1M1TR10S commented 1 year ago

Update

The cron job is done. Picks a model randomly and runs every night.

Can we specify the model ID as a parameter? Then we can test a model of choice at any time. It should be easy to implement, so we're working on adding that to the workflow. @GrantBirki @Lehcar
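The selection logic described here (use a given model ID if there is one, otherwise pick at random) could look roughly like the sketch below. `choose_model` and the placeholder IDs are hypothetical and do not reflect the code in any Ersilia PR.

```python
import random
from typing import Optional, Sequence


def choose_model(
    available: Sequence[str],
    requested: Optional[str] = None,
    rng: Optional[random.Random] = None,
) -> str:
    """Return the requested model ID, or a random one when none is given."""
    if not available:
        raise ValueError("no models available")
    if requested:
        if requested not in available:
            raise ValueError(f"unknown model id: {requested}")
        return requested
    rng = rng or random.Random()
    return rng.choice(list(available))


# Placeholder IDs, for illustration only.
models = ["model-a", "model-b", "model-c"]
print(choose_model(models, requested="model-b"))  # explicit choice
print(choose_model(models))                       # nightly random pick
```

Passing a seeded `random.Random` as `rng` makes the random pick reproducible, which helps when debugging a particular nightly run.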

honeyankit commented 1 year ago

Can we specify the model ID as a parameter?

@D1M1TR10S: This PR adds the model ID as an input to the Ersilia model prediction run: https://github.com/ersilia-os/ersilia/pull/476

The cron job is done. Picks a model randomly and runs every night.

This is not done and is pending.

miquelduranfrigola commented 1 year ago

Thanks @honeyankit - apologies, it might have been me who gave the wrong update in the meeting. I thought the cron action was done but not active.

D1M1TR10S commented 1 year ago

Pretty much done. What remains is the DynamoDB integration: credentials to submit results into DynamoDB. We have a library to do that; it just needs to be bundled into Ersilia, with a parameter that says "save" or "don't save".

Once DynamoDB is done we want to test it in the Actions workflow.
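As a rough sketch of the "save" / "don't save" parameter, the storage step can be passed in as a callable so the same code runs with or without a database. Everything here is hypothetical: `run_and_maybe_save`, the record fields, and the in-memory list, which stands in for a real DynamoDB writer.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]


def run_and_maybe_save(
    record: Record,
    save: bool,
    store: Callable[[Record], None],
) -> bool:
    """Hand the record to `store` only when `save` is True.

    `store` is an assumption: in a real setup it would wrap whatever
    library submits results to DynamoDB.
    """
    if save:
        store(record)
        return True
    return False


# An in-memory list stands in for the database in this sketch.
saved: List[Record] = []
record = {"model_id": "model-a", "molecule": "CCO", "seconds": 1.25}

run_and_maybe_save(record, save=False, store=saved.append)  # nothing stored
run_and_maybe_save(record, save=True, store=saved.append)   # stored once
print(len(saved))
```

Keeping the writer behind a callable means the Actions workflow can be tested with `save=False` before any credentials are available.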

GemmaTuron commented 1 year ago

@miquelduranfrigola will activate it as a cron job to test that it is working, even though the results won't be uploaded to DynamoDB. @Lehcar and @honeyankit were working on a parallelization option.

honeyankit commented 1 year ago

@GemmaTuron If I remember correctly, the parallelization work is already completed.

honeyankit commented 1 year ago

Adding the cron job and testing the integration of DynamoDB + Ersilia with the prediction run are what is left. Estimated time to complete: ~1 week

D1M1TR10S commented 1 year ago

Status of integration with DynamoDB: hitting a wall. We already have the library to do so, but haven't been able to integrate it into Ersilia yet; the contractor hasn't completed this step. Once the integration is complete, just add the cron job to trigger the tests.

@miquelduranfrigola will follow up and complete this on his own, once DynamoDB is ready. All we need is the cron job.

Two workflows:

Cron job interval: a run takes between 10 and 20 minutes, counting the installation of Ersilia. => Trigger the testing workflow every 15 minutes.

@honeyankit will close this out once done – by Monday (2/6) at the latest
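To make the interval arithmetic concrete: in standard five-field cron syntax, "every 15 minutes" is `*/15 * * * *`. The sketch below lists the resulting trigger times and checks whether a run of a given length would still be going at the next trigger. The helper names are hypothetical, and the 10 and 20 minute figures are the estimates quoted above.

```python
from datetime import datetime, timedelta
from typing import List

CRON_EVERY_15_MIN = "*/15 * * * *"  # standard cron expression, shown for reference


def trigger_times(start: datetime, interval_minutes: int, count: int) -> List[datetime]:
    """Return `count` consecutive trigger times, starting at `start`."""
    return [start + timedelta(minutes=interval_minutes * i) for i in range(count)]


def overlaps_next_trigger(run_minutes: float, interval_minutes: int) -> bool:
    """True when a run is still in progress at the next trigger."""
    return run_minutes > interval_minutes


for t in trigger_times(datetime(2023, 1, 1, 0, 0), 15, 4):
    print(t.strftime("%H:%M"))

print(overlaps_next_trigger(10, 15))  # a 10-minute run finishes in time
print(overlaps_next_trigger(20, 15))  # a 20-minute run is still going
```

Under these estimates, runs at the slow end of the range would still be in progress when the next trigger fires, so the workflow may need to tolerate concurrent runs.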

honeyankit commented 1 year ago

A 15-minute cron interval has been added to the Ersilia scheduled prediction run.

honeyankit commented 1 year ago

The last action item (test the predict action workflow once the DynamoDB API feature is integrated into the Ersilia model) will be completed by Ersilia, as discussed in our last meeting: