ersilia-os / ersilia

The Ersilia Model Hub, a repository of AI/ML models for infectious and neglected disease research.
https://ersilia.io
GNU General Public License v3.0

[🐅 Epic]: Ersilia Model Hub Maintenance: background and first steps #977

Open miquelduranfrigola opened 5 months ago

miquelduranfrigola commented 5 months ago

Background

This is the opening issue of the project Ersilia Automation Workflow Module, done in collaboration with Harvard Tech For Social Good (T4SG). The goal is to create a suite of GitHub Actions workflows and a Python-based CLI testing module that run several critical monitoring and maintenance tasks for the Ersilia Model Hub.

Useful resources

To become familiar with the Ersilia Model Hub, these resources can be helpful:

First steps

To get started with the project, we need to:

Set up users.

Work on a case example on the Ersilia CLI

In the Ersilia CLI, we have a ModelTester class that we have not yet managed to bring to production. It can be found here. In my opinion, we should write an independent script in the context of this project, containing its own evaluator or tester class, since the current ModelTester is too involved to get started with. If you agree, let's:
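As a rough illustration of what such an independent tester class could look like, here is a minimal sketch. It does not use the existing ModelTester or any Ersilia API: the names `CheckResult`, `SimpleModelInspector` and `check_readme_exists` are hypothetical, and the README check is only a placeholder for whatever checks we agree on.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import Callable, List


@dataclass
class CheckResult:
    name: str
    passed: bool
    details: str = ""


# A check receives the local path of a model repository and returns a CheckResult.
Check = Callable[[Path], CheckResult]


def check_readme_exists(repo_dir: Path) -> CheckResult:
    """Hypothetical placeholder check: the repository has a non-empty README.md."""
    readme = repo_dir / "README.md"
    ok = readme.is_file() and readme.stat().st_size > 0
    return CheckResult("readme_exists", ok, "" if ok else "README.md missing or empty")


@dataclass
class SimpleModelInspector:
    """Hypothetical tester class: runs a list of independent checks on one model."""

    repo_dir: Path
    checks: List[Check] = field(default_factory=lambda: [check_readme_exists])

    def run(self) -> List[CheckResult]:
        results = []
        for check in self.checks:
            try:
                results.append(check(self.repo_dir))
            except Exception as exc:
                # A crashing check is reported as a failure instead of aborting the run.
                results.append(CheckResult(check.__name__, False, f"check raised: {exc}"))
        return results

    def passed(self) -> bool:
        return all(result.passed for result in self.run())
```

Keeping each check as a plain function means new checks can be added to the list without touching the class itself.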

Implement an automation on the Ersilia Maintenance repository

Following this first example, we should create a GitHub Actions workflow that:

Next steps

As soon as we have finished these first steps, we will:

  1. Decide other checks we may want to do on the models, increasing the complexity of the checks.
  2. Define a good scheduling system to inspect all models in the Ersilia Model Hub periodically.
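
One possible way to think about the scheduling point is to split the model list into fixed-size batches and let each scheduled run handle the next batch, wrapping around at the end. The sketch below only illustrates that idea: the function name `batch_for_run`, the batch size and the model identifiers are all hypothetical, and the real scheduler may end up looking quite different.

```python
from typing import List, Sequence


def batch_for_run(model_ids: Sequence[str], run_index: int, batch_size: int) -> List[str]:
    """Return the models to inspect on a given scheduled run.

    Runs walk through the sorted model list in consecutive batches and wrap
    around, so every model is visited once per full cycle.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    ordered = sorted(model_ids)
    if not ordered:
        return []
    n_batches = -(-len(ordered) // batch_size)  # ceiling division
    start = (run_index % n_batches) * batch_size
    return ordered[start:start + batch_size]


if __name__ == "__main__":
    # Placeholder identifiers, not real Ersilia model IDs.
    models = ["model-e", "model-a", "model-c", "model-b", "model-d"]
    for run in range(4):
        print(run, batch_for_run(models, run, batch_size=2))
```

Here `run_index` could be derived from the date of the scheduled run, so the workflow itself would not need to store any state between runs.
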
miquelduranfrigola commented 5 months ago

The following relevant questions were raised by @ersilia-os/harvard-t4sg. I'll try to respond:

What is the difference between writing a .py script and a .yml script? Does each .yml file need to import a Python file?

The way I envisage this is as follows (but I am open to discussion!):

  1. The Python files will reside inside the Ersilia CLI codebase. Ultimately, they can be run via the ersilia inspect command as described above.
  2. Therefore, in the .yml file of the GitHub Action, the first thing that needs to happen is that Ersilia is installed. Then, in a subsequent step within the action, we can execute the ersilia inspect command. In sum, I see the .yml file as a workflow for running several steps (e.g. install ersilia, inspect model, etc.). In this project, we'll have to work on both .py and .yml files.
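
To make the "workflow as a sequence of steps" idea concrete, here is a small Python sketch that mimics what the job would do: run each step in order and stop at the first one that fails. It is only an analogy for the .yml file, not a replacement for it. The helper names `run_steps` and `ersilia_steps` are hypothetical, and both the pip-based installation and the exact argument form of ersilia inspect are assumptions to be checked against the CLI.

```python
import subprocess
import sys
from typing import List, Sequence, Tuple

Step = Tuple[str, List[str]]


def run_steps(steps: Sequence[Step]) -> List[Tuple[str, int]]:
    """Run named commands in order and stop at the first failure, like the steps of a job."""
    outcomes = []
    for name, command in steps:
        completed = subprocess.run(command)
        outcomes.append((name, completed.returncode))
        if completed.returncode != 0:
            break  # later steps are skipped, as in a workflow job
    return outcomes


def ersilia_steps(model_id: str) -> List[Step]:
    # ASSUMPTION: installing via pip and passing the model identifier as a
    # positional argument to `ersilia inspect` are illustrative, not confirmed.
    return [
        ("install ersilia", [sys.executable, "-m", "pip", "install", "ersilia"]),
        ("inspect model", ["ersilia", "inspect", model_id]),
    ]
```

Calling `run_steps(ersilia_steps(model_id))` would then return one `(step name, exit code)` pair per executed step, which is roughly the information the Actions log shows for each step.
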

Also, do you see us cloning the repository and writing up the files in a code editor, like VSCode? Or should there be a "run" option on the GitHub Actions page on GitHub itself?

I would definitely work in VSCode. By convention, workflow files are stored in .github/workflows/, so if you write them in VSCode and push changes, they will be reflected in the GitHub web interface.

Also, we create the .yml file inside of the ersilia-os/ersilia-maintenance folder, correct?

Yes, you can create it in the ersilia-os/ersilia-maintenance/.github/workflows folder

anshikavashistha commented 5 months ago

May I work on this issue, @miquelduranfrigola? I am a beginner in Ersilia and would love to work on it.

miquelduranfrigola commented 5 months ago

Hello @anshikavashistha, thanks for your message. This work is led by a team of volunteers from Harvard T4SG, so we are well covered for now! We'll let you know if we need assistance.

anshikavashistha commented 5 months ago

No problem, @miquelduranfrigola. Please let me know if you need any kind of assistance.

miquelduranfrigola commented 5 months ago

Hello @ersilia-os/harvard-t4sg

The first version of the workflow works! https://github.com/ersilia-os/ersilia-maintenance/actions/runs/8382908764/job/22957612010

There are just a few things to finalize related to this PoC. Please see this issue for more.

DhanshreeA commented 2 weeks ago

While working with @dzumii on some improvements to https://github.com/ersilia-os/ersilia-maintenance, we observed that the model inspection code used by this repository is in fact not within Ersilia, but inside a heavily diverged fork from the original contributor. It is imperative that we move that code to the main ersilia repository so that it is easier to extend and maintain. At present, the fork is more than 35 commits ahead of and about 100 commits behind upstream, which makes the usual pull request route impractical. I propose simply doing a git diff and creating a patch commit to merge this into the main ersilia repository.

After this code migration, we should evaluate whether to close this issue or keep it open for further discussion. Thoughts, @miquelduranfrigola?

DhanshreeA commented 2 weeks ago

After inspecting the contents of the diff between the two repositories, the only files we need to keep are the following:

Please make sure to remove all print calls and replace them with debug logs. @dzumii do you think you'd be able to take this up?
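
For reference, the print-to-debug-log change could look roughly like the sketch below. It uses Python's standard logging module; the logger name `model_inspector` and the function `summarize` are hypothetical, and if the ersilia codebase provides its own logger, that one should be used instead.

```python
import logging

# Hypothetical logger name; the actual module may obtain its logger differently.
logger = logging.getLogger("model_inspector")


def summarize(model_id: str, n_checks: int) -> str:
    # Before: print(f"Inspecting {model_id} with {n_checks} checks")
    # After: a debug-level log record with lazily formatted arguments.
    logger.debug("Inspecting %s with %d checks", model_id, n_checks)
    return f"{model_id}:{n_checks}"
```

With the default logging configuration these messages stay silent; calling `logging.basicConfig(level=logging.DEBUG)` in a script makes them visible again when debugging.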

dzumii commented 2 weeks ago

Ok, I will check it out @DhanshreeA