Closed: miquelduranfrigola closed this issue 1 year ago.
In the publish module, there is now a class to upload models to S3 buckets: https://github.com/ersilia-os/ersilia/blob/master/ersilia/publish/s3.py
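The core of such an uploader can be sketched with boto3. This is a minimal sketch, not the actual API of `ersilia/publish/s3.py`; the `ModelUploader` class and `upload_directory` method names are hypothetical, and it assumes AWS credentials are already configured.

```python
# Hedged sketch of a model uploader, assuming boto3 with valid AWS
# credentials. Class and method names are hypothetical; see
# ersilia/publish/s3.py for the real implementation.
import os
import boto3

BUCKET = "ersilia-models"  # bucket name taken from this thread


class ModelUploader:
    def __init__(self, model_id):
        self.model_id = model_id
        self.s3 = boto3.client("s3")

    def upload_directory(self, local_dir):
        """Walk the local model folder and mirror it under <model_id>/ in S3."""
        for root, _, files in os.walk(local_dir):
            for name in files:
                path = os.path.join(root, name)
                key = f"{self.model_id}/{os.path.relpath(path, local_dir)}"
                self.s3.upload_file(path, BUCKET, key)


# Example: ModelUploader("eos3b5e").upload_directory("/path/to/eos3b5e")
```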
It seems to work smoothly. You can download, for example, the README.md of model eos3b5e: https://ersilia-models.s3.eu-central-1.amazonaws.com/eos3b5e/README.md
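Since the object is publicly readable, a plain HTTP GET is enough to verify it, e.g.:

```python
# Minimal check that a model file is publicly readable from the bucket.
import requests

url = "https://ersilia-models.s3.eu-central-1.amazonaws.com/eos3b5e/README.md"
r = requests.get(url, timeout=30)
r.raise_for_status()
print(r.text[:200])  # first 200 characters of the README
```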
The next step is to incorporate this into a GitHub workflow, possibly triggered when a model passes all PR tests and is merged.
I have created a pull request for functionality that fetches LFS files from the S3 bucket first and falls back to LFS only for files not found in S3: https://github.com/ersilia-os/ersilia/pull/573. One caveat: I removed _clone_with_gh and _clone_with_pygit2 from ersilia/utils/download.py so that the S3 check cannot be bypassed. I tested it myself and it seems fine, but it would be good to double-check in an environment with the gh CLI and pygit2.
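The gist of the S3-first logic can be sketched as follows. Function names and the exact flow are illustrative assumptions, not the merged code (which lives in PR #573 and ersilia/utils/download.py):

```python
# Illustrative sketch of the S3-first, LFS-fallback download logic.
# Names are hypothetical; see PR #573 for the actual implementation.
import os
import subprocess
import requests

S3_BASE = "https://ersilia-models.s3.eu-central-1.amazonaws.com"


def fetch_lfs_files(model_id, repo_dir, lfs_files):
    """Try S3 first for each LFS-tracked file; fall back to git lfs pull
    only for the files S3 does not have."""
    missing = []
    for rel_path in lfs_files:
        url = f"{S3_BASE}/{model_id}/{rel_path}"
        r = requests.get(url, timeout=60)
        if r.status_code == 200:
            dest = os.path.join(repo_dir, rel_path)
            os.makedirs(os.path.dirname(dest), exist_ok=True)
            with open(dest, "wb") as f:
                f.write(r.content)
        else:
            missing.append(rel_path)
    if missing:
        # Restrict the LFS download to the files not found in S3.
        subprocess.run(
            ["git", "lfs", "pull", "--include", ",".join(missing)],
            cwd=repo_dir, check=True,
        )
```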
Hi @ttokarczuk! I have already merged your PR. I made some minor changes (mainly relative paths), since for some reason the logger was failing on a brand-new installation. I think _clone_with_git works just fine!
In addition, please note that I have created a GitHub Action to upload models to our S3 bucket: https://github.com/ersilia-os/eos-template/blob/main/.github/workflows/upload-model-to-s3.yml
For now, the workflow is only activated on workflow_dispatch (i.e. manually). The idea is that this will happen automatically as soon as a model is fully validated by our team.
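As an aside, a workflow_dispatch run can also be triggered programmatically through the GitHub REST API rather than from the Actions tab. This is a sketch under assumptions: the repository path and token below are placeholders (the workflow file shown is in eos-template, but in practice each model repository carries its own copy):

```python
# Sketch: trigger the upload workflow via the GitHub REST API
# (equivalent to pressing "Run workflow" in the Actions tab).
# Repository path and token are placeholders.
import requests

token = "ghp_..."  # placeholder personal access token with workflow scope
url = ("https://api.github.com/repos/ersilia-os/eos-template"
       "/actions/workflows/upload-model-to-s3.yml/dispatches")
r = requests.post(
    url,
    headers={
        "Authorization": f"Bearer {token}",
        "Accept": "application/vnd.github+json",
    },
    json={"ref": "main"},
)
r.raise_for_status()  # GitHub returns 204 No Content on success
```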
This seems to work nicely, so I am closing this issue.
@ttokarczuk has been working on functionality that first checks whether a model is available in an S3 bucket. If it is, it verifies that the checksum of each LFS file matches that of the corresponding file in S3 and preferentially downloads from S3. This should help reduce the costs associated with downloading from LFS.
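One way to implement that comparison is to read the SHA-256 from the Git LFS pointer file (pointer files carry a line of the form `oid sha256:<hash>`) and compare it with a digest of the S3 object. A minimal sketch, with hypothetical helper names, assuming the pointer file has not yet been smudged into the actual content:

```python
# Sketch of the checksum comparison between a Git LFS pointer and the
# corresponding S3 object. Helper names are hypothetical.
import hashlib
import requests


def lfs_pointer_sha256(pointer_path):
    """Git LFS pointer files carry a line 'oid sha256:<hash>'."""
    with open(pointer_path) as f:
        for line in f:
            if line.startswith("oid sha256:"):
                return line.strip().split("oid sha256:")[1]
    return None


def s3_object_sha256(url):
    """Download the S3 object and hash it locally."""
    r = requests.get(url, timeout=60)
    r.raise_for_status()
    return hashlib.sha256(r.content).hexdigest()


def same_file(pointer_path, url):
    return lfs_pointer_sha256(pointer_path) == s3_object_sha256(url)
```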
I have created an ersilia-models bucket in S3. The URL of a given model will be https://ersilia-models.s3.eu-central-1.amazonaws.com/ followed by the model identifier (e.g. eos4e40). The next step is to create a function to automatically upload models into this S3 bucket.
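Resolving a model's bucket URL is therefore just string concatenation, and availability can be probed with an HTTP HEAD. A small sketch, assuming the model's README.md has been uploaded to the bucket:

```python
# Sketch: build a model's S3 URL from its identifier and probe whether
# the model has been uploaded (here via its README.md).
import requests

S3_BASE = "https://ersilia-models.s3.eu-central-1.amazonaws.com"


def model_url(model_id):
    return f"{S3_BASE}/{model_id}"


resp = requests.head(model_url("eos4e40") + "/README.md", timeout=30)
print("available" if resp.ok else "not uploaded yet")
```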