ersilia-os / ersilia

The Ersilia Model Hub, a repository of AI/ML models for infectious and neglected disease research.
https://ersilia.io
GNU General Public License v3.0

🐛 Bug: BentoML artifacts can create issues with model Docker image because of caching #1051

Closed DhanshreeA closed 3 months ago

DhanshreeA commented 6 months ago

Describe the bug.

With the introduction of multi-stage builds, we keep only the necessary artifacts from the builder base image in the final model image. One of these artifacts is the bentoml database that bentoml uses to track model bundles. Because of Docker's layer caching, Docker may reuse a cached layer for the step that copies bentoml into the model image. That cached layer can reference an older bundle, which may not be the bundle that actually exists in the image's file system. When this happens, bentoml cannot serve the model. The larger effect is that the ersilia serve command inside the container keeps retrying to reach a model server that is not running, so the Docker container never starts, and the ersilia serve command outside the container gets stuck as well.

Describe the steps to reproduce the behavior

No response

Expected behavior.

I suppose one possible solution is to check during the image build that the bundle name in the bentoml database matches the bundle that exists on the filesystem, and to update it if it does not.
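
A minimal sketch of what such a build-time check could look like, in Python. This is not ersilia's or bentoml's actual code: the function name `reconcile_bundle`, the directory layout (one subdirectory per bundle), and the idea that the recorded bundle name has already been read out of the bentoml database as a plain string are all assumptions made for illustration. Reading from and writing back to the database is deliberately left out, since its schema is not described here.

```python
from pathlib import Path
from typing import List, Tuple


def list_bundles(bundle_root: Path) -> List[str]:
    """Return the names of bundle directories found under bundle_root.

    Assumes (hypothetically) that each bundle lives in its own subdirectory.
    """
    if not bundle_root.is_dir():
        return []
    return sorted(p.name for p in bundle_root.iterdir() if p.is_dir())


def reconcile_bundle(recorded: str, bundle_root: Path) -> Tuple[str, bool]:
    """Compare the recorded bundle name against what is on the filesystem.

    Returns (bundle_name, changed). `changed` is True when the recorded name
    was stale and the caller should update the database record.
    Raises if the situation cannot be resolved unambiguously.
    """
    on_disk = list_bundles(bundle_root)
    if recorded in on_disk:
        # The record matches a bundle that really exists: nothing to do.
        return recorded, False
    if len(on_disk) == 1:
        # Stale record, but exactly one bundle exists, so use that one.
        return on_disk[0], True
    if not on_disk:
        raise FileNotFoundError(f"no bundle found under {bundle_root}")
    raise RuntimeError(
        f"recorded bundle {recorded!r} is missing and several candidates "
        f"exist: {on_disk}"
    )
```

Run as a late step of the image build, a check like this would fail the build (or flag the record for an update) instead of producing an image whose model server cannot start.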

Screenshots.

No response

Operating environment

NA

Additional context

No response

miquelduranfrigola commented 6 months ago

Thanks @DhanshreeA - anything you need from me immediately?

DhanshreeA commented 6 months ago

Not right now @miquelduranfrigola, thanks!

DhanshreeA commented 3 months ago

This has not been reproducible. Closing as not planned for now.