Closed: daltskin closed this issue 2 years ago
A minimal scenario of a team of analysts working collaboratively on a machine learning project would be
In the scenario:
Remarks:
@daltskin @mjbonifa my thoughts:
JupyterHub with DockerSpawner is a prerequisite for the full solution; it will need OAuthenticator configured with the workspace app registration. Can we use the local Jupyter instance on the per-user VMs as a first stab at working with MLflow?
Does the shared storage have to be a specific directory, or is any mounted storage location sufficient? Either way, #1266 feels like a dependency. How does MLflow know where to look? Do we need an MLflow directory created and specified in advance?
We will need an MLflow workspace service story; a shared instance is likely not appropriate. My thinking is to use a new app service in the existing app service plan for the web front end, and a Postgres PaaS with a private endpoint into the workspace.
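As a rough sketch of that shape, the tracking server would be started with a Postgres backend-store URI instead of the SQLite one shown later in this thread. All the host, database, and credential values below are placeholders, not the actual TRE configuration:

```python
# Sketch: assemble the `mlflow server` command for a Postgres-backed
# deployment. Host, database name, and credentials are placeholders; in
# the TRE they would come from the workspace's configuration/secrets.
db_user = "mlflow"
db_password = "CHANGEME"  # placeholder, never hard-code in practice
db_host = "mlflow-pg.example.internal"  # hypothetical private-endpoint host
backend_store_uri = f"postgresql://{db_user}:{db_password}@{db_host}:5432/mlflowdb"

command = [
    "mlflow", "server",
    "--backend-store-uri", backend_store_uri,
    "--default-artifact-root", "/fs/mlruns",  # shared storage mount
    "--host", "0.0.0.0",  # listen on all interfaces inside the app service
]
print(" ".join(command))
```

The app service would run this command, with the Postgres connection string injected at deploy time rather than embedded as above.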
Can we use the local Jupyter instance on the per-user VMs as a first stab at working with MLflow?
Yes, simplify further to test this
Does the shared storage have to be a specific directory or is any mounted storage location sufficient? and How does MLflow know where to look?
MLflow is configured with a couple of parameters; see the MLflow docs. For example:
mlflow server --backend-store-uri sqlite:///mlruns.db --default-artifact-root /fs/mlruns
Do we need an MLflow directory created and specified in advance?
Yes. The storage will need to be mounted before the mlflow server starts, and MLflow will need to be configured with where to write artifacts.
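A minimal sketch of that pre-creation step, assuming the shared storage is already mounted (the helper name and the mount point are illustrative, not part of any existing TRE code):

```python
from pathlib import Path

def ensure_artifact_root(mount_point: str) -> Path:
    """Create the MLflow artifact directory on the mounted shared
    storage (idempotent), so the path exists before `mlflow server`
    starts with --default-artifact-root pointing at it."""
    root = Path(mount_point) / "mlruns"
    root.mkdir(parents=True, exist_ok=True)
    return root

# e.g. ensure_artifact_root("/fs") yields /fs/mlruns, matching the
# --default-artifact-root value in the server command above.
```

Running this (or an equivalent step in the service's startup script) before launching the server avoids failures from a missing artifact root.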
We will need an MLflow workspace service story
MLflow is per workspace, not per TRE. Please elaborate on what you need.
Other considerations
MLflow has a user interface, so we will also need to consider how it is accessed via Guacamole.
@mjbonifa that's useful, all good. Don't worry about the final two points.
As part of the MLflow workspace service we will need to create a directory within the shared storage share so users know where to write artifacts.
We have looked at the requirements and expanded them further:
mlflow ui. For the MLflow tracking server, it makes no difference from an analyst's perspective whether it is deployed on Linux or Windows, as they access the MLflow UI through a browser or the MLflow Python client API. @mjbonifa @CalMac-tns I've split out #1290 so initial work can begin - do the acceptance criteria look OK to you?
Work is complete, PR is merged.
Is your feature request related to a problem? Please describe.
As a researcher I need to be able to track my ML jobs

Describe the solution you'd like
To use MLflow within a workspace or shared service