fractal-analytics-platform / fractal-server

Fractal backend
https://fractal-analytics-platform.github.io/fractal-server/
BSD 3-Clause "New" or "Revised" License

Define more Python-interpreter configuration variables #1575

Closed tcompa closed 2 months ago

tcompa commented 3 months ago

We should introduce a few different Python-interpreter configuration variables:

  1. FRACTAL_SLURM_WORKER_PYTHON: Path to Python interpreter that will run the jobs on the SLURM nodes. If not specified, the same interpreter that runs the server is used. For SSH mode, this should be a path on the remote cluster.
  2. FRACTAL_TASKS_PYTHON_39: Path to a Python 3.9 interpreter to be used as a base for task venvs. Note: this interpreter should remain "immutable". For SSH mode, this should be a path on the remote cluster.
  3. The same for Python 3.10 and 3.11.

This is to be introduced as part of the SSH work, but it'd be best if we can already make it work across all executors.
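A minimal sketch of how these variables could be resolved, assuming they are read from the process environment. The helper names `worker_python` and `tasks_python` are hypothetical, and the variable names for 3.10 and 3.11 (`FRACTAL_TASKS_PYTHON_310`, `FRACTAL_TASKS_PYTHON_311`) are an assumption extrapolated from `FRACTAL_TASKS_PYTHON_39`:

```python
import os
import sys
from typing import Mapping, Optional


def worker_python(env: Mapping[str, str] = os.environ) -> str:
    """Interpreter for jobs on SLURM nodes; defaults to the server's own."""
    return env.get("FRACTAL_SLURM_WORKER_PYTHON") or sys.executable


def tasks_python(version: str, env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Base interpreter for task venvs, or None if it is not configured."""
    # Assumed naming scheme: "3.9" -> FRACTAL_TASKS_PYTHON_39,
    # "3.10" -> FRACTAL_TASKS_PYTHON_310, and so on.
    key = "FRACTAL_TASKS_PYTHON_" + version.replace(".", "")
    return env.get(key) or None
```

In SSH mode the returned paths would refer to the remote cluster, so the server could not check locally that they exist.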


tcompa commented 3 months ago

Self-reminder: the SSH task collection currently adds a "N/A" for the Python version in the task source.

jluethi commented 3 months ago

Use full path to FRACTAL_TASKS_PYTHON_39 instead of having a user just say "use Python 3.9".

Step a) Get the Python interpreter, build a venv, pip install.

Step b) Given an environment + manifest, update the db.


Custom environments

Next step: Also expose an option to have the user specify the full Python executable path (see https://github.com/fractal-analytics-platform/fractal-server/issues/1581). If the user provides the full path to the env, they also need to provide a path to the manifest. Fractal server will not manage that environment, e.g. it will not install dependencies in that environment.

=> only does step b)

Scope limits: Custom environments don't come with the reproducibility guarantees (i.e. Fractal server can't recreate them)
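The split between managed and custom environments could be sketched as follows. This is only an illustration: the `CollectionRequest` class, its field names and `plan_steps` are hypothetical and not part of fractal-server.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CollectionRequest:
    """Hypothetical request shape, for illustration only."""
    package: Optional[str] = None
    python_version: Optional[str] = None
    custom_python: Optional[str] = None   # full path to a user-managed interpreter
    manifest_path: Optional[str] = None   # required together with custom_python


def plan_steps(req: CollectionRequest) -> List[str]:
    """Return which steps the server would run for this request."""
    if req.custom_python is not None:
        # Custom env: the server does not build or modify it, so only step b) runs.
        if req.manifest_path is None:
            raise ValueError("a custom Python path requires a manifest path")
        return ["b"]
    # Managed env: build the venv (step a), then update the db (step b).
    return ["a", "b"]
```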

Once we figure out whether people use conda, mamba, pixi etc. => we can make a custom version of step a) for that


The end of step a) should run a pip freeze, so that we are prepared for deleting & recreating envs reproducibly.
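Step a), including the final pip freeze, could look roughly like the sketch below. It assumes a POSIX venv layout (`bin/python`) and local execution. The function names are hypothetical, and in SSH mode the same commands would have to be run on the remote cluster instead.

```python
import subprocess
from pathlib import PurePosixPath
from typing import List


def step_a_commands(base_python: str, venv_dir: str, package: str) -> List[List[str]]:
    """Commands for step a): build a venv, pip install, pip freeze."""
    venv_python = str(PurePosixPath(venv_dir) / "bin" / "python")
    return [
        [base_python, "-m", "venv", venv_dir],
        [venv_python, "-m", "pip", "install", package],
        [venv_python, "-m", "pip", "freeze"],
    ]


def run_step_a(base_python: str, venv_dir: str, package: str) -> str:
    """Run the commands locally and return the pip freeze output."""
    stdout = ""
    for cmd in step_a_commands(base_python, venv_dir, package):
        result = subprocess.run(cmd, check=True, capture_output=True, text=True)
        stdout = result.stdout  # after the loop: output of the last command
    return stdout
```

Storing the returned freeze output next to the task source is what would later allow an env to be deleted and rebuilt with the same pinned packages.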

tcompa commented 3 months ago

While developing this new feature (the use of well-defined python interpreters as base interpreters for venv creation), we realized with @mfranzon that part of the current behavior would need to change.

Right now, when a user triggers a task collection without asking for a specific Python version, the collection uses the same Python interpreter that is running fractal-server itself. This is why we keep creating new versioned fractal-server envs whenever we update fractal-server: to avoid messing with envs that were also used as a base for task envs.

With the split introduced in the current issue, this behavior won't take place any more - provided the user asks for a given Python version. E.g. the user asks for Python 3.10, and the interpreter defined in the corresponding configuration variable is used.

We plan to make the Python version a required attribute within task collection.

Questions (cc @jluethi @lorenzocerrone):

  1. Should we set a default for it (e.g. `python_version="3.10"`) or should the user always provide one?
  2. If we set a default, do we prefer to have it set in fractal-server (e.g. a FRACTAL_DEFAULT_PYTHON_VERSION="3.10" config variable) or client-side (that is in the fractal-web UI, and in fractal-client)?
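
To make the two options concrete, here is a sketch of what server-side resolution might look like. `resolve_python_version` is a hypothetical helper, and `server_default` stands in for a possible `FRACTAL_DEFAULT_PYTHON_VERSION` setting:

```python
from typing import Optional


def resolve_python_version(
    requested: Optional[str],
    server_default: Optional[str] = None,
) -> str:
    """Pick the Python version for a task collection."""
    if requested is not None:
        return requested
    if server_default is not None:
        # Server-side default, e.g. from FRACTAL_DEFAULT_PYTHON_VERSION.
        return server_default
    # No server default: the client (fractal-web or fractal-client) must send one.
    raise ValueError("python_version is required")
```

With a client-side default, `server_default` would stay `None` and the API would reject any request that omits the version.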

jluethi commented 3 months ago

Should we set a default for it (e.g. `python_version="3.10"`) or should the user always provide one?

+1 on the API eventually being called with a specific interpreter and avoiding scenarios where no Python was specified.

Whether the API should already have a default or whether it's just shown in Fractal web as a dropdown with a default selected, I have no strong opinion on.

The user flow I imagine is that most users don't care about the Python version: we have a default like 3.10 set (either in the front-end or the backend), and the user just says "install my-fractal-task-package" and gets it installed with Python 3.10.

The backend will need to be aware of some of this, e.g. have a way to specify the selectable Python 3.9-3.12 envs. I'd be fine without a backend FRACTAL_DEFAULT_PYTHON_VERSION variable, as long as it remains easy from the web interface to select the relevant Python version.
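
One way the backend could expose the selectable versions to the web interface is sketched below. It assumes one `FRACTAL_TASKS_PYTHON_*` variable per version, with the names for versions other than 3.9 being an assumption. `available_python_versions` is a hypothetical helper.

```python
import os
from typing import List, Mapping

# Candidate versions, following the 3.9-3.12 range mentioned above.
CANDIDATE_VERSIONS = ("3.9", "3.10", "3.11", "3.12")


def available_python_versions(env: Mapping[str, str] = os.environ) -> List[str]:
    """Versions that have a configured base interpreter."""
    return [
        version
        for version in CANDIDATE_VERSIONS
        # Assumed naming scheme: "3.10" -> FRACTAL_TASKS_PYTHON_310
        if env.get("FRACTAL_TASKS_PYTHON_" + version.replace(".", ""))
    ]
```

A dropdown in the web interface could then be populated from this list, with the default chosen on the client side.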