fractal-analytics-platform / fractal-server

Fractal backend
https://fractal-analytics-platform.github.io/fractal-server/
BSD 3-Clause "New" or "Revised" License

Allow task developer to provide their own Python environment instead of setting one up #1581

Closed: jluethi closed this 1 day ago

jluethi commented 2 weeks ago

Flow:

jluethi commented 2 weeks ago

As discussed at our call today :)

tcompa commented 1 week ago

The task developer provides the path to their Python interpreter (e.g. in a conda env or somewhere else) and the package wheel / package manifest / something still to be defined.

Let's define this better.

Right now we have a POST /api/v1/task/collect/pip/ endpoint, and we may add e.g. a POST /api/v1/task/collect/custom/ endpoint (which only performs the manifest-parsing and DB updates, while using the user-provided Python interpreter). Which inputs would it need?

Within the backend, three arguments are needed: python_interpreter, package_root, and manifest. Here is how we could expose them in the endpoint:

  1. python_interpreter: the absolute path which will be prepended to the task command (i.e. the /some/python to be used in /some/python /some/task.py). This is clearly a required argument.
  2. manifest: We should decide here how we expect the user to provide this information:
    • Option 2A: the user provides the whole manifest JSON object (this can be e.g. by uploading a file from their local machine through the client, and by including the JSON object in the API request body)
    • Option 2B: the user provides the absolute path to a manifest file that the backend can read (either directly or through SSH)
    • Option 2C: the user provides a wheel file, and the backend extracts the manifest from it
    • Option 2D: the user only provides the package name, and we look for the manifest in a standard path (based on package_root, defined as in point 4 below). Non-blocking drawback: there's likely going to be a bit of confusion related to canonicalization of package-folder names (e.g. fractal-tasks-core vs fractal_tasks_core).
  3. Feedback request: which option sounds more natural from the user point of view? Backend-side, they are roughly equivalent.
  4. package_root: This would look like /tmp/venv/lib/python3.10/site-packages/fractal_tasks_core, and it's the root folder where the relevant package is installed. Depending on our choice above, this may also be used to look for a manifest file.
    • Option 4A: The user provides it
    • Option 4B: We extract it via e.g. /some/python -c 'import fractal_tasks_core; from pathlib import Path; print(Path(fractal_tasks_core.__file__).parent)'. Main drawback: this requires importing the package from within the backend, which is a blocking issue, IMO.
    • Option 4C: We extract it from /some/python -m pip show fractal-tasks-core. Drawback: would it play well with a conda-installed package? I tried to conda-install fractal-tasks-core in a conda env and pip show fractal-tasks-core works fine, but I cannot guarantee it will always work for every kind of env and install method.
  5. Feedback request: re the previous point, I propose we use option 4C as the default (failing loudly if it doesn't work because of unexpected envs that don't play well with pip), while still exposing an optional input argument as in 4A (so that a user can work around issues with option 4C). For the record, I guess option 4C would work most of the time, for venv or conda envs.
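A minimal sketch of how option 4C could work in practice, assuming the standard `pip show` output format (the helper name and the fail-loudly behavior are illustrative, not actual fractal-server code):

```python
import re
from pathlib import Path


def parse_package_root(pip_show_output: str, package_name: str) -> Path:
    """Hypothetical helper for option 4C: derive package_root from the
    output of `/some/python -m pip show <package_name>`."""
    match = re.search(r"^Location:\s*(\S+)", pip_show_output, flags=re.MULTILINE)
    if match is None:
        # Fail loudly (as proposed above), so the user can fall back to
        # providing package_root explicitly (option 4A).
        raise ValueError(f"No 'Location:' line in `pip show {package_name}` output")
    # Package folders use underscores even when the distribution name
    # uses dashes (fractal-tasks-core -> fractal_tasks_core).
    return Path(match.group(1)) / package_name.replace("-", "_")


# Example, with the kind of output `pip show fractal-tasks-core` prints:
sample = (
    "Name: fractal-tasks-core\n"
    "Version: 1.0.0\n"
    "Location: /tmp/venv/lib/python3.10/site-packages\n"
)
print(parse_package_root(sample, "fractal-tasks-core"))
# -> /tmp/venv/lib/python3.10/site-packages/fractal_tasks_core
```

Note that the dash/underscore replacement is exactly the canonicalization confusion mentioned in option 2D; a robust version would need proper name normalization.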
jluethi commented 1 week ago

Great summary.

Re package_root questions: if we don't need the package to stay in a given folder on the user side (where the user may later change it and be surprised that their task breaks), I'd be strongly in favor of that. This has already led to some confusion with custom tasks (we want to expose this functionality in some areas, but not in regular task collection, I'd say). Option 4A should thus be a fallback, where we warn the user not to change the files in that folder anymore.

=> I like 4C as the default! And 4A as a fallback.


Re manifest: if we can make the "upload the manifest JSON file" flow work, that would be the best user experience in my opinion.

Let's not go down the route of 2D; that gets very messy with many users, more nuanced access, etc.

=> Let's aim for 2A. I'd be OK with 2B, where the user just provides the full path as it would be on the cluster, but that is clearly more error-prone and needs more explanation to make it work.

jluethi commented 1 week ago

Thinking about it makes me wonder about changes to the environment though.

The trade-off: it's nice if I can change my env during development, but it gets messy if users keep changing an environment later and tasks then break.

Given that this is meant more for experimental work, and that stable work should eventually go through a route where the server collects the whole environment, I'd be fine with it. But we should add some warning about this in the interface.

tcompa commented 1 week ago

Thanks for the feedback!

> Thinking about it makes me wonder about changes to the environment though. [...]

Fully agreed. Any task collected in this way is only meant to be as stable as a single task added by hand would be: if you mess with the environment, you may break things.

The advice is the same as for custom users' tasks: when you get them into a satisfactory state, you should package them up and re-install from scratch (in a folder that you are not planning to modify). Even better, you should run through the fractal-server automated venv+pip route. If something is blocking with pip install, then let's debug those cases and find the actual issues and workarounds, so that we'll have a better-informed opinion on whether we should add other options on top of venv+pip.

jluethi commented 1 week ago

Great summary. Fully agreed with that strategy!

Let's add a little warning in the interface to make users aware of this & then proceed like you describe above.

Warning in the custom Python task collection:

> Collecting tasks with a custom Python environment will use that environment for running the tasks. Be careful about changing this environment, as that may break existing workflows. It is recommended to use custom Python environments only during task development, or when something your environment needs isn't yet supported by Fractal server. For production setups, collect the tasks through the regular Fractal task collection.

tcompa commented 6 days ago

To recap, this would be the endpoint flow:

  1. Handle request body and prepare all arguments for the internal _prepare_tasks_metadata function
  2. Run _prepare_tasks_metadata
  3. Check that there is no source overlap with existing tasks in the DB
  4. Run _insert_tasks

Note that we don't need the "background task + state" logic here, since this endpoint should not take long.

Adding to the list of request-body arguments above, we should also include the other arguments used by the add-a-single-custom-task endpoint: an optional version and a required source.

(cc @ychiucco)
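Putting the pieces together, the request body for the proposed POST /api/v1/task/collect/custom/ endpoint could look roughly as follows. This is a plain-dataclass sketch; in fractal-server this would presumably be a Pydantic schema, and all field names and the example manifest here are illustrative:

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class TaskCollectCustom:
    """Illustrative request body for POST /api/v1/task/collect/custom/."""

    # Absolute path prepended to each task command (point 1 above).
    python_interpreter: str
    # The whole manifest JSON object, provided by the user (option 2A).
    manifest: dict[str, Any]
    # Required source, as in the add-a-single-custom-task endpoint.
    source: str
    # Optional version, as in the add-a-single-custom-task endpoint.
    version: Optional[str] = None
    # Optional override for package_root (option 4A), as a fallback to
    # the pip-show-based automatic discovery (option 4C).
    package_root: Optional[str] = None


body = TaskCollectCustom(
    python_interpreter="/some/python",
    manifest={"task_list": []},
    source="my-custom-tasks",
)
print(body.package_root is None)  # True: the 4A fallback was not used
```

Leaving package_root optional keeps the happy path (4C discovery) free of extra inputs while still giving users an escape hatch for unusual environments.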

tcompa commented 5 days ago

> Re package_root questions: If we don't need the package in a given folder on the user side [...]
>
> => I like 4C as a default! And 4a as fallback

Going with 4C (automatic discovery of package_root) requires:

  1. The package name (to fill in the pip show package_name command).
  2. The ability to run pip. This is fine for non-SSH setups, but when using SSH we would need to make SSH calls from within the endpoint (for now we are trying to restrict SSH calls to background tasks).

Let's start with this procedure: