wildlife-dynamics / ecoscope-workflows

An extensible task specification and compiler for local and distributed workflows.
https://ecoscope-workflows.readthedocs.org/
BSD 3-Clause "New" or "Revised" License

Allow tasks to define their own (mamba/pip) requirements #209

Open cisaacstern opened 2 months ago

cisaacstern commented 2 months ago

The current solution for the test failures arising from ecoscope core's dependency (as of this writing) on numpy <2 is #205.

Per @Yun-Wu's comment https://github.com/wildlife-dynamics/ecoscope-workflows/pull/205#discussion_r1731204480, however, I agree that rigidly pinning a specific ecoscope core version in our CI environments is not a good long-term (or even mid-term) plan.

A better solution would be to allow tasks to define their own requirements metadata in the @task decorator. A rough early sketch of this is provided in #207. As noted there, at least in terms of API feel (if not implementation), here's one design reference: https://modal.com/playground/custom_container.

Implementation-wise, I imagine these metadata can be used by the compiler to dynamically compose an environment.yml as a new build artifact for a given workflow compilation spec. This environment.yml would simply be the union of all requirements specified by all tasks used in that workflow. Then, micromamba/mamba/conda could be used to build an environment for the workflow. I would imagine this would be the standard way of building environments for compiled workflows.
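To make the "union of all requirements" idea concrete, here is a minimal sketch of how a compiler step might merge per-task requirements into an environment.yml string. The names `TaskRequirements` and `compose_environment_yml`, and the default `conda-forge` channel, are assumptions made for illustration; none of them exist in ecoscope-workflows today.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskRequirements:
    """Hypothetical per-task requirements metadata (not an existing class)."""

    dependencies: tuple = ()  # conda/mamba match specs, e.g. "numpy<2"
    pip: tuple = ()  # pip requirement strings


def compose_environment_yml(name, requirements, channels=("conda-forge",)):
    """Render an environment.yml string from the union of all task requirements."""
    # Sets drop identical specs; sorting keeps the output deterministic.
    conda_specs = sorted({s for req in requirements for s in req.dependencies})
    pip_specs = sorted({s for req in requirements for s in req.pip})

    lines = [f"name: {name}", "channels:"]
    lines += [f"  - {channel}" for channel in channels]
    lines.append("dependencies:")
    lines += [f"  - {spec}" for spec in conda_specs]
    if pip_specs:
        # pip itself must be installed before the nested pip section is used.
        lines.append("  - pip")
        lines.append("  - pip:")
        lines += [f"      - {spec}" for spec in pip_specs]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    task_a = TaskRequirements(
        dependencies=("geopandas==0.14.2", "numpy<2"), pip=("ecoscope",)
    )
    task_b = TaskRequirements(dependencies=("numpy<2",))
    print(compose_environment_yml("my-workflow", [task_a, task_b]))
```

This sketch only deduplicates identical spec strings. Two tasks that ask for incompatible versions of the same package would both end up in the file, and it would be left to the micromamba/mamba/conda solver to report the conflict at build time.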

For running one-off tasks (as might more commonly be done for unit tests, for example), we might consider a context manager that builds and activates an ephemeral environment. For us, that could be something like:


environment = MambaEnvironment(
    dependencies=["geopandas==0.14.2", "numpy<2"],
    pip=["ecoscope>=1.8.2,<1.9.0"],
)

@task(environment=environment)
def my_cool_task(...):
    import ecoscope

    x = ...  # some computation requiring ecoscope core

    return x

with venv(my_cool_task.environment):
    # context manager __enter__ builds and adds `my_cool_task.environment` to PATH
    my_cool_task(**kw)

# context manager __exit__ deactivates the environment, so here we're back in the "outer" env
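
As a rough illustration of the activate/deactivate half of that context manager, the sketch below temporarily puts an environment's `bin` directory first on PATH and restores the previous value on exit. The name `prepend_to_path` is hypothetical, and building the environment itself (for example by calling micromamba) is deliberately left out.

```python
import os
from contextlib import contextmanager


@contextmanager
def prepend_to_path(bin_dir):
    """Temporarily put `bin_dir` first on PATH, restoring the old value on exit."""
    previous = os.environ.get("PATH")
    os.environ["PATH"] = bin_dir if not previous else bin_dir + os.pathsep + previous
    try:
        yield
    finally:
        # Runs even if the body raises, so the "outer" env always comes back.
        if previous is None:
            os.environ.pop("PATH", None)
        else:
            os.environ["PATH"] = previous


if __name__ == "__main__":
    # "/tmp/fake-env/bin" is a placeholder path used only for this demo.
    with prepend_to_path("/tmp/fake-env/bin"):
        print(os.environ["PATH"].split(os.pathsep)[0])
```

One caveat with this approach: changing PATH affects which executables a subprocess resolves, but it does not change what the already-running interpreter can import. Running a task against a different set of Python packages would likely mean launching it in a subprocess that uses the ephemeral environment's interpreter.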

xref https://github.com/moradology/venvception/issues/5

cisaacstern commented 2 months ago

(@tonywk I have deleted a comment from you because it appeared to be spam. That is strange, because I have not seen spam on GitHub Issues before, but it strongly appeared to be spam. If it was not, please feel free to comment again with more context.)