Closed ranchodeluxe closed 7 months ago
If the above solution is too gross (which I understand) another option is to write a bunch of validation (most of it just Flink specific) that checks things like:
Bake.start
(like we already do for pangeo-forge-recipes
but for everything not explicitly in pyproject.toml
deps)apache-beam
installed on the host matches their FlinkOperatorBakery.container image
passedfsspec
? Maybe it should just exist in our pyproject.toml
and be default installed?I had this issue recently. Here's how you can easily get into a messed up state:
conda create env -n pangeo_test -c conda-forge pangeo-forge-recipes
pip install "pangeo-forge-runner[extras]"
pangeo-forge-runner
has an optional dependency of apache-beam[gcp]
which last exists in v2.42.0
. pip
will install that version over the latest version that pangeo-forge-recipes
installs from conda-forge
(v2.51.0
at time of writing). When you run a FlinkDeployment using the runner, it'll correctly build the FlinkDeployment with whatever container_image
you specify, but the pipeline.run()
call will fail because apache-beam
will default to the old version v2.42.0
which isn't compatible with the latest Flink version.
I think it's important here to either pass down a specific jar in the pipeline options or validate that the apache-beam version installed is similar to what is being configured in the FlinkDeployment.
Sorry for the delayed response here.
I certainly agree that client/runner version mismatch is a huge headache and we should do something about that.
To understand, is this issue in a way a version of https://github.com/pangeo-forge/pangeo-forge-runner/issues/27, in which @yuvipanda proposed having a separate fetcher/environment provisioner?
To understand, is this issue in a way a version of https://github.com/pangeo-forge/pangeo-forge-runner/issues/27, in which @yuvipanda proposed having a separate fetcher/environment provisioner?
No, this example diff isn't intended to be a solution in that direction. But I guess the question is that ticket (which seems to be over a year old and referring to the old "orchestrator" idea) still the vision? First time seeing it
But I guess the question is that ticket (which seems to be over a year old and referring to the old "orchestrator" idea) still the vision?
This is a good point for discussion. Regarding references to the "orchestrator", I think we can view #27 as if that does not refer to pangeo-forge/pangeo-forge-orchestrator, and rather means just "little o" orchestrator, as in the person or whom/whatever is calling pangeo-forge-runner
to deploy jobs.
One idea mentioned in #27 which definitely feels worthwhile to me, is that if we are going to approach automatic environment modification in any form, that should definitely be happening in temporary venvs of some sort, as is (IIUC) done in pre-commit
, tox
, nox
, etc.
Over in the GitHub Action, I do a version of what your example diff is proposing:
Which feels appropriate to me because the Action is only ever called within the context of an ephemeral GitHub Action container, so we are not going to be modifying (human) user environments on-the-fly with that code.
In pangeo-forge-runner
itself, though, modifying the same environment that the CLI is invoked in feels innappropriate, and could conceivably lead to even more confusion, insofar as users environments are being changed underneath them.
I would be curious to understand more about how pre-commit
, tox
, nox
etc. manage ephemeral venvs, and if there is something we can borrow from any of those projects on this topic.
In pangeo-forge-runner itself, though, modifying the same environment that the CLI is invoked in feels innappropriate, and could conceivably lead to even more confusion, insofar as users environments are being changed underneath them.
I would be curious to understand more about how pre-commit, tox, nox etc. manage ephemeral venvs, and if there is something we can borrow from any of those projects on this topic.
Agreed, let me start looking into options down this pathway
YESSSS I THINK STEALING WHAT pre-commit
DOES IS EXACTLY THE RIGHT WAY TO GO HERE
fucking amazing
Problem
I apologize for the range of this question b/c I got here through some Flink runner frustrations. For the runner we currently have a bunch of unlisted runtime dependencies that need to be installed to execute
pangeo-forge-runner bake
command:Possible Solution
AFAIK for
pangeo-forge-recipes>=0.10.0
we require arequirements.txt
to be in the feedstock. So I want to build on (#130) and ask if there any reason we can't do something automated (but slightly gross) like this really basic example diff?Thoughts?