A potential solution in the interim is to use Dagster Pipes and the k8s Pipes client, as described here.
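A minimal sketch of that approach, assuming `PipesK8sClient` from `dagster_k8s` (the image name, node labels, and pod spec fields below are illustrative, not from this issue): the asset body computes the pod spec at runtime and hands it to the Pipes client, so scheduling choices can depend on config or upstream data.

```python
# Sketch of the interim approach: run the heavy work in its own k8s pod via
# Dagster Pipes, so the pod spec (image, node selection, resources) can be
# computed at runtime inside the asset body. Image name and node labels are
# illustrative, not from the original issue.
from dagster import AssetExecutionContext, Definitions, asset
from dagster_k8s import PipesK8sClient


@asset
def compiler_experiment(context: AssetExecutionContext, k8s_pipes_client: PipesK8sClient):
    image = "registry.example.com/compiler-experiment:nightly"  # hypothetical image
    return k8s_pipes_client.run(
        context=context,
        image=image,
        base_pod_spec={
            # raw k8s pod spec fields chosen at runtime, e.g. node selection
            "nodeSelector": {"pool": "highmem"},
        },
    ).get_materialize_result()


defs = Definitions(
    assets=[compiler_experiment],
    resources={"k8s_pipes_client": PipesK8sClient()},
)
```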
I think this is a big missing feature which is really important for complex K8s workloads.
I want to be able to specify k8s scheduling config (matchLabels, node affinity, etc.) from run config - very useful for large backfills - and also to dynamically set these values based on previous ops' outputs.
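For reference, these fields can already be pinned statically on an op through the `dagster-k8s/config` tag; the request is to be able to supply them from run config or from an upstream output instead. A rough sketch of the static form (resource numbers and node labels are illustrative):

```python
from dagster import op


# What is possible today: hard-code k8s scheduling/resource config on the op via
# the dagster-k8s/config tag. The feature request is to supply these same fields
# from run config or an upstream op's output instead of baking them in here.
@op(
    tags={
        "dagster-k8s/config": {
            "container_config": {
                "resources": {
                    "requests": {"cpu": "2", "memory": "8Gi"},
                    "limits": {"memory": "16Gi"},
                },
            },
            "pod_spec_config": {
                "node_selector": {"experiment-pool": "large"},  # illustrative label
                # affinity and other pod spec fields can be set here as well
            },
        }
    }
)
def run_experiment():
    ...
```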
What's the use case?
Howdy, I think a really useful feature would be to dynamically set a downstream op's k8s config based on the output of an upstream op, or at least through the op's run config. That way we could at least launch other jobs from the output of a current job.
I have two particular use cases that I'm having a tough time implementing with Dagster.
Scenario 1:
I have an op that builds a particular Docker container based on op configs. This container has built software dependencies that we use to run experiments - think A/B comparison of a particular piece of software, or a regression analysis (i.e., how performant is our code today compared to yesterday?). I want my users to be able to create a Docker container that installs a particular version of our compiler and use that image to compile and run a suite of programs.
Here is an example run config:
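A hypothetical sketch of a run config for this scenario (op names and fields are illustrative, not the author's actual job):

```python
# Hypothetical run config for Scenario 1 (op names and fields are illustrative):
# build_image produces a tagged image from a chosen compiler version, and
# run_suite is the op whose k8s config we would like to derive from that output.
run_config = {
    "ops": {
        "build_image": {
            "config": {
                "compiler_version": "2024.05.1",
                "base_image": "registry.example.com/toolchain-base:latest",
            }
        },
        "run_suite": {
            "config": {
                "benchmarks": ["matmul", "fft", "stencil"],
            }
        },
    }
}
```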
A block diagram for clarity:
Scenario 2: Variable resource hinting
Similar to the above scenario: given experiments with different shapes and sizes, we can imagine wanting more memory for larger and larger matrix sizes.
Ideas for implementation
The ideal solution would be to add this as Output/DynamicOutput metadata:
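A sketch of what that could look like; the `dagster-k8s/config` metadata key and any framework handling of it are purely hypothetical here, since the feature does not exist yet:

```python
from dagster import DynamicOut, DynamicOutput, op


# Hypothetical API sketch of the proposal: attach the downstream op's k8s config
# as output metadata. Nothing reads this metadata today; it only illustrates the
# requested behavior (resource requests scaling with the planned matrix size).
@op(out=DynamicOut())
def plan_experiments():
    for size in (1024, 4096, 16384):
        yield DynamicOutput(
            value={"matrix_size": size},
            mapping_key=f"size_{size}",
            metadata={
                "dagster-k8s/config": {
                    "container_config": {
                        "resources": {"requests": {"memory": f"{size // 256}Gi"}},
                    },
                },
            },
        )
```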
Another solution that's nearly as good, and would be sufficient in the interim, is being able to set these tags through op run config.
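A sketch of that interim idea; the per-op `tags` block in run config below is hypothetical and does not exist in Dagster today, it only shows the surface area being asked for:

```python
# Hypothetical sketch of the interim idea: let run config override an op's k8s
# tags at launch time. The per-op "tags" block below does not exist today.
run_config = {
    "ops": {
        "run_suite": {
            "config": {"benchmarks": ["matmul"]},
            "tags": {  # hypothetical: per-op tags supplied at run time
                "dagster-k8s/config": {
                    "container_config": {
                        "image": "registry.example.com/compiler-experiment:pr-123",
                    },
                    "pod_spec_config": {"node_selector": {"pool": "highmem"}},
                },
            },
        },
    }
}
```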
Additional information
The current workaround is independently building the images as a separate job, then launching a run job with modified execution parameters. This is a really poor solution and only works for trivially small op enumerations.
I don't think the GraphQL API allows setting execution parameters (I could be wrong there), which means we need a manual step here.
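For completeness, a sketch of the workaround described above using the Dagster GraphQL Python client (host, job name, and config values are illustrative); note that this only threads values through run config, not through the per-op k8s execution config:

```python
# Sketch of the current workaround: a first job builds the image, then a second
# run is submitted by hand (or from an op) with run_config pointing at the new
# image. Host, job name, and config values are illustrative.
from dagster_graphql import DagsterGraphQLClient

client = DagsterGraphQLClient("dagster-webserver.example.com", port_number=80)

run_id = client.submit_job_execution(
    "experiment_job",  # hypothetical job name
    run_config={
        "ops": {
            "run_suite": {
                "config": {"image": "registry.example.com/compiler-experiment:pr-123"},
            },
        },
    },
)
```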
Message from the maintainers
Impacted by this issue? Give it a 👍! We factor engagement into prioritization.

I would be willing to contribute here if the maintainers express interest in potentially upstreaming this and agreeing on an implementation.