Closed mfischer-zd closed 2 years ago
Hi everyone - thank you for the patience.
We are working on implementing native task dependencies now and are exploring a potential Airflow integration.
We would love your support: please add feedback and register your interest on the ticket below so the Apache Airflow committee can understand the demand. Ideally, we'd like to optimize the experience by providing a first-class integration rather than a maintained fork.
https://issues.apache.org/jira/browse/AIRFLOW-5633
cc @jazzyfresh
@yishan-lin a Nomad executor for Airflow would be absolutely brilliant.
in watching, and expect
I faced a similar issue in our deployments, so I created a tool: https://github.com/sagarrakshe/nomad-dtree
We needed this enough that we implemented it ourselves. We have an AST for nomad jobs and interpret it to figure out which consul health checks to watch, wait for their success/fail timeout, and add the unblocked jobs to the work queue.
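For readers wondering what such a loop might look like, here is a minimal sketch, not the commenter's actual implementation. It assumes the job AST has already been reduced to a map from each job to the services it waits on. The `is_healthy` callable stands in for a real Consul health-check query, and the job and service names are hypothetical. Timeout handling is left out.

```python
from collections import deque
from typing import Callable, Deque, Dict, List, Set


def release_unblocked(
    waits_on: Dict[str, Set[str]],
    is_healthy: Callable[[str], bool],
    already_queued: Set[str],
    work_queue: Deque[str],
) -> List[str]:
    """Enqueue every not-yet-queued job whose watched services are all healthy."""
    released = []
    for job, services in waits_on.items():
        if job in already_queued:
            continue
        # A job with no watched services is unblocked immediately.
        if all(is_healthy(service) for service in services):
            work_queue.append(job)
            already_queued.add(job)
            released.append(job)
    return released


# Hypothetical data: "kafka" waits on the "zookeeper" service check.
health = {"zookeeper": False}
waits_on = {"zookeeper": set(), "kafka": {"zookeeper"}}
already_queued: Set[str] = set()
work_queue: Deque[str] = deque()


def check(service: str) -> bool:
    # Placeholder for a real health-check lookup (assumption).
    return health.get(service, False)


first_pass = release_unblocked(waits_on, check, already_queued, work_queue)
health["zookeeper"] = True  # the check starts passing
second_pass = release_unblocked(waits_on, check, already_queued, work_queue)
```

In a real deployment this function would be called each time a watched check changes state, rather than twice by hand.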
Agreed, we could not wait, either.
We ended up writing a DAG parser to evaluate eligibility of a node based on complex boolean dependencies, only exposing eligible nodes to Nomad for scheduling.
Not ideal, since we are now reliant on a single-threaded process for scheduling, but we are able to schedule several thousand jobs per minute this way. This might pay off in the long term, since it is unlikely Nomad's dependency roadmap includes boolean/complex dependencies.
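As an illustration of the idea only (the actual parser is not shown in this thread), eligibility under boolean dependencies could be sketched like this. The nested-tuple expression format and the node names are assumptions made up for the example.

```python
from typing import Dict, List, Optional, Set, Tuple, Union

# A dependency expression is either a node name or a tuple such as
# ("and", expr, expr, ...), ("or", expr, ...), or ("not", expr).
Expr = Union[str, Tuple]


def satisfied(expr: Expr, done: Set[str]) -> bool:
    """Evaluate a dependency expression against the set of finished nodes."""
    if isinstance(expr, str):
        return expr in done
    op, *args = expr
    if op == "and":
        return all(satisfied(arg, done) for arg in args)
    if op == "or":
        return any(satisfied(arg, done) for arg in args)
    if op == "not":
        (arg,) = args
        return not satisfied(arg, done)
    raise ValueError(f"unknown operator: {op!r}")


def eligible(
    deps: Dict[str, Optional[Expr]], done: Set[str], submitted: Set[str]
) -> List[str]:
    """Return nodes that may be exposed to the scheduler right now."""
    return sorted(
        node
        for node, expr in deps.items()
        if node not in done
        and node not in submitted
        and (expr is None or satisfied(expr, done))
    )


# Hypothetical DAG: "report" needs "extract" and at least one of the loads.
deps = {
    "extract": None,
    "load_a": "extract",
    "load_b": "extract",
    "report": ("and", "extract", ("or", "load_a", "load_b")),
}
```

A single-threaded scheduler loop would call `eligible` after each completion and submit only the returned nodes to Nomad.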
Hey all, for those who missed our Nomad Virtual Day livestream last week: task dependencies are coming in Nomad 0.11, and folks will hear more about them in the coming weeks.
Here is a recording of the wonderful demo and presentation for reference that @jazzyfresh did on the feature - https://www.hashicorp.com/resources/preview-of-nomad-0-11-task-dependencies
For more complex dependencies as @recursionbane mentioned, we are targeting an integration with Apache Airflow to support such functionality.
That’s great news. @jazzyfresh I have a question related to this issue: I presume that if we wanted to have a database server up and running before the main task, we would declare it as a prestart sidecar task in Nomad v0.11. Does the new lifecycle-hook mechanism observe the Consul health check of the database service before moving on to the main lifecycle phase? Or would we need to leverage Apache Airflow for this?
@yishan-lin that's awesome! Prestart and Poststop hooks are definitely not just a nice-to-have, and I'm super happy that you added them.
However, I don't think that those hooks count as "task dependencies". Consider a group with 5 containers: one that needs to run before (prestart), one that needs to run after (poststop), and three others that need to be brought up in sequence. Prestart and poststop partition the scheduling space into 3 chunks, not N chunks like a true "task dependencies" addition would.
An example of this is how we bring up ZK/Kafka in our software (we run them on Nomad with host volumes). We have to submit two different jobs since there's no way to have "generic" task dependencies, so we're forced to wait until ZK's health check comes back before submitting the Kafka job. True task dependencies would allow us to coalesce them into one job.
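To make the "3 chunks versus N chunks" point concrete, here is a small sketch using Python's standard `graphlib` module (Python 3.9+). It is not Nomad functionality. The task names are hypothetical, and an edge means only "must come after", which glosses over the start/stop semantics of real lifecycle hooks.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set


def startup_stages(depends_on: Dict[str, Set[str]]) -> List[List[str]]:
    """Group tasks into stages; each stage needs only earlier stages."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        stages.append(ready)
        sorter.done(*ready)
    return stages


# Hypothetical five-container group: one task before, three in sequence,
# one after. Each key maps to the tasks it must come after.
five_containers = {
    "svc1": {"prestart"},
    "svc2": {"svc1"},
    "svc3": {"svc2"},
    "poststop": {"svc3"},
}

stages = startup_stages(five_containers)
```

With generic dependencies this group needs five ordered stages, whereas prestart/main/poststop alone would put `svc1`, `svc2`, and `svc3` into the same stage.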
Hey Dhash - you and I synced on this offline but recapping it here for visibility for all. The 5 container group example you mentioned is the kind of DAG functionality that I'd look for our Apache Airflow integration to cover, which is on our roadmap and coming soon!
We have worked around this well for our use case by using consul_service_health and nomad_job in Terraform.
We now use Terraform to submit all our Nomad jobs. The wait_for parameter of consul_service_health keeps the data dependency of the next Nomad job unfulfilled until all checks are passing, so that job is not submitted before then.
Hey @yishan-lin, I was just curious if there are any updates on the Airflow integration? We would love to see a Nomad executor!
Hi, does anyone here have experience using Nomad for scheduling Airflow tasks (or vice-versa)? I am looking to constrain resources of individual tasks within an Airflow DAG by isolating them with cgroups and namespaces provided by Nomad's exec driver. Any help, resources, or advice would be so very much appreciated! Thank you, all.
Interested in that as well
Any update on this?
Nomad lifecycle hooks have been shipped for a while now. There's still an open issue for cross-job dependencies, https://github.com/hashicorp/nomad/issues/545, that covers the other use cases described here. That's a bigger project and one we've had some discussions about, but it's not on our immediate roadmap either.
Tasks in a group sometimes need to be ordered to start up correctly.
For example, to support the Ambassador pattern, proxy containers (P[n]) used for outbound request routing by a dependent application may be started only after the dependent application (A) is started. This is because Docker needs to know the name of A to configure shared-container networking when launching P[n].
In the first approximation of the solution, ordering can be simple, e.g., by having the task list in a group be an array.
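As a rough sketch of why array order would be enough for the Ambassador case, the snippet below builds `docker run` command lines in order: A first, then each P[n] joined to A's network namespace via `--network container:<name>`. The container names and images are hypothetical, and the commands are only constructed here, not executed.

```python
from typing import List, Tuple

Task = Tuple[str, str]  # (container name, image)


def ordered_run_commands(app: Task, proxies: List[Task]) -> List[List[str]]:
    """Return docker commands in start order: the app, then its proxies."""
    app_name, app_image = app
    commands = [["docker", "run", "-d", "--name", app_name, app_image]]
    for proxy_name, proxy_image in proxies:
        commands.append(
            [
                "docker", "run", "-d",
                "--name", proxy_name,
                # The proxy shares A's network stack, so A must exist first.
                "--network", f"container:{app_name}",
                proxy_image,
            ]
        )
    return commands


# Hypothetical group, listed in the order the tasks should start.
commands = ordered_run_commands(
    ("app", "example/app:latest"),
    [("proxy-1", "example/proxy:latest"), ("proxy-2", "example/proxy:latest")],
)
```

Because each proxy command refers to A by name, running the list from top to bottom is the simple ordering the proposal asks for.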