hashicorp / nomad

Nomad is an easy-to-use, flexible, and performant workload orchestrator that can deploy a mix of microservice, batch, containerized, and non-containerized applications. Nomad is easy to operate and scale and has native Consul and Vault integrations.
https://www.nomadproject.io/

Ability to have nodes which can be used only for jobs with a specific constraint. #2299

Closed: cyrilgdn closed this issue 1 year ago

cyrilgdn commented 7 years ago

Reference: https://groups.google.com/forum/#!topic/nomad-tool/Nmv8LiMUnEg

It would be great to have a way to prevent jobs from running on a node unless they specify a constraint!

Quoted from the mailing list discussion:

Here is our (simplified) case:

We have 3 servers, A, B, and C, and we want only specific jobs on C.

Currently we use class on Nomad nodes for that:
A: class = "foo"
B: class = "foo"
C: class = "bar"

All our jobs specify constraints like:
constraint { attribute = "${node.class}" value = "foo" } # or "bar" if the job must be deployed on C

It works, but it's a bit constraining: only a few jobs must run on C, yet we have to put constraints on all of them. Furthermore, we want our developers (in the near future) to be able to write their own jobs, and we don't want those jobs to be deployed on the wrong server if they forget the constraint.

Thanks !
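
A minimal sketch of how that setup is typically expressed, assuming standard Nomad client config syntax (node names and class values are just the example's placeholders):

```hcl
# Client config on nodes A and B (one file per node, shown together here)
client {
  enabled    = true
  node_class = "foo"
}

# Client config on node C
client {
  enabled    = true
  node_class = "bar"
}
```

Every job then has to carry the constraint stanza quoted above, which is exactly the duplication the issue asks to avoid.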

dvusboy commented 7 years ago

Why do you need to put node.class constraints on all your Jobs/TaskGroups? That means a Job/TaskGroup can only run on "bar" or "foo" nodes, never on either interchangeably. Is that your intention?

cyrilgdn commented 7 years ago

@dvusboy We want no jobs to run on C except those with the specified constraint.

Currently, we can tell some jobs to go specifically to C with the constraint "${node.class}" = "bar", but any job without a constraint could be run on A, B, or C. We want those jobs to run only on A or B, so we have to put the constraint "${node.class}" = "foo" on them.

dvusboy commented 7 years ago

Or ${node.class} != "bar". It sounds like you want a config option at the Nomad client level that restricts the node to only allow tasks with the constraint ${node.class} == "bar".
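
In jobspec terms, that suggestion would look roughly like this (a sketch, not taken from the thread):

```hcl
# Keep a job off the "bar" nodes without pinning it to "foo"
constraint {
  attribute = "${node.class}"
  operator  = "!="
  value     = "bar"
}
```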

cyrilgdn commented 7 years ago

@dvusboy That's what I want indeed.

momania commented 7 years ago

This would be a very helpful feature. 👍

cetex commented 6 years ago

I agree, this would be a very nice feature.

We have a redundant Nomad setup, but a few nodes are highly specialized (different network config and the like, basically bridging our environment to other environments).

If we could set up a default constraint on those special nodes (for example, only jobs with class "bridge-node") we could make sure that no one deploys anything on those nodes unless they really mean to.

Ideally this should also be protected through the Vault integration so we could limit who is allowed to deploy to these network-bridging nodes.

The only solution right now seems to be to set up a dedicated Consul/Nomad cluster for these nodes to make it hard to make mistakes.

adragoset commented 6 years ago

This would be ideal. Here's what I've run into: I've got a handful of specialized build servers in my cluster. These servers are really big machines, 16+ cores and a ton of RAM, and they have specific hardware supporting a specific set of CPU instructions that I'm using to build and test a suite of applications optimized for machines with those instruction sets. Ideally I'd like only jobs tagged with a very specific constraint to deploy onto those machines. I can't trust that the other developers I give access to deploy jobs through deployment tools that interface with Nomad will always add a constraint to their jobs so that they don't get scheduled onto my specialized machines. Something has to be there so that nodes can enforce a set of node-specific rules about which jobs will get scheduled to them when job constraints aren't defined in a job definition.

pznamensky commented 6 years ago

Would be very useful for us too. We have tons of .nomad files and we'd rather not modify them all. It would be great to add some constraint/flag on the client side.

sandromehic commented 5 years ago

Any news on this feature? It seems like there are several good use cases. I'm interested in how other people are solving this on their clusters.

schmichael commented 5 years ago

This is the first time I've read this request, and it's interesting to me that Nomad doesn't really have a way to disable placements on a node by default except for jobs which explicitly target it.

You could use datacenters today to achieve this. If your normal datacenter is dc1, you could put the disable-by-default nodes in dc1-foo so that only jobs that explicitly state they want to be scheduled in datacenter dc1-foo actually end up on those nodes.

It's a little hacky but may be easier than trying to enforce proper constraints on every job.
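
Sketched in config form, with dc1 and dc1-foo being the example names from the comment above (group and task stanzas omitted):

```hcl
# Agent config on the disable-by-default nodes
datacenter = "dc1-foo"

client {
  enabled = true
}
```

```hcl
# Only jobs that explicitly list the special datacenter can land there
job "special" {
  datacenters = ["dc1-foo"]
  # ...
}

# Everything else keeps targeting the normal datacenter
job "ordinary" {
  datacenters = ["dc1"]
  # ...
}
```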

cyrilgdn commented 5 years ago

@schmichael I like this workaround, thanks for the idea!

pznamensky commented 5 years ago

Looks like in v0.9 you can manage job placement via affinities: https://www.nomadproject.io/guides/advanced-scheduling/affinity.html
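
For reference, an affinity stanza looks roughly like this; note that it expresses a soft preference, not a hard rule (weight can range from -100 to 100):

```hcl
# Prefer (but do not require) nodes of class "bar"
affinity {
  attribute = "${node.class}"
  value     = "bar"
  weight    = 100
}
```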

preetapan commented 5 years ago

Affinities are in the 0.9 beta and will be in the final release coming soon; closing this.

pznamensky commented 5 years ago

@preetapan, I think we should reopen the issue. Affinities can't manage placement from the Nomad client's point of view; they just behave like more flexible constraints. But if you would like to have nodes reserved for only a particular type of job, you still have to edit all the Nomad files in a cluster and add affinities/anti-affinities/constraints to them to prevent jobs from being placed on those nodes. That's quite hard to do in clusters with hundreds or thousands of jobs.

eigengrau commented 5 years ago

I would also opt to have this re-opened, since affinities are soft. If I understood correctly, this issue is about a hard, cluster-wide, implied constraint.

pznamensky commented 4 years ago

@preetapan @schmichael what do you think about reopening the issue?

schmichael commented 4 years ago

How is the datacenter workaround insufficient? https://github.com/hashicorp/nomad/issues/2299#issuecomment-459745159

pznamensky commented 4 years ago

At first it's just unobvious and, as you note, it's a workaround. But real problems come when you actually use multiple datacenters. For instance, we group all our alerts by datacenter; the workaround breaks this. Another case concerns the service catalog and service discovery: when our services communicate with others, they prefer local services. With this workaround, we can't do service discovery automatically, and we have to add exceptions and special policies. It's quite hard when you have more than one fake datacenter.

schmichael commented 4 years ago

Agreed this should remain open. I think another compelling use case is to actually suggest people allow Nomad servers to be clients as well. Then using this feature you could ensure the servers aren't considered for the vast majority of workloads, but you could still use system jobs for log shippers and monitoring tools on the servers.
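
As a sketch of that pattern, assuming the servers-as-clients are tagged with a node class such as "server" (a name not taken from the thread):

```hcl
# System job that should still run on the server nodes, e.g. a log shipper
job "log-shipper" {
  type = "system"

  constraint {
    attribute = "${node.class}"
    value     = "server"
  }
  # group/task stanzas omitted
}
```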

spuder commented 4 years ago

Another use case: access to restricted data.

I have a redis cluster that can only be accessed by certain services. Access to redis is controlled by iptables.

I need to have a subset of my nomad agents that have iptables rules allowing access to that redis cluster.

Option 1: new datacenter

I'd rather not use the datacenter workaround because I'm already using the datacenter primitive for my 2 co-located DCs. There is a 1:1 mapping between Consul DC and Nomad DC. Deviating from that would require training developers about the exception.

Option 2: Consul Connect + Envoy

Theoretically Envoy would be a viable option: I could use Consul ACLs to restrict access to the redis cluster to certain workloads. Unfortunately, redis in cluster mode requires one of the following

I've tried, and it's not possible to use a proxy with Envoy. There is experimental support for native redis in Envoy, but it doesn't work with the traditional redis clustering I use.

yishan-lin commented 4 years ago

Taking a look at this - thanks all for the input.

shantanugadgil commented 3 years ago

@schmichael the datacenter option sounds fine in theory, but that would mean I need a separate set of Consul server(s) for the dc-foo as well, correct?

For me, the Consul servers are also Nomad servers for my actual DC, and the use case is exactly what you have mentioned here: https://github.com/hashicorp/nomad/issues/2299#issuecomment-568120233

I want to run some trivial jobs (nomad system gc, etc.) on the Nomad servers but do NOT want any random jobs landing on the Nomad servers.

ygersie commented 3 years ago

Agree that this should really be a client option. There are many use cases where one wants to prevent jobs from being scheduled on a set of Nomad clients with a certain node class. Very simple example: you may want to dedicate a set of workers to the task of running Elasticsearch nodes, and you'd use a node class to pin these jobs to that set of nodes. Even though this can be accomplished using a datacenter, that has other issues with features like balancing placement using spread.

stenh0use commented 2 years ago

+1 For this feature please

seanamos commented 2 years ago

> Agreed this should remain open. I think another compelling use case is to actually suggest people allow Nomad servers to be clients as well. Then using this feature you could ensure the servers aren't considered for the vast majority of workloads, but you could still use system jobs for log shippers and monitoring tools on the servers.

This is our exact use case. We want to run a monitoring job on the nomad cluster servers, but no other job should ever be scheduled onto those clients.

We work around it at the moment by specifying a constraint on every job.

axsuul commented 2 years ago

My current workaround is to have my cluster nodes in a different datacenter than my member nodes, and it seems to work well. datacenters is a required attribute on each job anyway, so it's pretty explicit.

shantanugadgil commented 2 years ago

@axsuul Did you have to run a separate Consul DC as well? Or did setting a different datacenter for the specific nodes "just work"?

axsuul commented 2 years ago

All my nodes are within the same Consul datacenter so they can still communicate with each other. Consul datacenters don't seem to be related to Nomad datacenters. Just to clarify, all my Consul nodes are within the dc1 datacenter while my Nomad nodes are either within a managers1 or main1 datacenter.
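
In agent-config terms, that separation looks roughly like this (names are the ones from the comment; each snippet belongs to a different config file):

```hcl
# Consul agent config (all nodes)
datacenter = "dc1"

# Nomad agent config on the manager nodes
datacenter = "managers1"

# Nomad agent config on the regular worker nodes
datacenter = "main1"
```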

shantanugadgil commented 2 years ago

> All my nodes are within the same Consul datacenter so they can still communicate with each other. Consul datacenters don't seem to be related to Nomad datacenters. Just to clarify, all my Consul nodes are within the dc1 datacenter while my Nomad nodes are either within a managers1 or main1 datacenter.

TIL!!! Thanks for the specifics!

I have stuck to a fixed configuration since forever, and I guess I had a mental association between those config parameters!

mikevink commented 1 year ago

Just chiming in with support for this feature. I was quite surprised to realise that you can set up namespaces or flag nodes with a class, but then can't use those in ACLs and job restrictions.

Having a Nomad client only accept jobs from one namespace, and then adding an ACL on that namespace to restrict who can launch jobs, would be very useful.

mikenomitch commented 1 year ago

Hi, just wanted to provide an update here.

We are planning to ship a feature called Node Pools in 1.6 that will address some of this! We can post a technical design doc later with more details, but the idea is that you can have an additional (and optional) attribute on nodes called "node_pool". It will work similarly to node_class, except you will have to opt in to placing jobs onto non-default node_pools.

You will also be able to tie node_pool placement to namespaces, so you can tie ACL policies to placing jobs on the pool.

This might not be exactly what was requested, but I think it's close enough in spirit to add the 1.6 milestone.
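
For later readers, a sketch of the shape this took when node pools shipped in Nomad 1.6 (see the node pools documentation for the authoritative syntax; the pool name here is made up):

```hcl
# Client config: place the node in a non-default pool
client {
  enabled   = true
  node_pool = "gpu"
}
```

```hcl
# Jobspec: jobs must opt in to the pool explicitly; jobs that omit
# node_pool stay in the "default" pool and never land on these nodes
job "training" {
  node_pool = "gpu"
  # group/task stanzas omitted
}
```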

mikenomitch commented 1 year ago

Hey everybody, as I mentioned earlier, we're planning a relatively simple version of this request for 1.6. Each node can have an additional (and optional) attribute called "node_pool". It will work similarly to node_class, except you will have to opt in to placing jobs onto non-default node_pools.

One of the constraints of this approach is that each node can only be a part of a single node_pool. This works if you want to force job-writers to opt into specific pools for something like prod/dev/test, or exclude a set of nodes by default ("you must opt into the GPU node pool explicitly"), but there are some more complex use-cases this doesn't support. For instance, you couldn't have a series of "taints" and have to tolerate all of them at the jobspec level (excuse the K8s terminology 😄).

We're interested in learning more about how people would use more complex opt-in constraints, where you would have to opt in to multiple "pre-set" constraints (with an "AND"). Perhaps there's even a case where you might have to opt into one of several constraints (an "OR")? Or instances where you might be mutating the node's constraint set quite a bit.

If you've got a use case like this, please let us know in a comment! Also, if you feel like talking through your use case with the team, feel free to grab a time and we can chat!

mikenomitch commented 1 year ago

Not an exact match, but adding this link here so we can close this out once this is shipped: https://github.com/hashicorp/nomad/issues/11041

tgross commented 1 year ago

Shipped!