Scheduling Improvements

bcwaldon commented 10 years ago

UPDATE 9/24:

remove note about unfair scheduling as offering/bidding mechanism is gone
remove note about supporting memory-based scheduling

There are two major aspects of scheduling for fleet to focus on: resource scheduling and dependency scheduling.

As far as resource scheduling goes, fleet is not going to have a full-featured scheduler. We have no plans to support any resource-related parameters past the leveling of the number of units scheduled to a particular machine.
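To make "leveling of the number of units" concrete, here is a minimal Go sketch of count-based placement. It is not fleet's actual implementation: the function name `leastLoaded`, the map-of-counts input, and the tie-breaking by machine ID are all assumptions chosen for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// leastLoaded returns the ID of the machine currently running the fewest
// units. Ties are broken by machine ID so the result is deterministic.
// It returns "" when no machines are given.
// NOTE: hypothetical helper, not part of fleet's API.
func leastLoaded(unitCounts map[string]int) string {
	ids := make([]string, 0, len(unitCounts))
	for id := range unitCounts {
		ids = append(ids, id)
	}
	sort.Strings(ids)

	best := ""
	for i, id := range ids {
		if i == 0 || unitCounts[id] < unitCounts[best] {
			best = id
		}
	}
	return best
}

func main() {
	// Assumed starting state: machine-a is already busier than the others.
	counts := map[string]int{"machine-a": 3, "machine-b": 1, "machine-c": 1}

	// Place three new units, updating the count after each placement.
	for i := 0; i < 3; i++ {
		target := leastLoaded(counts)
		counts[target]++
		fmt.Printf("unit %d -> %s\n", i, target)
	}
}
```

Only the unit count is consulted; CPU, memory, and machine size play no part, which matches the stated scope of resource scheduling here.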

Dependency scheduling, however, is incredibly important to get right. The following are the currently-supported parameters:

MachineID bypasses the scheduler altogether and places a unit directly on a machine
MachineMetadata filters the list of possible machines to which a unit can be scheduled using key-value metadata
MachineOf provides affinity, scheduling a unit to the same machine as another unit
Conflicts provides anti-affinity, scheduling a unit to a different machine than any units that match a glob pattern
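
The four parameters above can be read as a filter over candidate machines. The Go sketch below illustrates that reading only; it is not fleet's code. The `Machine` and `Requirements` types and the `eligible` and `candidates` functions are hypothetical, and using `path.Match` for the Conflicts glob is an assumption about glob semantics.

```go
package main

import (
	"fmt"
	"path"
)

// Machine is a hypothetical view of one cluster member.
type Machine struct {
	ID       string
	Metadata map[string]string
	Units    []string // names of units already scheduled here
}

// Requirements mirrors the four parameters described above (hypothetical type).
type Requirements struct {
	MachineID       string
	MachineMetadata map[string]string
	MachineOf       string
	Conflicts       []string // glob patterns
}

// hasUnit reports whether the named unit is already on the machine.
func hasUnit(m Machine, name string) bool {
	for _, u := range m.Units {
		if u == name {
			return true
		}
	}
	return false
}

// conflicts reports whether any unit on the machine matches any pattern.
// A malformed pattern is treated as matching nothing.
func conflicts(m Machine, patterns []string) bool {
	for _, p := range patterns {
		for _, u := range m.Units {
			if ok, err := path.Match(p, u); err == nil && ok {
				return true
			}
		}
	}
	return false
}

// eligible applies the parameters in turn. MachineID short-circuits
// every other check, mirroring "bypasses the scheduler altogether".
func eligible(m Machine, r Requirements) bool {
	if r.MachineID != "" {
		return m.ID == r.MachineID
	}
	for k, v := range r.MachineMetadata {
		if m.Metadata[k] != v {
			return false
		}
	}
	if r.MachineOf != "" && !hasUnit(m, r.MachineOf) {
		return false
	}
	return !conflicts(m, r.Conflicts)
}

// candidates returns the IDs of all machines that satisfy r, in input order.
func candidates(ms []Machine, r Requirements) []string {
	var out []string
	for _, m := range ms {
		if eligible(m, r) {
			out = append(out, m.ID)
		}
	}
	return out
}

func main() {
	cluster := []Machine{
		{ID: "m1", Metadata: map[string]string{"region": "us-east"}, Units: []string{"web.1.service"}},
		{ID: "m2", Metadata: map[string]string{"region": "us-east"}, Units: []string{"db.service"}},
		{ID: "m3", Metadata: map[string]string{"region": "us-west"}},
	}
	req := Requirements{
		MachineMetadata: map[string]string{"region": "us-east"},
		Conflicts:       []string{"web.*.service"},
	}
	fmt.Println(candidates(cluster, req))
}
```

In this sketch, m1 is rejected by the Conflicts pattern and m3 by the metadata filter, so only m2 remains as a candidate.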

There are several ideas for new dependency-scheduling behaviors, which are enumerated below:

MachineOf should support multiple arguments (https://github.com/coreos/fleet/issues/727)
MachineOf should support wildcards (https://github.com/coreos/fleet/issues/494)
schedule a unit to a specific host, but never reschedule it (https://github.com/coreos/fleet/issues/667)
Requires/Wants/Before/After, similar to the official systemd Requires/Wants/Before/After, but behaving at the cluster level (https://github.com/coreos/fleet/issues/464)

stuart-warren commented 10 years ago

As far as resource scheduling goes, fleet is not going to have a full-featured scheduler.

Does CoreOS intend to support other schedulers?

Past the leveling of the number of units across the cluster, fleet will only take into account memory limits.

So it will put the same number of units onto each server? What if I have a few different specs of servers, some massively more powerful than others? Can I set some bias in the fleet config perhaps?

bcwaldon commented 10 years ago

@stuart-warren We definitely intend to provide a full solution here; we're just not going to make fleet solve everyone's scheduling problems.

The fleet scheduler supports metadata-based filtering, and the memory scheduling will be relative to the available memory of each machine independently.

dbason commented 10 years ago

@bcwaldon so what would be considered a full solution? Will we have the ability to weight units (if they require relatively more cpu than other units), or will this be something we need to implement outside of Fleet?

bcwaldon commented 10 years ago

At this time, we have no plans to support any resource-related parameters. If this is something you care about, you should explore something like kubernetes or mesos.

gust1n commented 10 years ago

I totally get your point about keeping the scheduler simple and instead letting others build more high-level tools to solve that. But what about some simple spreading of resources? We're using templates to support simple heroku-like scaling of processes. And we would rather not use the Conflicts fleet param to spread the jobs, since we then have to set limits. But very often if we scale a job to, say, 3, they all end up on the same host. And what is worse is that often all jobs of all services end up on the same host. This gives us a scenario where 1 host is under heavy load and the 2 others are not used.

Are there any plans for simple spreading of jobs across a cluster?

bcwaldon commented 10 years ago

@gust1n The current scheduler distributes units based on the current number of units scheduled to each host. Are you not experiencing this?

We've also started a discussion around how fleet can support external schedulers over here: https://github.com/coreos/fleet/issues/922. If you have any input, I'd appreciate it greatly.

gust1n commented 10 years ago

@bcwaldon Unfortunately (on stable) most of the time almost all jobs (except those with X-Fleet logic) end up on the same machine. On a machine restart they all migrate to the next one. Maybe what you're describing is not in the stable channel yet? What you described was what my request was all about: something simple that spreads the jobs. I solved it for now by using some X-Fleet conditions anyway.

bcwaldon commented 10 years ago

@gust1n yes, by "current scheduler" I mean fleet v0.7.0+. The stable channel will be updated soon.

jonboulle commented 9 years ago

Cross-post: see https://github.com/coreos/fleet/issues/922#issuecomment-65868802

coreos / fleet

Scheduling Improvements #747