camsas / firmament

The Firmament cluster scheduling platform
Apache License 2.0

short description about your work #44

Open cxxly opened 8 years ago

cxxly commented 8 years ago

I have read your PhD thesis, and it is brilliant. But it is too difficult to understand everything in it. Is there a concise description of your work that everyone can understand easily? I think that would make the project more attractive.

ms705 commented 8 years ago

Hi @cxxly,

Thanks! We definitely agree that reading the PhD thesis is not the best way to find out how Firmament works ;-)

We've been working on a shorter paper about Firmament, which is currently under anonymous submission. As soon as we know the outcome of its review, we will make it publicly available -- however, you can shoot us an email at firmament@camsas.org if you would like a private copy of the draft.

More generally, we are planning to work on three things over the summer that will help make the project more accessible:

  1. An extended technical report that describes how Firmament works, what design decisions we made, and how one implements new scheduling policies for it.
  2. A series of blog posts on the Firmament blog, which will explain our work in an accessible way, including demos that you can try yourself.
  3. Integration of Firmament with existing cluster management systems, so that it becomes easier to deploy in a production-ready environment. We already have a prototype for Kubernetes, and are looking at others, although we're a bit resource-constrained in terms of developing and maintaining multiple integrations.

In the meantime, here are some resources that explain Firmament, but which are currently maybe somewhat hard to find:

I definitely agree that we need to do a better job at explaining how to use Firmament as we transition from an academic research project to something more widely used. If you have questions about particular aspects, or ideas as to what we should prioritize, please do let us know!

cxxly commented 8 years ago

Thanks for your reply @ms705

Currently, I am doing some research based on your work, but with a focus on container scheduling. Here are some things that confuse me:

  1. What are features? What are policies? And what is a cost model? What is the relationship between them?
  2. What are the boundaries between different features, and how do they combine? Currently, a user who wants to use Firmament must understand and implement a complicated cost model, and different cost models may support the same features. Is there any difference between them? Why not let the user simply select some features and have Firmament combine them?
  3. Is there any difference when using Firmament for container scheduling? What changes would we need to make?

I agree with your plan to make the project more accessible. Here are some extra proposals:

Now, I'm doing some work on simplifying it for container scheduling and integrating it with SwarmKit :) !

ms705 commented 8 years ago

Hi @cxxly,

Sorry for the delayed response! Awesome to hear that you're working on a SwarmKit integration -- we're more than happy to help with that if you get stuck.

Here are the answers to your questions:

  1. In Firmament, a cost model defines a scheduling policy by defining the structure of the flow network which Firmament optimizes and by assigning costs and capacities to arcs. Firmament is designed to be extensible with pluggable cost models (i.e., the core implementation does not make any assumptions about what your cost model -- and thus, your scheduling policy -- is). Currently, cost models are implemented as C++ classes that implement the CostModelInterface interface. One useful research contribution would be a domain specific language for defining cost models that (i) makes it easier to define them, and (ii) allows cost models to be plugged in without having to recompile Firmament.
  2. I think you're asking whether Firmament could automatically "compose" different features (e.g., priority preemption, multi-dimensional resource requirements, interference awareness) into a custom cost model? This is an interesting avenue for future research, but is not trivially possible: some features are contradictory and could lead to unexpected interactions. For example, priority preemption and interference awareness might keep moving a task between machines unless the cost model specifies which of them takes priority. The best way I can think of supporting this would be to have a fixed list of features that the user must rank in some kind of priority order (which specifies how to break ties). This should be possible to implement, but investigating it is a small research project of its own.
  3. Using Firmament for container scheduling is not substantially different. In fact, we have done so in two instances: our Poseidon Kubernetes plug-in uses Firmament to place containers in Kubernetes, and we have a fork that uses LXC to start containers from the standalone Firmament cluster manager code. The main challenges with containers are to adapt the health monitoring and statistics collection code to target containers rather than processes. In the Kubernetes integration, we simply rely on Kubernetes to do all of this and only use the scheduling logic from Firmament.
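To make the "pluggable cost model" idea from answer 1 more concrete, here is a minimal sketch. It is not Firmament's actual `CostModelInterface`, which is larger and has different method signatures. `ToyCostModel`, `Task`, `Machine`, `CheapestArc` and the two example policies are all hypothetical names made up for illustration. The sketch also only compares the arc costs of a single task, whereas Firmament solves a min-cost flow problem over all tasks at once.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified types; not part of Firmament.
struct Task { std::string name; int priority; };
struct Machine { std::string name; int running_tasks; };

// Hypothetical stand-in for a cost model interface: a policy is expressed
// purely through the costs it assigns to arcs (lower cost is preferred).
class ToyCostModel {
 public:
  virtual ~ToyCostModel() = default;
  // Cost of the arc from a task to a machine.
  virtual int64_t TaskToMachineCost(const Task& t, const Machine& m) const = 0;
  // Cost of the arc that leaves the task unscheduled.
  virtual int64_t TaskToUnscheduledCost(const Task& t) const = 0;
};

// Policy A: spread load by making busy machines more expensive.
class LoadSpreadingCostModel : public ToyCostModel {
 public:
  int64_t TaskToMachineCost(const Task&, const Machine& m) const override {
    return m.running_tasks;
  }
  int64_t TaskToUnscheduledCost(const Task&) const override { return 1000; }
};

// Policy B: pack tasks tightly by making busy machines cheaper.
class PackingCostModel : public ToyCostModel {
 public:
  static constexpr int kSlotsPerMachine = 8;  // assumed machine capacity
  int64_t TaskToMachineCost(const Task&, const Machine& m) const override {
    assert(m.running_tasks <= kSlotsPerMachine);
    return kSlotsPerMachine - m.running_tasks;
  }
  int64_t TaskToUnscheduledCost(const Task&) const override { return 1000; }
};

// Policy-agnostic driver: returns the index of the cheapest machine for the
// task, or -1 if no machine is cheaper than leaving the task unscheduled.
int CheapestArc(const ToyCostModel& model, const Task& task,
                const std::vector<Machine>& machines) {
  int best = -1;
  int64_t best_cost = model.TaskToUnscheduledCost(task);
  for (std::size_t i = 0; i < machines.size(); ++i) {
    const int64_t cost = model.TaskToMachineCost(task, machines[i]);
    if (cost < best_cost) {
      best_cost = cost;
      best = static_cast<int>(i);
    }
  }
  return best;
}
```

The point of the sketch is that `CheapestArc` never needs to know which policy it is running: swapping `LoadSpreadingCostModel` for `PackingCostModel` changes the placement without touching the driver, which is the same separation the core scheduler has from its cost models.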

Hope that helps! Let us know if you have any other questions.

cxxly commented 8 years ago

Hi @ms705, thanks. I have found that the CoCo cost model supports many features, including priority preemption and interference awareness. I want to know how it avoids conflicts between these features. I can't understand this clearly from your thesis.

ms705 commented 8 years ago

Hi @cxxly,

In the CoCo cost model, priority takes precedence over interference avoidance, since it forms the dominant term of the cost vector (see p. 149, bottom and here). In other words, a task will be priority preempted by a higher-priority task even if it ends up re-scheduling in a place where it suffers interference.

(In my example above, I was referring to the fact that it's difficult to automatically combine features because you need to make a call on which one takes precedence in situations like this. CoCo defines the precedence order.)
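As a rough illustration of what "dominant term" means here, the sketch below uses a two-element cost vector in which the priority term always wins and the interference term only breaks ties. This is an assumption-laden toy, not CoCo's code: the real cost vector has more dimensions, and `CostVector`, `CheaperThan`, `Flatten` and the `interference_bound` parameter are hypothetical names introduced for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <tuple>

// Hypothetical two-term cost vector (lower values are preferred).
struct CostVector {
  int64_t priority_term;      // dominant term
  int64_t interference_term;  // only matters when priority terms are equal
};

// Lexicographic comparison: the priority term decides first, and the
// interference term is consulted only to break ties.
inline bool CheaperThan(const CostVector& a, const CostVector& b) {
  return std::tie(a.priority_term, a.interference_term) <
         std::tie(b.priority_term, b.interference_term);
}

// One possible way to collapse the vector into a single integer arc cost
// while preserving the lexicographic order: scale the dominant term by a
// strict upper bound on the other term.
inline int64_t Flatten(const CostVector& v, int64_t interference_bound) {
  assert(v.interference_term >= 0 && v.interference_term < interference_bound);
  return v.priority_term * interference_bound + v.interference_term;
}
```

With this ordering, a placement with vector `{1, 90}` (good on priority, heavy interference) is still cheaper than one with `{2, 0}`, which mirrors the behaviour described above: priority wins even if the task ends up somewhere it suffers interference. Ranking more than two features in a user-chosen order would extend the same idea to a longer vector.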

Does that make sense?