Flexible per-job task trackers

tarnfeld commented 10 years ago

This is something I've been thinking about for a while, and I'd be interested if anyone else (@brndnmtthws @florianleibert?) has any contributions to the idea.

Currently the Hadoop on Mesos framework treats every job equally: each job is assumed to need the same CPU and the same memory, and all jobs run in the same TaskTracker (Mesos task) environment. This is not actually always the case, and making the framework more intelligent could reap some great benefits...

1. Per-job environment configuration.
   - This is even more useful when you think about Docker+Mesos. The ability to use different Docker images for different jobs with different dependencies is very powerful.
2. Lean resource consumption.
   - At the moment you need to cater for the peaks for best performance, especially when dealing with memory. If one out of 100 jobs requires 10x more RAM, they must all be allocated the max memory.
3. The ability to make use of Mesos resource roles.
   - This is incredibly powerful as you could essentially tag a set of resources to only be used by a specific set of M/R jobs. Given certain types of SLAs and very long running jobs (>24 hours) this is a useful thing to have... and not currently possible.
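
To put rough numbers on the second point, here is a minimal sketch. It assumes a hypothetical baseline of 1 GB per job, one outlier needing 10x that, and all 100 jobs holding memory at the same time; the figures are illustrative, not measurements from a real cluster.

```python
# Illustrative figures only (assumptions, not measurements).
BASE_MEM_GB = 1      # hypothetical memory need of a typical job
PEAK_FACTOR = 10     # the one outlier job needs 10x more RAM
NUM_JOBS = 100

# Per-job memory demand: 99 typical jobs and one outlier.
demands = [BASE_MEM_GB] * (NUM_JOBS - 1) + [BASE_MEM_GB * PEAK_FACTOR]

# Today: every job is allocated for the peak.
uniform_total = max(demands) * len(demands)

# With per-job sizing: each job gets only what it asks for.
per_job_total = sum(demands)

print(uniform_total, per_job_total)
```

Under these assumptions, sizing everything for the peak reserves 1000 GB where 109 GB would be enough.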

Of course spinning up a Hadoop TT for every job might be a little excessive, so the scheduler could be more intelligent and bucket types of jobs to types of task trackers. The Job.Task->TaskTracker assignment would need to change too, I guess.
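
A rough sketch of what that bucketing could look like, assuming each job carries a small profile (image, CPU, memory, role). `JobProfile`, `bucket_jobs`, the image names and the role name are all hypothetical and are not part of the existing framework; jobs with identical profiles land in the same bucket and could share a TaskTracker type.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class JobProfile(NamedTuple):
    """Hypothetical per-job requirements, e.g. read from the jobconf."""
    image: str    # Docker image the job's tasks need
    cpus: float   # CPUs per task slot
    mem_mb: int   # memory per task slot
    role: str     # Mesos role whose resources the job may use


def bucket_jobs(jobs: Dict[str, JobProfile]) -> Dict[JobProfile, List[str]]:
    """Group job IDs by identical profile, one bucket per TaskTracker type."""
    buckets = defaultdict(list)
    for job_id, profile in jobs.items():
        buckets[profile].append(job_id)
    return dict(buckets)


# Made-up jobs for illustration.
jobs = {
    "job_1": JobProfile("hadoop-base", 1.0, 1024, "*"),
    "job_2": JobProfile("hadoop-base", 1.0, 1024, "*"),
    "job_3": JobProfile("hadoop-numpy", 1.0, 10240, "*"),
    "job_4": JobProfile("hadoop-base", 1.0, 1024, "long-running"),
}
buckets = bucket_jobs(jobs)
for profile, job_ids in buckets.items():
    print(profile, job_ids)
```

Here four jobs need only three TaskTracker types, since `job_1` and `job_2` can share. Exact-match bucketing is the simplest choice; a real scheduler might also round memory up to a few size classes so that near-identical jobs share too.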

In doing this the framework starts to become on par with YARN, or even more efficient, as we're able to share TTs between jobs that can share. As far as I'm aware YARN will launch a little JT and TTs for each job you submit? I'm probably wrong though.

The third point (roles) is the one I'm most interested in seeing first.

(Perhaps something for the #MesosCon hackathon :smile:)

brndnmtthws commented 10 years ago

I think having separate per-job Docker images would be huge, as would being able to change resources via the jobconf.

However, I should mention that the trouble with both of these is that you must assume the user is savvy enough or cares enough to want to tinker with these things.

The third suggestion is less valuable IMO, because Hadoop's fair scheduler already does a pretty good job of this (if I understand what you're saying).

tarnfeld commented 10 years ago

> The third suggestion is less valuable IMO, because Hadoop's fair scheduler already does a pretty good job of this (if I understand what you're saying).

Kind of. I'm only aware of the "pools" feature of the fair scheduler (so correct me if I'm wrong), which we're using to stop users from consuming the whole Hadoop cluster and starving production.

The use case I have, for example: if we have a large series of production jobs that consume resources for >24 hours, I don't want those tasks launching on our normal fast-turnaround resources (i.e. it's fair there since resources are released quickly). I'd like to launch and segment off some Mesos resources just for running these tasks, then turn them off afterwards, as that makes things like SLAs and cost management simpler, especially when I know the job will complete within time T if given resources R.
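
As a sketch of that segmentation: assume each resource in an offer is tagged with a role, with `*` as the default unreserved role, and that a job may only use resources whose role matches its own. The plain dicts, the `usable_resources` helper and the `long-running` role name are stand-ins for illustration, not the real Mesos API or the framework's current behaviour.

```python
from typing import Dict, List


def usable_resources(offer: List[Dict], job_role: str) -> Dict[str, float]:
    """Sum the scalar resources in an offer whose role matches job_role."""
    totals: Dict[str, float] = {}
    for res in offer:
        if res["role"] == job_role:
            totals[res["name"]] = totals.get(res["name"], 0.0) + res["value"]
    return totals


# A made-up offer mixing unreserved ("*") and reserved resources.
offer = [
    {"name": "cpus", "value": 4.0, "role": "*"},
    {"name": "mem", "value": 8192.0, "role": "*"},
    {"name": "cpus", "value": 16.0, "role": "long-running"},
    {"name": "mem", "value": 65536.0, "role": "long-running"},
]

long_running = usable_resources(offer, "long-running")
normal = usable_resources(offer, "*")
print(long_running, normal)
```

The strict match is a deliberate simplification: the >24 hour jobs never touch the fast-turnaround pool, and normal jobs never touch the reserved one, so the reserved resources can be turned off once the long jobs finish.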

andrewrjones commented 9 years ago

We also have the same requirement as @tarnfeld in our cluster and are planning to make use of Mesos roles to reserve resources for particular Hadoop jobs.

We'll likely need this in place next year and may be able to spend some time implementing it, unless it's been done by someone else before then ;)

mesos / hadoop

Flexible per-job task trackers #31