Closed mjwillson closed 9 years ago
The local-mr! function is for modifying a configuration to use the local in-process MapReduce implementation for REPL-testing "mixed-mode" jobs. The goal is to take a configuration that describes how to use a remote cluster (HDFS and the MapReduce framework/jobtracker), then replace just enough of the MR portion to run jobs which (a) can access HDFS, but (b) run locally in-process. Does that clarify things, or am I missing cases where additional non-default configuration is needed to run jobs within a REPL process?
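To make the "mixed-mode" idea concrete, here is a minimal sketch using plain Hadoop `Configuration` interop. It is not parkour's actual implementation of local-mr!: the helper name `local-mr-sketch!` and the namenode host are hypothetical, and the real function may set more properties than the two shown here.

```clojure
(import 'org.apache.hadoop.conf.Configuration)

;; Hypothetical helper, NOT parkour's actual local-mr! implementation.
(defn local-mr-sketch!
  "Mutate `conf` so MapReduce jobs run in-process; leave HDFS settings untouched."
  [^Configuration conf]
  (doto conf
    (.set "mapreduce.framework.name" "local") ; Hadoop 2: run jobs in-process
    (.set "mapred.job.tracker" "local")))     ; Hadoop 1 equivalent

;; Assumed cluster settings; the namenode host is a placeholder.
(def conf
  (doto (Configuration.)
    (.set "fs.defaultFS" "hdfs://namenode.example.com:8020")
    (.set "mapreduce.framework.name" "yarn")))

(local-mr-sketch! conf)

(.get conf "mapreduce.framework.name") ; now "local"
(.get conf "fs.defaultFS")             ; unchanged, still the HDFS URI
```

Only the MapReduce keys change, so a job built from this configuration would still resolve paths against the remote filesystem.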
Ah, OK, I think I got the wrong end of the stick: I assumed that local-mr! was required to run a job in local mode in general, but it's just for this special mixed mode. I guess local mode is the default if I just use (parkour.conf/configuration).
Cheers for the clarification anyway. Perhaps it's another one of those little things that people who know Hadoop are assumed to know, but which isn't otherwise obvious :)
I was wondering why config properties like hadoop.tmp.dir are overridden in parkour.conf/local-mr! here:
https://github.com/damballa/parkour/blob/master/src/clojure/parkour/conf.clj#L188
From what I can see, these are already the default values (at least in Hadoop 2.4.0), but hard-coding them here means that I can't override the tmpdir in my local site config.
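A small sketch of why the hard-coding matters, again with plain Hadoop `Configuration` interop rather than parkour's code. The `/data/scratch/hadoop` path is a made-up stand-in for a value from a local site config, and the second `.set` stands in for whatever a helper hard-codes.

```clojure
(import 'org.apache.hadoop.conf.Configuration)

;; Passing `false` skips the default resources, so this runs without site files.
(def conf (Configuration. false))

;; Stand-in for a value that would come from a local site config (hypothetical path).
(.set conf "hadoop.tmp.dir" "/data/scratch/hadoop")

;; Stand-in for a helper that hard-codes the stock default afterwards.
(.set conf "hadoop.tmp.dir" "/tmp/hadoop-${user.name}")

;; The later explicit set wins, so the site value is gone.
(.get conf "hadoop.tmp.dir")
```

Because an explicit `.set` replaces whatever was loaded earlier, the site setting cannot survive unless the helper leaves the key alone or only sets it when it is absent.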