Closed andrewmilkowski closed 11 years ago
Have you tried letting this computation finish? IIRC, the reducer progress bar really covers 3 separate stages: copy (the shuffle), sort, and reduce. The actual computation happens in the last stage, so the tree fitting is only really starting at 67%.
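To make the progress arithmetic concrete, here is a rough sketch. It assumes each of the three stages contributes an equal third of the reported reduce progress and that the stages are the ones Hadoop reports as copy, sort, and reduce. The function name `reduce_phase` is made up for illustration; it is not part of Hadoop or rmr2.

```python
# Hypothetical helper: map overall reduce-task progress to a stage.
# Assumption: three stages (copy, sort, reduce), each worth one third.
PHASES = ("copy", "sort", "reduce")


def reduce_phase(progress):
    """Return (stage name, fraction completed within that stage).

    `progress` is the overall reduce progress as a fraction in [0, 1].
    """
    if not 0.0 <= progress <= 1.0:
        raise ValueError("progress must be between 0 and 1")
    scaled = progress * len(PHASES)
    # Clamp so that progress == 1.0 still falls in the last stage.
    index = min(int(scaled), len(PHASES) - 1)
    return PHASES[index], scaled - index


if __name__ == "__main__":
    for value in (0.22222224, 0.5, 0.66767067):
        name, within = reduce_phase(value)
        print(f"{value:.4f} -> {name} ({within:.1%} of that stage)")
```

Under these assumptions, a reducer sitting just above 0.667 has only just entered the reduce stage, which is where the R code actually runs.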
I believe the computation will finish (I just value my Mac motherboard). I have increased the number of reducers to 10, but do note, as per my comment on the blog...
"I also want to note that, seeing only one reduce task being created, I tried setting D="mapred.reduce.tasks=10" in the mapred function.
However, this caused only 2 reduce tasks to be created (2 R processes). This is still far too few; it won't scale...
24497 mapred 20 0 667m 481m 4284 R 97.8 12.6 4:58.21 R
24492 mapred 20 0 667m 481m 4284 R 97.1 12.6 4:43.82 R
Hence this is looking less and less like an rmr2 issue and more like a combination of the theoretical underpinnings of random forest and the entropy/structure of the input data."
So the computation will finish for sure; it did finish once I cut down the input data. The question is much larger, though.
Could you open a separate issue for this? Also, could you include your code? (Specifically, how you're specifying the number of reducers.)
Sure, will do (new ticket), Uri! Thanks for looking into this.
Hi
I am seeing the following in the tasktracker log (while running fitRandomForrest.R):
tail -f hadoop-hadoop-tasktracker-localhost.localdomain.log
2013-09-29 11:11:38,571 INFO org.apache.hadoop.mapred.JvmManager: JVM : jvm_201309291106_0001m-1871339866 exited with exit code 0. Number of tasks it ran: 1
2013-09-29 11:11:41,190 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201309291106_0001_r_000000_0 0.22222224% reduce > copy (2 of 3 at 49.50 MB/s) >
2013-09-29 11:11:44,708 INFO org.apache.hadoop.mapred.TaskTracker.clienttrace: src: 127.0.0.1:50060, dest: 127.0.0.1:42419, bytes: 342109108, op: MAPRED_SHUFFLE, cliID: attempt_201309291106_0001_m_000002_0, duration: 1901216344
2013-09-29 11:11:45,373 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201309291106_0001_r_000000_0 0.22222224% reduce > copy (2 of 3 at 49.50 MB/s) >
2013-09-29 11:11:45,422 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201309291106_0001_r_000000_0 0.22222224% reduce > copy (2 of 3 at 49.50 MB/s) >
2013-09-29 11:11:47,258 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201309291106_0001_r_000000_0 0.66767067% reduce > reduce
2013-09-29 11:11:50,302 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201309291106_0001_r_000000_0 0.66767067% reduce > reduce
2013-09-29 11:11:59,379 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201309291106_0001_r_000000_0 0.66767067% reduce > reduce
2013-09-29 11:12:02,447 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201309291106_0001_r_000000_0 0.66767067% reduce > reduce
2013-09-29 11:12:56,567 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201309291106_0001_r_000000_0 0.66767067% reduce > reduce
2013-09-29 11:14:44,687 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201309291106_0001_r_000000_0 0.66767067% reduce > reduce
The reducer eventually moves to 68%, but something seems to have gone wrong internally.
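One way to quantify how long the reducer has been sitting at the same progress value is to parse the TaskTracker progress lines. The sketch below is only an illustration: the regular expression is inferred from the log lines in this post rather than from any Hadoop specification, the function names are made up, and it assumes the records passed to `stalled_seconds` all belong to a single task attempt. Note that the logged value appears to be a fraction of 1 despite the trailing percent sign (0.667 corresponds to the 67% shown in the UI).

```python
import re
from datetime import datetime

# Pattern inferred from the TaskTracker lines above (an assumption,
# not an official format). JvmManager and clienttrace lines do not match.
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3} INFO "
    r"org\.apache\.hadoop\.mapred\.TaskTracker: "
    r"(?P<attempt>attempt_\S+) (?P<progress>[\d.]+)% (?P<status>.*)$"
)


def parse_progress(lines):
    """Return (timestamp, attempt id, progress, status) for each progress line."""
    records = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip anything that is not a progress report
        stamp = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S")
        records.append((stamp, match.group("attempt"),
                        float(match.group("progress")),
                        match.group("status").strip()))
    return records


def stalled_seconds(records):
    """Seconds during which the latest progress value has not changed.

    Assumes `records` is in log order and covers one task attempt.
    """
    if not records:
        return 0.0
    last_stamp, _, last_progress, _ = records[-1]
    first_stamp = last_stamp
    for stamp, _, progress, _ in reversed(records):
        if progress != last_progress:
            break
        first_stamp = stamp
    return (last_stamp - first_stamp).total_seconds()


if __name__ == "__main__":
    # Hypothetical path; point this at the actual tasktracker log file.
    with open("hadoop-hadoop-tasktracker-localhost.localdomain.log") as handle:
        print(stalled_seconds(parse_progress(handle)))
```

A long stall at the start of the reduce stage is not by itself proof of a failure, since the R process may simply still be computing; it is worth checking alongside CPU usage of the R processes.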
My environment:
[amilkowski@localhost hadoop]$ uname -a
Linux localhost.localdomain 2.6.32-358.18.1.el6.x86_64 #1 SMP Tue Aug 27 14:23:09 CDT 2013 x86_64 x86_64 x86_64 GNU/Linux
I am using the Cloudera distribution: 0.20.2 cdh3u6
Please advise. If more log samples or environment data are needed, please ask and I will provide them.
I am also trying to run debugging.R. What would be the procedure to generate training.small.csv as input to troubleshoot this further?
thanks much!