phgolardbd / SEBC

cloudera bootcamp - repo phgolard

SEBC Evaluation: Did Not Pass #11

Open mfernest opened 6 years ago

mfernest commented 6 years ago

Hello Pierre-Henri

As you'll see in the reviewed Issues, we marked the first three challenge stages complete. The fourth stage followed the steps but had a critical error in the teragen command invocation. The subsequent steps were intended as prompts to identify the problem, possibly by calculating ahead of time the expected output: the correct size of each file, the correct number of 64 MB blocks, and so on.

A provisional pass, as we call it, requires the first four stages to be complete and free of critical errors.

You are welcome to retake the exam if you wish. There is no additional cost for a retake and you are not expected to attend the course again. Please contact me (mfe@cloudera.com) if you wish to attempt the exam again.

Thank you for all your attention and your efforts during the course. It's clear from what you've done so far that it would not take much more work to pass the challenge.

Regards,

Michael & Claudio
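
The ahead-of-time calculation suggested above can be sketched in a few lines. This is a minimal sketch; it assumes only that teragen writes fixed 100-byte rows (10-byte key plus 90-byte value) and one output file per map task:

```python
# Sanity-check a teragen invocation before running it.
# teragen generates fixed 100-byte rows, so total output size is rows * 100.
ROW_BYTES = 100

def teragen_expectations(rows, num_maps, block_size):
    """Return (total_bytes, bytes_per_file, blocks_per_file) for a teragen run."""
    total = rows * ROW_BYTES
    per_file = total // num_maps          # teragen writes one file per map task
    blocks = -(-per_file // block_size)   # ceiling division: HDFS blocks per file
    return total, per_file, blocks

# The invocation under review: 65,536 rows, 16 maps, 64,000,000-byte blocks.
total, per_file, blocks = teragen_expectations(65536, 16, 64_000_000)
print(total, per_file, blocks)  # 6553600 409600 1
```

Those numbers match the logs below: 6,553,600 bytes written in total, 409,600 bytes per part file, and one block per file, i.e. only about 6.5 MB of data, far too little to exercise the 64 MB block size.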

phgolardbd commented 6 years ago

Hi Michael,

I do want to retake the exam, but I'll probably need some time to get ready for it (I think I'm going to retake it at the end of May, since I'm very busy at the moment).

I would also like to find out more about the following points:

[phadmin@elephant ~]$ sudo su - hilary
[hilary@elephant ~]$ time hadoop jar /opt/cloudera/parcels/CDH-5.13.2-1.cdh5.13.2.p0.3/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar teragen -Dmapred.reduce.tasks=16 -Dmapreduce.map.memory.mb=768 -Dmapreduce.reduce.memory.mb=768 -Ddfs.blocksize=64000000 65536 /user/hilary/tgen
18/03/16 10:40:28 INFO client.RMProxy: Connecting to ResourceManager at horse.cdh-bootcamp-phg/10.3.4.6:8032
18/03/16 10:40:29 INFO terasort.TeraGen: Generating 65536 using 16
18/03/16 10:40:29 INFO mapreduce.JobSubmitter: number of splits:16
18/03/16 10:40:29 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
18/03/16 10:40:29 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
18/03/16 10:40:29 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1521195347957_0001
18/03/16 10:40:30 INFO impl.YarnClientImpl: Submitted application application_1521195347957_0001
18/03/16 10:40:30 INFO mapreduce.Job: The url to track the job: http://horse.cdh-bootcamp-phg:8088/proxy/application_1521195347957_0001/
18/03/16 10:40:30 INFO mapreduce.Job: Running job: job_1521195347957_0001
18/03/16 10:40:37 INFO mapreduce.Job: Job job_1521195347957_0001 running in uber mode : false
18/03/16 10:40:37 INFO mapreduce.Job:  map 0% reduce 0%
18/03/16 10:40:43 INFO mapreduce.Job:  map 13% reduce 0%
18/03/16 10:40:45 INFO mapreduce.Job:  map 31% reduce 0%
18/03/16 10:40:46 INFO mapreduce.Job:  map 44% reduce 0%
18/03/16 10:40:47 INFO mapreduce.Job:  map 81% reduce 0%
18/03/16 10:40:48 INFO mapreduce.Job:  map 94% reduce 0%
18/03/16 10:40:50 INFO mapreduce.Job:  map 100% reduce 0%
18/03/16 10:40:50 INFO mapreduce.Job: Job job_1521195347957_0001 completed successfully
18/03/16 10:40:50 INFO mapreduce.Job: Counters: 31
   File System Counters
      FILE: Number of bytes read=0
      FILE: Number of bytes written=2362550
      FILE: Number of read operations=0
      FILE: Number of large read operations=0
      FILE: Number of write operations=0
      HDFS: Number of bytes read=1326
      HDFS: Number of bytes written=6553600
      HDFS: Number of read operations=64
      HDFS: Number of large read operations=0
      HDFS: Number of write operations=32
   Job Counters
      Launched map tasks=16
      Other local map tasks=16
      Total time spent by all maps in occupied slots (ms)=91326
      Total time spent by all reduces in occupied slots (ms)=0
      Total time spent by all map tasks (ms)=91326
      Total vcore-milliseconds taken by all map tasks=91326
      Total megabyte-milliseconds taken by all map tasks=93517824
   Map-Reduce Framework
      Map input records=65536
      Map output records=65536
      Input split bytes=1326
      Spilled Records=0
      Failed Shuffles=0
      Merged Map outputs=0
      GC time elapsed (ms)=752
      CPU time spent (ms)=11040
      Physical memory (bytes) snapshot=3768381440
      Virtual memory (bytes) snapshot=21700960256
      Total committed heap usage (bytes)=7319060480
   org.apache.hadoop.examples.terasort.TeraGen$Counters
      CHECKSUM=140678493208567
   File Input Format Counters
      Bytes Read=0
   File Output Format Counters
      Bytes Written=6553600

real   0m24.716s
user   0m5.112s
sys    0m0.436s
[hilary@elephant ~]$ hdfs dfs -ls /user/hilary/tgen
Found 17 items
-rw-r--r--   3 hilary hilary          0 2018-03-16 10:40 /user/hilary/tgen/_SUCCESS
-rw-r--r--   3 hilary hilary     409600 2018-03-16 10:40 /user/hilary/tgen/part-m-00000
-rw-r--r--   3 hilary hilary     409600 2018-03-16 10:40 /user/hilary/tgen/part-m-00001
-rw-r--r--   3 hilary hilary     409600 2018-03-16 10:40 /user/hilary/tgen/part-m-00002
-rw-r--r--   3 hilary hilary     409600 2018-03-16 10:40 /user/hilary/tgen/part-m-00003
-rw-r--r--   3 hilary hilary     409600 2018-03-16 10:40 /user/hilary/tgen/part-m-00004
-rw-r--r--   3 hilary hilary     409600 2018-03-16 10:40 /user/hilary/tgen/part-m-00005
-rw-r--r--   3 hilary hilary     409600 2018-03-16 10:40 /user/hilary/tgen/part-m-00006
-rw-r--r--   3 hilary hilary     409600 2018-03-16 10:40 /user/hilary/tgen/part-m-00007
-rw-r--r--   3 hilary hilary     409600 2018-03-16 10:40 /user/hilary/tgen/part-m-00008
-rw-r--r--   3 hilary hilary     409600 2018-03-16 10:40 /user/hilary/tgen/part-m-00009
-rw-r--r--   3 hilary hilary     409600 2018-03-16 10:40 /user/hilary/tgen/part-m-00010
-rw-r--r--   3 hilary hilary     409600 2018-03-16 10:40 /user/hilary/tgen/part-m-00011
-rw-r--r--   3 hilary hilary     409600 2018-03-16 10:40 /user/hilary/tgen/part-m-00012
-rw-r--r--   3 hilary hilary     409600 2018-03-16 10:40 /user/hilary/tgen/part-m-00013
-rw-r--r--   3 hilary hilary     409600 2018-03-16 10:40 /user/hilary/tgen/part-m-00014
[hilary@elephant ~]$ hadoop fsck -blocks /user/hilary
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.

Connecting to namenode via http://elephant.cdh-bootcamp-phg:50070/fsck?ugi=hilary&blocks=1&path=%2Fuser%2Fhilary
FSCK started by hilary (auth:SIMPLE) from /10.3.4.5 for path /user/hilary at Fri Mar 16 10:45:02 UTC 2018
.................Status: HEALTHY
 Total size:   6553600 B
 Total dirs:   3
 Total files:  17
 Total symlinks:      0
 Total blocks (validated): 16 (avg. block size 409600 B)
 Minimally replicated blocks:  16 (100.0 %)
 Over-replicated blocks:   0 (0.0 %)
 Under-replicated blocks:  0 (0.0 %)
 Mis-replicated blocks:       0 (0.0 %)
 Default replication factor:   3
 Average block replication:    3.0
 Corrupt blocks:      0
 Missing replicas:    0 (0.0 %)
 Number of data-nodes:    4
 Number of racks:     1
FSCK ended at Fri Mar 16 10:45:02 UTC 2018 in 7 milliseconds

The filesystem under path '/user/hilary' is HEALTHY

Could you please let me know what I did wrong?

Thanks in advance!!

Kind regards,

Pierre-Henri Golard
Consultant - Data Engineering
[mobile] +32 478 918221
pierre-henri.golard@businessdecision.be
