LinuxFoundationX: LFS103x Introduction to Apache Hadoop
Knowledge Check 5.1
1.0/1.0 point (graded)
How can Hadoop be deployed? Select all answers that apply.
A. Public Cloud Mode
B. Local Mode
C. Virtual Machine Mode
D. Distributed Mode
E. Pseudo-Distributed Mode
Knowledge Check 5.2
1.0/1.0 point (graded)
In local mode, Hadoop uses data from _____. Select the correct answer.
A. Blocks stored in HDFS
B. Files stored in your local filesystem
C. Web via REST APIs
D. S3
Knowledge Check 5.3
1.0/1.0 point (graded)
Local mode requires a running YARN. True or False?
A. True
B. False
Knowledge Check 5.4
0.0/1.0 point (graded)
Which of the following are command line utilities for parallel data processing frameworks? Select all answers that apply.
A. hadoop
B. hdfs
C. spark-shell
D. hive
E. yarn
Knowledge Check 5.5
1.0/1.0 point (graded)
How can you describe the pseudo-distributed mode? Select all answers that apply.
A. A distributed mode running on the world's smallest cluster
B. A mode that requires you to use the "hdfs" and "yarn" command line utilities to deploy Hadoop
C. A mode that only works for the MapReduce parallel data processing framework
D. A mode that requires you to run in a Docker container
Knowledge Check 5.6
1.0/1.0 point (graded)
Which of the following tools help with fully distributed Hadoop deployments? Select all answers that apply.
A. Apache Spark
B. Apache Hive
C. Apache Ambari
D. Docker
E. Apache Bigtop
Key Points to Remember
The main ideas we discussed in this chapter are summarized below:
Getting hands-on with Hadoop is very easy
Deploying Hadoop requires installing Java
The local mode of Hadoop deployment is very lightweight and it:
…uses the local filesystem as though it were HDFS
…does not require YARN
…is supported by all parallel data processing frameworks, although they cannot really run in parallel
The distributed mode of Hadoop deployment is how real production clusters run
You can use either Apache Ambari or Apache Bigtop to automate deploying Hadoop in a distributed mode
The pseudo-distributed mode of Hadoop deployment is in-between the local and distributed modes
You can think of it as a distributed mode running on the world’s smallest cluster: one node
The hadoop command line utility is used for running MapReduce jobs via hadoop jar MapReduceJob.jar options…
The hdfs command line utility is used for working with data stored in HDFS via hdfs dfs -options…
The hdfs and yarn command line utilities are used for managing HDFS and YARN services
The spark-shell command line utility is used for working with Apache Spark
The hive command line utility is used for working with Apache Hive
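The utilities listed above can be sketched in a short command-line session. This is an illustrative example only: the jar name, class name, and file paths are hypothetical placeholders, not part of the course material, and the commands assume a working Hadoop 3.x installation.

```shell
# Run a MapReduce job packaged as a jar (jar, class, and paths are hypothetical)
hadoop jar my-job.jar com.example.WordCount /input /output

# Work with data stored in HDFS: list, upload, and read files
hdfs dfs -ls /
hdfs dfs -put localfile.txt /input/
hdfs dfs -cat /output/part-r-00000

# Manage the HDFS and YARN services, e.g. in pseudo-distributed mode
hdfs --daemon start namenode
yarn --daemon start resourcemanager

# Launch the interactive shells for Apache Spark and Apache Hive
spark-shell
hive
```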