amplab / training

Training materials for Strata, AMP Camp, etc

MapPartitionsRDD instead of MappedRDD in "Data Explorations" #197

Closed by sbenthall 8 years ago

sbenthall commented 8 years ago

The instructions show this output:

scala> sc
res: spark.SparkContext = spark.SparkContext@470d1f30

scala> val pagecounts = sc.textFile("data/pagecounts")
12/08/17 23:35:14 INFO mapred.FileInputFormat: Total input paths to process : 74
pagecounts: spark.RDD[String] = MappedRDD[1] at textFile at <console>:12

I get this in the shell:

scala> sc
res0: org.apache.spark.SparkContext = org.apache.spark.SparkContext@3220c28

scala> val pagecounts = sc.textFile("data/pagecounts")
pagecounts: org.apache.spark.rdd.RDD[String] = MapPartitionsRDD[1] at textFile at <console>:21

I'm a newbie user with no context that would tell me whether this is a significant problem.

FrancisToth commented 8 years ago

Is there any reason why this has been closed? I have exactly the same issue.