ThinkBigAnalytics / pyspark-distributed-kmodes

MIT License
25 stars 23 forks source link

What is sc? #8

Open Chao-Tang opened 4 years ago

Chao-Tang commented 4 years ago

As the title suggests, the example contains the following code:

rdd = sc.parallelize(data)
rdd = rdd.coalesce(max_partitions)

What is sc? I can't see any import or assignment before it is used.

musaunal commented 4 years ago

sc is the SparkContext.

I also can't see it defined anywhere in the source code of pyspark_kmodes; it should be created like this:

from pyspark import SparkContext
sc = SparkContext("local", "Name of your app")