luminousmen / luminousmen.com


https://luminousmen.com/post/spark-partitions #11

Closed utterances-bot closed 3 weeks ago

utterances-bot commented 4 years ago

Spark Partitions - Blog | luminousmen

A partition is the main unit of parallelism in Apache Spark, and this is the main reason why Spark is able to perform tasks on hundreds of machines in a cluster.

https://luminousmen.com/post/spark-partitions

sachinaraballi commented 4 years ago

Nice article. Please also add the outputs of the Python programs, as that would make them easier to understand.

zizhaof commented 4 years ago

Thank you for sharing your experience and knowledge! I learned a lot from your blog posts.

PDzikus commented 3 years ago

Very good article, thank you very much. One thing to possibly correct: I got a bit lost when you started talking about memory sizes. The "30 GB ~ 30000 Mb" part was a bit confusing because of the case of a single letter :) Mb means megabits, which is not the same as MB, megabytes: 1 MB = 8 Mb. So when reading further I assumed that Mb means MB in this article. I still think it would be a good idea to correct it, though.
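To make the unit distinction concrete, here is a minimal sketch of the arithmetic. It assumes decimal units (1 GB = 1000 MB), which matches the "30 GB ~ 30000" figure quoted above; the helper names are hypothetical and do not come from the article.

```python
BITS_PER_BYTE = 8
MB_PER_GB = 1000  # assumption: decimal units, as in "30 GB ~ 30000"


def gigabytes_to_megabytes(gigabytes):
    """Convert gigabytes (GB) to megabytes (MB)."""
    return gigabytes * MB_PER_GB


def megabytes_to_megabits(megabytes):
    """Convert megabytes (MB) to megabits (Mb): 1 MB = 8 Mb."""
    return megabytes * BITS_PER_BYTE


size_mb = gigabytes_to_megabytes(30)
size_mbit = megabytes_to_megabits(size_mb)
print(f"30 GB = {size_mb} MB = {size_mbit} Mb")
```

Under these assumptions, 30 GB is 30000 MB but 240000 Mb, so the two spellings differ by a factor of 8.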

Kiollpt commented 3 years ago

I think the statement "Only one partition is processed by one executor at a time" is incorrect. If the executor has two cores, it can process two partitions at the same time.
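The point about cores can be sketched with a little arithmetic. This is only an illustration, assuming each task processes one partition and that the number of task slots per executor is the executor's core count divided by the CPUs reserved per task (Spark's `spark.executor.cores` and `spark.task.cpus` settings, the latter defaulting to 1). The function names are hypothetical.

```python
import math


def max_concurrent_partitions(num_executors, cores_per_executor, cpus_per_task=1):
    """Upper bound on partitions processed at the same time.

    Assumes one task per partition and one task slot per
    `cpus_per_task` cores on each executor.
    """
    slots_per_executor = cores_per_executor // cpus_per_task
    return num_executors * slots_per_executor


def waves_needed(num_partitions, num_executors, cores_per_executor, cpus_per_task=1):
    """Number of sequential rounds needed to process all partitions."""
    slots = max_concurrent_partitions(num_executors, cores_per_executor, cpus_per_task)
    if slots <= 0:
        raise ValueError("no task slots available with this configuration")
    return math.ceil(num_partitions / slots)


# One executor with two cores can work on two partitions at once,
# so 8 partitions take 4 rounds instead of 8.
print(max_concurrent_partitions(num_executors=1, cores_per_executor=2))
print(waves_needed(num_partitions=8, num_executors=1, cores_per_executor=2))
```

Actual scheduling also depends on data locality and cluster load, so treat the result as an upper bound on parallelism rather than a guarantee.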

yifang0-0 commented 3 years ago
repartitioned = transactions.repartition(8)
print('Number of partitions: {}'.format(repartitidoned.rdd.getNumPartitions()))
print('Partitions structure: {}'.format(repartitioned.rdd.glom().collect()))

This part has a small mistake: the second line should be print('Number of partitions: {}'.format(repartitioned.rdd.getNumPartitions())). The variable name repartitioned is misspelled there as repartitidoned.

sreenivasre commented 3 years ago

Thanks for sharing your knowledge! One small correction: if an executor has 3 cores, then it can process 3 partitions at a time.

IremErturk commented 2 years ago

Thanks for the blog post. I really liked the graphics you used. May I ask which tool you use to create them?