shuffle read 427.2M, shuffle write 3.6G.
This issue occurs when --num-executors and --executor-cores are small.
It would be great if you could share a small sample application to reproduce this. I will also try to reproduce it on my end.
Thank you for your reply.
spark-submit --master yarn --deploy-mode cluster \
  --driver-memory 5g --num-executors 5 --executor-cores 1 --executor-memory 10g \
  --conf spark.jars=hdfs:///tmp/shuffletest/remote-shuffle-service-0.0.9-client.jar \
  --conf spark.executor.extraClassPath=remote-shuffle-service-0.0.9-client.jar \
  --conf spark.shuffle.manager=org.apache.spark.shuffle.RssShuffleManager \
  --conf spark.shuffle.rss.serviceRegistry.type=zookeeper \
  --conf spark.shuffle.rss.serviceRegistry.zookeeper.servers=$host:$port \
  --conf spark.shuffle.rss.dataCenter=dc1 \
  --conf spark.speculation=false \
  --conf spark.shuffle.rss.replicas=1 \
  --class com.example.operator.Test \
  /opt/test/operator-1.0.jar
When a Spark job is submitted in cluster mode and stage information is viewed on the Spark UI, the shuffle read size is smaller than the shuffle write size.
Thank you.
mergeShuffleReadMetrics handling should be added to the read path of RssShuffleReader, so that per-task read metrics are folded into the task's metrics and reported to the UI.
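For illustration, here is a minimal sketch (not the actual RSS patch) of how Spark's built-in shuffle reader wires read metrics: it creates a per-task temporary metrics object via TaskContext, updates it while iterating, and merges it back with mergeShuffleReadMetrics when the iterator completes. These TaskMetrics methods are `private[spark]`, so this assumes the reader lives in the org.apache.spark.shuffle package, as RssShuffleReader does; `fetchRecords` is a hypothetical stand-in for the RSS client's record stream.

```scala
import org.apache.spark.TaskContext
import org.apache.spark.util.CompletionIterator

def read[K, C](context: TaskContext,
               fetchRecords: Iterator[Product2[K, C]]): Iterator[Product2[K, C]] = {
  // Per-task temporary shuffle read metrics object.
  val readMetrics = context.taskMetrics().createTempShuffleReadMetrics()

  // Count records as they flow through the reader.
  val metered = fetchRecords.map { record =>
    readMetrics.incRecordsRead(1)
    record
  }

  // When the iterator is exhausted, fold the temp metrics into the task's
  // ShuffleReadMetrics so the stage page shows a consistent shuffle read size.
  CompletionIterator[Product2[K, C], Iterator[Product2[K, C]]](
    metered, context.taskMetrics().mergeShuffleReadMetrics())
}
```

Without the final merge step, the temporary metrics are discarded when the task finishes, which would explain a stage page that under-reports shuffle read relative to shuffle write.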
After RSS is enabled for Spark, the shuffle read size displayed on the stage page is inconsistent with the shuffle write size. This is on the spark30 branch.