uber / RemoteShuffleService

Remote shuffle service for Apache Spark to store shuffle data on remote servers.

Error when using Spark 3.1 with RSS spark-3.0 branch #34

Closed hbpeng0115 closed 3 years ago

hbpeng0115 commented 3 years ago

When I use Spark 3.1 with the RSS spark-3.0 branch, I get the following error. It seems SparkEnv.get.mapOutputTracker.getMapSizesByRange no longer exists in Spark 3.1. Does anyone know how to fix it?


```
java.lang.NoSuchMethodError: org.apache.spark.MapOutputTracker.getMapSizesByRange(IIIII)Lscala/collection/Iterator;
    at org.apache.spark.shuffle.rss.RssUtils$$anon$1.get(RssUtils.scala:93)
    at org.apache.spark.shuffle.rss.RssUtils$$anon$1.get(RssUtils.scala:91)
    at com.uber.rss.util.RetryUtils.retry(RetryUtils.java:118)
    at org.apache.spark.shuffle.rss.RssUtils$.getRssInfoFromMapOutputTracker(RssUtils.scala:91)
    at org.apache.spark.shuffle.rss.BlockDownloaderPartitionRangeRecordIterator.getPartitionRssInfo(BlockDownloaderPartitionRangeRecordIterator.scala:209)
    at org.apache.spark.shuffle.rss.BlockDownloaderPartitionRangeRecordIterator.createBlockDownloaderPartitionRecordIteratorWithoutRetry(BlockDownloaderPartitionRangeRecordIterator.scala:100)
    at org.apache.spark.shuffle.rss.BlockDownloaderPartitionRangeRecordIterator.createBlockDownloaderPartitionRecordIteratorWithRetry(BlockDownloaderPartitionRangeRecordIterator.scala:84)
    at org.apache.spark.shuffle.rss.BlockDownloaderPartitionRangeRecordIterator.<init>(BlockDownloaderPartitionRangeRecordIterator.scala:58)
    at org.apache.spark.shuffle.RssShuffleReader.read(RssShuffleReader.scala:75)
    at org.apache.spark.rdd.ShuffledRDD.compute(ShuffledRDD.scala:106)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
    at org.apache.spark.rdd.CoalescedRDD.$anonfun$compute$1(CoalescedRDD.scala:99)
    at scala.collection.Iterator$$anon$11.nextCur(Iterator.scala:484)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:490)
    at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:458)
    at org.apache.spark.util.Utils$.getIteratorSize(Utils.scala:1866)
    at org.apache.spark.rdd.RDD.$anonfun$count$1(RDD.scala:1253)
    at org.apache.spark.rdd.RDD.$anonfun$count$1$adapted(RDD.scala:1253)
    at org.apache.spark.SparkContext.$anonfun$runJob$5(SparkContext.scala:2242)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
    at org.apache.spark.scheduler.Task.run(Task.scala:131)
    at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:497)
    at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1439)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:500)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
```
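For context on the error above: `MapOutputTracker.getMapSizesByRange` was removed in Spark 3.1, where its map-index range arguments were folded into `getMapSizesByExecutorId`, so a jar compiled against Spark 3.0 fails at runtime with `NoSuchMethodError`. One way a shuffle plugin can stay binary-compatible across both versions is to resolve the method reflectively. Below is a minimal, self-contained sketch of that idea; `Tracker31` is a hypothetical stand-in mimicking the Spark 3.1 signature, not the real Spark class:

```scala
// Sketch of a version-agnostic shim: look up whichever method name exists at
// runtime instead of linking against a fixed signature at compile time.
object GetMapSizesShim {
  // Hypothetical stand-in for the Spark 3.1 tracker, where getMapSizesByRange
  // is gone and getMapSizesByExecutorId carries the map-index range arguments.
  class Tracker31 {
    def getMapSizesByExecutorId(shuffleId: Int, startMapIndex: Int, endMapIndex: Int,
                                startPartition: Int, endPartition: Int): String =
      s"byExecutorId($shuffleId,$startMapIndex,$endMapIndex,$startPartition,$endPartition)"
  }

  // Try the Spark 3.0 method name first; fall back to the Spark 3.1 name.
  // Both take five Int parameters, so the argument array is shared.
  def callGetMapSizes(tracker: AnyRef, shuffleId: Int, startMapIndex: Int,
                      endMapIndex: Int, startPartition: Int, endPartition: Int): String = {
    val intTypes = Array.fill[Class[_]](5)(classOf[Int])
    val args = Array[AnyRef](Int.box(shuffleId), Int.box(startMapIndex),
      Int.box(endMapIndex), Int.box(startPartition), Int.box(endPartition))
    val method =
      try tracker.getClass.getMethod("getMapSizesByRange", intTypes: _*)      // Spark 3.0
      catch {
        case _: NoSuchMethodException =>
          tracker.getClass.getMethod("getMapSizesByExecutorId", intTypes: _*) // Spark 3.1
      }
    method.invoke(tracker, args: _*).asInstanceOf[String]
  }
}
```

Reflection avoids the hard link to either signature, at the cost of a lookup per call; caching the resolved `Method` would remove that overhead. Maintaining separate branches per Spark version (as this repo does) avoids the problem at build time instead.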
hiboyang commented 3 years ago

@HuiboPeng It seems Spark 3.1 has some changes causing this. You could try compiling RSS against Spark 3.1 on your side and see whether you can fix the compile errors.

mayurdb commented 3 years ago

Hi @HuiboPeng, there is a change in the shuffle interface again in Spark 3.1. I have pushed the changes to the spark31 branch: https://github.com/uber/RemoteShuffleService/tree/spark31

Can you please check if this solves the issue?