Closed by jealous 5 years ago
Merging #25 into master will decrease coverage by 0.64%. The diff coverage is 51.72%.
@@ Coverage Diff @@
## master #25 +/- ##
============================================
- Coverage 76.07% 75.42% -0.65%
- Complexity 430 436 +6
============================================
Files 30 30
Lines 1981 2035 +54
Branches 325 332 +7
============================================
+ Hits 1507 1535 +28
- Misses 256 274 +18
- Partials 218 226 +8
Impacted Files | Coverage Δ | Complexity Δ | |
---|---|---|---|
...e/spark/shuffle/SplashShuffleFetcherIterator.scala | 47.61% <40%> (-10.72%) | 8 <4> (+2) | |
...org/apache/spark/shuffle/SplashShuffleReader.scala | 77.58% <45.45%> (-8.13%) | 8 <1> (+1) | |
...che/spark/shuffle/SplashShuffleBlockResolver.scala | 77.38% <62.96%> (-2.77%) | 35 <2> (+2) | |
.../apache/spark/shuffle/local/LocalShuffleFile.scala | 48.57% <0%> (+5.71%) | 14% <0%> (+1%) | :arrow_up: |
Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 381697a...d62a5cb. Read the comment docs.
Yes, I just updated the code so that the user can configure the folder where the dump files are written.
Dump the current partition to a temporary local folder when a shuffle read error happens.
This helps the developer diagnose problems such as data corruption.
- Make `dump` a utility function in the resolver and dump the partition whenever an exception is caught in `SplashShuffleFetcherIterator`. Dump files are named like `shuffle_0_1_2.dump`.
- Add a `dump` call in `SplashShuffleReader` to dump the partition if an error happens while inserting records.

This closes GH-24.
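
For reviewers who want a picture of the dump-on-error flow described above, here is a minimal sketch. It is not the code in this PR: `PartitionDump`, `dumpFileName`, `readOrDump`, and all of their parameters are hypothetical names, and it assumes the three numbers in `shuffle_0_1_2.dump` are the shuffle, map, and reduce ids, following Spark's shuffle block naming. The dump folder is passed in as a parameter to reflect that it is configurable.

```scala
import java.io.{ByteArrayInputStream, InputStream}
import java.nio.file.{Files, Path, StandardCopyOption}
import scala.util.control.NonFatal

// Hypothetical helper object; the resolver in this PR may be structured differently.
object PartitionDump {
  // Assumed naming scheme: shuffle_<shuffleId>_<mapId>_<reduceId>.dump
  def dumpFileName(shuffleId: Int, mapId: Int, reduceId: Int): String =
    s"shuffle_${shuffleId}_${mapId}_${reduceId}.dump"

  // Copies the raw partition bytes into dumpDir and returns the written path.
  def dump(dumpDir: Path, shuffleId: Int, mapId: Int, reduceId: Int,
           data: InputStream): Path = {
    Files.createDirectories(dumpDir)
    val target = dumpDir.resolve(dumpFileName(shuffleId, mapId, reduceId))
    try Files.copy(data, target, StandardCopyOption.REPLACE_EXISTING)
    finally data.close()
    target
  }

  // Runs `read` on the partition stream. On a non-fatal failure, reopens the
  // partition, dumps it, and rethrows the original error unchanged.
  def readOrDump[T](dumpDir: Path, shuffleId: Int, mapId: Int, reduceId: Int,
                    open: () => InputStream)(read: InputStream => T): T = {
    val in = open()
    try read(in)
    catch {
      case NonFatal(e) =>
        // A failed dump must not hide the read error, so attach it as suppressed.
        try dump(dumpDir, shuffleId, mapId, reduceId, open())
        catch { case NonFatal(dumpError) => e.addSuppressed(dumpError) }
        throw e
    }
    finally in.close()
  }
}
```

In this sketch the original exception is always rethrown, so the dump only adds diagnostic data and does not change how the task fails.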