Open ttim opened 7 years ago
Merging #740 into develop will decrease coverage by
2.1%
. The diff coverage is0%
.
@@ Coverage Diff @@
## develop #740 +/- ##
==========================================
- Coverage 72.21% 70.1% -2.11%
==========================================
Files 153 158 +5
Lines 3736 3824 +88
Branches 204 205 +1
==========================================
- Hits 2698 2681 -17
- Misses 1038 1143 +105
Impacted Files | Coverage Δ | |
---|---|---|
...d/online/executor/ComposedFlatMapPerformance.scala | 0% <0%> (ø) |
|
...er/summingbird/online/executor/SimpleFlatMap.scala | 0% <0%> (ø) |
|
.../executor/ComposedFlatMapWithTimePerformance.scala | 0% <0%> (ø) |
|
...witter/summingbird/online/executor/TestUtils.scala | 0% <0%> (ø) |
|
...d/online/executor/IdentityFlatMapPerformance.scala | 0% <0%> (ø) |
|
...ain/scala/com/twitter/summingbird/TestGraphs.scala | 77.7% <0%> (-10.83%) |
:arrow_down: |
...ala/com/twitter/summingbird/scalding/Service.scala | 73.84% <0%> (-4.62%) |
:arrow_down: |
... and 4 more |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact)
,ø = not affected
,? = missing data
Powered by Codecov. Last update fecdfa3...f3e4845. Read the comment docs.
I'm fine with merging this, but should we add a test that the SimpleFlatMap is identical in performance to the IntermediateFlatMap? Someone might use it as is, and currently it is untested.
Is this a candidate for running in our actual planner, or just a limit to shoot for?
@johnynek According to tests SimpleFlatMap
is much more performant than IntermediateFlatMap
=)
Anyway I moved it to perf module for now (and made it private). Let's see later do we want to use it instead of IntermediateFlatMap
in special cases.
Sorry, I meant identical in function, I realize it is much faster.
@johnynek yeah =) issue is - it's not identical because our FlatMapOperation
able to handle leftJoin
in a same way as flatMap
s, so it covers only simple cases.
I guess it's safe to merge, because it's private from external users for now.
@ttim we are starting to really push on summingbird performance.
Any more work here? Would it make sense to have a video sync sometime soon as we are currently investing in summingbird at Stripe (but also considering if that makes sense or to shift investment somewhat).
If we can share effort, that is a big win. If we each are just going to work privately on our forks maybe each should make their own call.
Thoughts @pankajroark ?
@johnynek Great to hear about Summingbird performance push. Happy to set up a video sync. As you've seen we've made some optimizations and we wanted to make more. Unfortunately we have to focus resources on other projects for the time being, but we plan to come back and make more optimizations. It will be great to learn about what you've planned. Let me sync up with you offline to set up a time for video sync.
@pankajroark sounds good. I replied offline about a meeting time.
This PR adds tests for
flatMap
/flatMap
composition in context ofOperationContainer
. Tests are implemented using JMH.Tests are not ideal, but they give us some estimation of costs our abstractions introduce.
Most interesting overheads:
IntermediateFlatMap
withFlatMapOperation
introduces about 1s per 1KK operation overhead.FlatMapOperation
composition adds another ~1-2s per 1KK.