tumblr / colossus

I/O and Microservice library for Scala
Apache License 2.0
1.14k stars 96 forks source link

Remove unmapped callback #584

Closed benblack86 closed 7 years ago

benblack86 commented 7 years ago

In the name of less code/concepts I thought I'd remove unmapped callback, which is just a mapped callback with the identity function.

I'm guessing the main reason for the unmapped callback is the need for speed? @DanSimon what do you think?

@DanSimon @aliyakamercan @dxuhuang @nsauro

codecov-io commented 7 years ago

Codecov Report

Merging #584 into develop will increase coverage by 0.12%. The diff coverage is 100%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop     #584      +/-   ##
===========================================
+ Coverage    84.94%   85.07%   +0.12%     
===========================================
  Files           99       99              
  Lines         4431     4409      -22     
  Branches       367      364       -3     
===========================================
- Hits          3764     3751      -13     
+ Misses         667      658       -9
Impacted Files Coverage Δ
...sus/src/main/scala/colossus/service/Callback.scala 96.22% <100%> (+7.94%) :arrow_up:
...rc/main/scala/colossus/service/ServiceClient.scala 91.6% <100%> (ø) :arrow_up:
...sus/src/main/scala/colossus/streaming/Source.scala 84.72% <100%> (ø) :arrow_up:
...s/src/main/scala/colossus/metrics/Collection.scala 97.43% <0%> (-2.57%) :arrow_down:
.../scala/colossus/metrics/collectors/Histogram.scala 90.57% <0%> (-0.73%) :arrow_down:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 216f11a...c10328a. Read the comment docs.

DanSimon commented 7 years ago

You're right that this is basically for performance reasons. I made this exact same change a while back and it had a noticeable effect (also led to creating ConstantCallback). Maybe things have changed since then, but we definitely need a benchmark for this.

It would be really nice to get rid of it, since having just one complicated version of Callback would be great. Let's see if the numbers agree.

benblack86 commented 7 years ago

There is no noticeable difference between the versions! However, that is probably because the benchmark only hits the constant callback logic. If this change doesn't affect the TechEmpower result, do we need to worry? Should I collect results just so we know, but we would merge anyway since it cleans up some code?

DanSimon commented 7 years ago

Ah yeah I should have been more specific. You're right that this would not affect the techempower benchmark, but we should create a separate benchmark to specifically test this.

In real life this could have a performance hit when it comes to using clients. ServiceClient always returns an UnmappedCallback, so if we get rid of it we're essentially adding one extra layer of callback execution for every request. It's just some extra lambdas and try/catch, but when in a hot path like this it does become non-trivial.

So ideally I'd like to see 3 benchmarks:

  1. A micro-benchmark that just compares executing an UnmappedCallback vs a MappedCallback with an identity function. Maybe map or flatMap on it as well. Just loop it like 10K times and use System.nanoTime and try to get some consistent numbers.

  2. A more realistic benchmark where we spin up a hello-world service, but instead of using ConstantCallback use mapped and unmapped callbacks, and compare throughput/latency.

  3. Another benchmark where we spin up two services: a hello-world service and a simple proxy service. We'd hit the proxy with requests and see if there's any performance difference with having ServiceClient use mapped or unmapped callbacks.

I can help write some of these since I've done this before with other low-level changes (unfortunately the code I had to do it is lost to the void, but it wasn't much). We should also be sure to test on a variety of hardware.

I would say if the benchmarks are within 1-2% of each other, we can drop UnmappedCallback. Pretty tight margin, but since this is one piece of isolated and well-tested code that changes rarely, I'd rather keep the complexity than lose performance that would potentially affect every service.

benblack86 commented 7 years ago

I setup the first test (https://gist.github.com/anonymous/f8b51d56c9a3458605efeb9749836c81) and I consistently got:

  mapped took 984368341 or 9 each
unmapped took 661457857 or 6 each
constant took 620274827 or 6 each

I'm happy to keep all three versions and just add some comments explaining why three versions exist.