yahoo / Oak

A Scalable Concurrent Key-Value Map for Big Data Analytics
Apache License 2.0
268 stars 48 forks source link

reused identical index computation in common serializer and comparator #148

Closed thiyanesh closed 3 years ago

thiyanesh commented 3 years ago

reused identical index computation in common serializer and comparator in OakIntBufferComparator, OakIntBufferSerializer, and OakStringComparator.

I confirm that this contribution is made under the terms of the license found in the root directory of this repository's source tree and that I have the authority necessary to make this contribution on behalf of its copyright owner.

thiyanesh commented 3 years ago

@sanastas,

please find the separate PR for computation reuse. Few methods in comparator & serializer were recomputing the same offset/index values for the buffers. Currently, i have moved this to single variable and contained the scope within the loop. Do you prefer to

  1. contain scope with multiplication
  2. or elevate scope just outside the loop and perform addition(instead of multiplication)?

Will addition with elevated scope reduce Instruction Level Parallelism due to dependent variable computed across iterations?

sanastas commented 3 years ago

Hi @thiyanesh ,

The code looks good to me. I would prefer second: to elevate scope just outside the loop and perform addition(instead of multiplication).

As to ILP I think it depends on compiler and the underlying hardware, so it would be hard to guess which solution is better performance-wise. You can run Oak benchmarks which use OakStringComparator and OakIntBufferSerializer and probably see the difference between two suggestions. Maybe just "4c-get-copy" benchmark.

Do you need help with running Oak's benchmarks? It is run.sh -b oak -e 4c-get-copy -t "1 4 8 12" where -t is the different numbers of threads you want to try.

sanastas commented 3 years ago

However, if you prefer not to invest in this PR anymore, I am ready to accept the PR as is.

thiyanesh commented 3 years ago

@sanastas, please find the updated PR as per the discussion to use repeated addition in loops.

thiyanesh commented 3 years ago

@sanastas, A small request, if you approve this PR, please "Squash and Merge" to combine this PR into a single commit in master branch.(to suppress unnecessary minor commits in this PR branch) If your teams approach is not to Squash and merge, then i can send another PR with these changes as single commit. Please share about your preferred approach.

sanastas commented 3 years ago

Squashed and merged