rhavyn / norbert

Norbert is a cluster manager and networking layer built on top of Zookeeper.
Apache License 2.0
390 stars 125 forks source link

issue 8: reduce memory footprint of fnv hash func #9

Open brentdmiller opened 12 years ago

brentdmiller commented 12 years ago

Although creating immutable objects is frequently better than creating mutable ones, creating immutable objects frequently is not as virtuous.

brentdmiller commented 12 years ago

I tested the change with varying length input arrays of Bytes, including length 0, 1, 2, 4, 1000, and 10K different arrays of len 10K. The new func generated same value as old func in all cases.

I did a super-cheap performance test processing 2KB inputs, and it yielded the following timings to hash the same data with the old & new funcs. 4 runs all showed similar results.

TIME FOR OLD FUNC: real 0m45.628s user 0m54.195s sys 0m0.435s

TIME FOR NEW FUNC: real 0m6.631s user 0m14.914s sys 0m0.270s

Here's the command I used: time scala < hashtest.scala.new > new_fnv.out; time scala < hashtest.scala > old_fnv.out

And here is what the hashtest.scala scripts look like: def old_fnv[T <% Array[Byte]](bytes: T): Int = { val FNV_BASIS = 0x811c9dc5 val FNV_PRIME = (1 << 24) + 0x193

def fnv(key: Array[Byte], hash: Long): Int = {
  if (key.length == 0) hash.toInt
  else fnv(key.drop(1), (hash ^ (0xFF & key.head)) * FNV_PRIME)
}

fnv(bytes, FNV_BASIS)

}

def new_fnv[T <% Array[Byte]](bytes: T): Int = { val FNV_BASIS = 0x811c9dc5 val FNV_PRIME = (1 << 24) + 0x193

var hash: Long = FNV_BASIS
var i: Int = 0
var maxIdx: Int = bytes.length

while (i < maxIdx) {
  hash = (hash ^ (0xFF & bytes(i))) * FNV_PRIME
  i += 1
}
hash.toInt

}

val input0:Array[Byte] = Array() val input1:Array[Byte] = Array(4) val input2:Array[Byte] = Array(99, -99) val input4:Array[Byte] = Array(1, 5, 126, 100)

val input1000:Array[Byte] = new Array(1000) for (i <- 0 to 999) { input1000(i) = (3*i).toByte }

val input2000:Array[Byte] = new Array(2000) for (i <- 0 to 1999) { input2000(i) = new_fnv(input2000).toByte if (input2000(i) == 0) input2000(i) = 1.toByte } input2000

new_fnv(input0) new_fnv(input1) new_fnv(input2) new_fnv(input4) for (i <- 0 to 2000) { new_fnv(input2000) } new_fnv(input2000)