fastify / fastify-etag

Automatically generate etags for HTTP responses, for Fastify
MIT License
73 stars 14 forks source link

md5 is faster than fnv1a #91

Open Uzlopak opened 1 year ago

Uzlopak commented 1 year ago

Prerequisites

Issue

We use by default fnv1a for weak hashing. But md5 is faster than fnv1a

Here is a benchmark with an impememtation of the Java hashCode algorithm and crc32

'use strict'

const crc32 = require('crc-32').buf
const crypto = require('crypto')
const benchmark = require('benchmark')

const LoremString = 'Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Aenean commodo ligula eget dolor. Aenean massa. Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus. Donec quam felis, ultricies nec, pellentesque eu, pretium quis, sem. Nulla consequat massa quis enim. Donec pede justo, fringilla vel, aliquet nec, vulputate eget, arcu. In enim justo, rhoncus ut, imperdiet a, venenatis vitae, justo. Nullam dictum felis eu pede mollis pretium. Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Aenean commodo ligula eget dolor. Aenean massa. Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus. Donec quam felis, ultricies nec, pellentesque eu, pretium quis, sem. Nulla consequat massa quis enim. Donec pede justo, fringilla vel, aliquet nec, vulputate eget, arcu. In enim justo, rhoncus ut, imperdiet a, venenatis vitae, justo. Nullam dictum felis eu pede mollis pretium. Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Aenean commodo ligula eget dolor. Aenean massa. Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus. Donec quam felis, ultricies nec, pellentesque eu, pretium quis, sem. Nulla consequat massa quis enim. Donec pede justo, fringilla vel, aliquet nec, vulputate eget, arcu. In enim justo, rhoncus ut, imperdiet a, venenatis vitae, justo. Nullam dictum felis eu pede mollis pretium.'
const LoremBuffer = Buffer.from(LoremString)

function hashCode (input) {
  let hash = 0
  let i = 0
  const il = input.length
  for (i = 0; i < il; ++i) {
    hash = (((hash << 5) - hash) + input[i]) & 0xFFFFFFFF
  }

  return hash >>> 0
}

const OFFSET_BASIS_32 = 2166136261

function fnv1a (buf) {
  let hash = OFFSET_BASIS_32

  for (let i = 0; i < buf.length;) {
    hash ^= buf[i++]

    // 32-bit FNV prime: 2**24 + 2**8 + 0x93 = 16777619
    // Using bitshift for accuracy and performance. Numbers in JS suck.
    hash += (hash << 1) + (hash << 4) + (hash << 7) + (hash << 8) + (hash << 24)
  }

  return hash >>> 0
}

new benchmark.Suite()
  .add('crc32', function () {
    crc32(LoremBuffer) >>> 0
  }, { minSamples: 100 })
  .add('md5', function () {
    crypto.createHash('md5').update(LoremBuffer).digest().toString('base64')
  }, { minSamples: 100 })
  .add('hashCode', function () {
    hashCode(LoremBuffer)
  }, { minSamples: 100 })
  .add('fnv1a', function () {
   fnv1a(LoremBuffer)
  }, { minSamples: 100 })
  .on('cycle', function onCycle(event) { console.log(String(event.target)) })
  .run()

result:

node benchmarks/weakHash.js

crc32 x 1,314,889 ops/sec ±0.19% (195 runs sampled) md5 x 293,537 ops/sec ±2.10% (175 runs sampled) hashCode x 1,019,507 ops/sec ±0.40% (190 runs sampled) fnv1a x 154,034 ops/sec ±0.11% (195 runs sampled)

jsumners commented 1 year ago

See https://stackoverflow.com/a/44171095

Uzlopak commented 1 year ago

Well md5 has a 128bit digest and fnv1a has a 32 bit digest. Thats why I looked for crc32 and hashCode as they are also have 32 bit digests.

Here with the in SO-Thread mentioned murmurhash

btw.: renamed hashCode to djb2

murmurhash x 766,803 ops/sec ±0.41% (194 runs sampled) crc32 x 1,310,009 ops/sec ±0.29% (195 runs sampled) md5 x 286,639 ops/sec ±2.97% (178 runs sampled) djb2 x 1,033,964 ops/sec ±0.24% (194 runs sampled) fnv1a x 152,752 ops/sec ±0.05% (194 runs sampled)

'use strict'

const crc32 = require('crc-32').buf
const crypto = require('crypto')
const benchmark = require('benchmark')

const LoremString = 'Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Aenean commodo ligula eget dolor. Aenean massa. Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus. Donec quam felis, ultricies nec, pellentesque eu, pretium quis, sem. Nulla consequat massa quis enim. Donec pede justo, fringilla vel, aliquet nec, vulputate eget, arcu. In enim justo, rhoncus ut, imperdiet a, venenatis vitae, justo. Nullam dictum felis eu pede mollis pretium. Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Aenean commodo ligula eget dolor. Aenean massa. Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus. Donec quam felis, ultricies nec, pellentesque eu, pretium quis, sem. Nulla consequat massa quis enim. Donec pede justo, fringilla vel, aliquet nec, vulputate eget, arcu. In enim justo, rhoncus ut, imperdiet a, venenatis vitae, justo. Nullam dictum felis eu pede mollis pretium. Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Aenean commodo ligula eget dolor. Aenean massa. Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus. Donec quam felis, ultricies nec, pellentesque eu, pretium quis, sem. Nulla consequat massa quis enim. Donec pede justo, fringilla vel, aliquet nec, vulputate eget, arcu. In enim justo, rhoncus ut, imperdiet a, venenatis vitae, justo. Nullam dictum felis eu pede mollis pretium.'
const LoremBuffer = Buffer.from(LoremString)

function djb2 (input) {
  let hash = 0
  let i = 0
  const il = input.length
  for (i = 0; i < il; ++i) {
    hash = (((hash << 5) - hash) + input[i]) & 0xFFFFFFFF
  }

  return hash >>> 0
}

const OFFSET_BASIS_32 = 2166136261

function fnv1a (buf) {
  let hash = OFFSET_BASIS_32

  for (let i = 0; i < buf.length;) {
    hash ^= buf[i++]

    // 32-bit FNV prime: 2**24 + 2**8 + 0x93 = 16777619
    // Using bitshift for accuracy and performance. Numbers in JS suck.
    hash += (hash << 1) + (hash << 4) + (hash << 7) + (hash << 8) + (hash << 24)
  }

  return hash >>> 0
}

function murmurhash2_32_gc(buf, seed) {
  var
    l = buf.length,
    h = seed ^ l,
    i = 0,
    k;

  while (l >= 4) {
    k = 
      ((buf[i] & 0xff)) |
      ((buf[++i] & 0xff) << 8) |
      ((buf[++i] & 0xff) << 16) |
      ((buf[++i] & 0xff) << 24);

    k = (((k & 0xffff) * 0x5bd1e995) + ((((k >>> 16) * 0x5bd1e995) & 0xffff) << 16));
    k ^= k >>> 24;
    k = (((k & 0xffff) * 0x5bd1e995) + ((((k >>> 16) * 0x5bd1e995) & 0xffff) << 16));

    h = (((h & 0xffff) * 0x5bd1e995) + ((((h >>> 16) * 0x5bd1e995) & 0xffff) << 16)) ^ k;

    l -= 4;
    ++i;
  }

  switch (l) {
  case 3: h ^= (buf[i + 2] & 0xff) << 16;
  case 2: h ^= (buf[i + 1] & 0xff) << 8;
  case 1: h ^= (buf[i] & 0xff);
          h = (((h & 0xffff) * 0x5bd1e995) + ((((h >>> 16) * 0x5bd1e995) & 0xffff) << 16));
  }

  h ^= h >>> 13;
  h = (((h & 0xffff) * 0x5bd1e995) + ((((h >>> 16) * 0x5bd1e995) & 0xffff) << 16));
  h ^= h >>> 15;

  return h >>> 0;
}

new benchmark.Suite()
  .add('murmurhash', function () {
    murmurhash2_32_gc(LoremBuffer)
  }, { minSamples: 100 })
  .add('crc32', function () {
    crc32(LoremBuffer) >>> 0
  }, { minSamples: 100 })
  .add('md5', function () {
    crypto.createHash('md5').update(LoremBuffer).digest().toString('base64')
  }, { minSamples: 100 })
  .add('djb2', function () {
    djb2(LoremBuffer)
  }, { minSamples: 100 })
  .add('fnv1a', function () {
   fnv1a(LoremBuffer)
  }, { minSamples: 100 })
  .on('cycle', function onCycle(event) { console.log(String(event.target)) })
  .run()
climba03003 commented 1 year ago

If a new algorithm is needed because of speed. I will go for Murmur2. It has better randomness, fair collision problem and faster speed.

Uzlopak commented 1 year ago

Do you have a good performant implemention of murmur2 you recommend?

mcollina commented 1 year ago

The 32bit digest is great for this case. Adding bytes to the response will slow down transfers long term.

climba03003 commented 1 year ago

Do you have a good performant implemention of murmur2 you recommend?

No, I vote on the algorithm. But I never use it in my application.

The comparison is pretty solid and I believe it is a good choice

Refs #3

Uzlopak commented 1 year ago

Well my recommendation is not to set md5 as default hashing algorithm. It is just, that it seems odd, that the Weak entity tag is 50% slower than using md5. Even hashing it through md5 and then just taking 4 bytes should be faster than fnv1a. So I assume that at the time fastify-etag was developed md5/sha1 was slower in nodejs than doing the fnv1a hashing. When i now benchmark it, it seems that it is the total opposite.

So I am actualy suggesting to replace fnv1a with a faster weak hashing algorithm. Currently the crc32 implementation seems to be the fastest on my machine. According to the comparison of @climba03003 murmur can be faster. So lets see if there is somewhere already a faster implementation than the one i found

climba03003 commented 1 year ago

Here is benchmark using the benchmarks/run.js. I believe the result of comparing plan hashing method is contradicting because the payload is not compared in small to large.

You can see the in 32b / 2k, murmur2 and fnv1a is nearly the same. In 2M, murmur2 start out perform fnv1a and near to md5.

32 bytes
--------

  string
  ------
   1. No etag             21279.20 req/sec
   2. murmur2             20739.24 req/sec       -2.538%
   3. fnv1a               20208.20 req/sec       -5.033%
   4. md5                 16854.41 req/sec      -20.794%

  buffer
  ------
   1. No etag             20950.29 req/sec
   2. murmur2             19791.81 req/sec       -5.530%
   3. fnv1a               19394.60 req/sec       -7.426%
   4. md5                 16271.00 req/sec      -22.335%

2 kb
----

  string
  ------
   1. No etag             18067.00 req/sec
   2. murmur2             15429.60 req/sec      -14.598%
   3. fnv1a               14124.00 req/sec      -21.824%
   4. md5                 13760.00 req/sec      -23.839%

  buffer
  ------
   1. No etag             20701.41 req/sec
   2. murmur2             17877.80 req/sec      -13.640%
   3. md5                 15624.80 req/sec      -24.523%
   4. fnv1a               10357.71 req/sec      -49.966%

2 Mb
----

  string
  ------
   1. No etag                73.28 req/sec
   2. md5                    49.95 req/sec      -31.837%
   3. murmur2                48.15 req/sec      -34.293%
   4. fnv1a                  36.65 req/sec      -49.986%

  buffer
  ------
   1. No etag                74.00 req/sec
   2. md5                    70.11 req/sec       -5.257%
   3. murmur2                69.74 req/sec       -5.757%
   4. fnv1a                  51.16 req/sec      -30.865%
mcollina commented 1 year ago

When are we sending JSON data of 2MBs? It does not seem a usual case.

Anyway, I'm happy to add murmur2 to the options (why not murmur3? https://en.m.wikipedia.org/wiki/MurmurHash)

Uzlopak commented 1 year ago

I have a local branch with integrated murmur2 and jbl2. Crc32 is loaded from the package crc-32. Do you want it?

mcollina commented 1 year ago

I prefer not to take an additional dependency. How about adding an option to specify a function to compute this hash?

kurtextrem commented 1 year ago

I benchmarked on node v18.13.0, M1 Pro:


32 bytes
--------

  string
  ------
   1. crc32               55889.80 req/sec
   2. murmur3             54688.80 req/sec       -2.149%
   3. fnv1a               54016.80 req/sec       -3.351%
   4. murmur2             53529.00 req/sec       -4.224%
   5. djb2                50503.60 req/sec       -9.637%
   6. sha1                48489.40 req/sec      -13.241%
   7. md5                 47425.40 req/sec      -15.145%

  buffer
  ------
   1. crc32               55871.00 req/sec
   2. murmur2             55805.60 req/sec       -0.117%
   3. djb2                55018.20 req/sec       -1.526%
   4. MurmurHash3         54400.20 req/sec       -2.632%
   5. murmur3             54174.60 req/sec       -3.036%
   6. fnv1a               53179.20 req/sec       -4.818%
   7. md5                 48790.80 req/sec      -12.672%
   8. sha1                48017.20 req/sec      -14.057%
   9. murmur3wasm         20159.20 req/sec      -63.918%

2 kb
----

  string
  ------
   1. sha1                34247.81 req/sec
   2. crc32               32182.60 req/sec       -6.030%
   3. murmur2             31616.00 req/sec       -7.685%
   4. murmur3             30362.74 req/sec      -11.344%
   5. md5                 29260.00 req/sec      -14.564%
   6. fnv1a               23937.20 req/sec      -30.106%
   7. djb2                 8023.52 req/sec      -76.572%

  buffer
  ------
   1. crc32               49125.40 req/sec
   2. djb2                48597.60 req/sec       -1.074%
   3. sha1                46698.20 req/sec       -4.941%
   4. murmur3             44641.00 req/sec       -9.128%
   5. murmur2             44610.20 req/sec       -9.191%
   6. md5                 39599.60 req/sec      -19.391%
   7. MurmurHash3         21560.00 req/sec      -56.112%
   8. murmur3wasm         19975.91 req/sec      -59.337%
   9. fnv1a               19789.86 req/sec      -59.716%

2 Mb
----

  string
  ------
   1. sha1                   98.54 req/sec
   2. md5                    69.89 req/sec      -29.074%
   3. murmur2                64.95 req/sec      -34.088%
   4. crc32                  63.44 req/sec      -35.620%
   5. murmur3                47.88 req/sec      -51.411%
   6. fnv1a                  36.80 req/sec      -62.655%
   7. djb2                    0.03 req/sec      -99.970% (396 errors - 396 timeouts)

  buffer
  ------
   1. murmur3wasm         20414.20 req/sec
   2. djb2                  116.00 req/sec      -99.432%
   3. sha1                  111.75 req/sec      -99.453%
   4. murmur3               110.26 req/sec      -99.460%
   5. murmur2                87.45 req/sec      -99.572%
   6. MurmurHash3            86.62 req/sec      -99.576%
   7. crc32                  85.84 req/sec      -99.580%
   8. md5                    81.49 req/sec      -99.601%
   9. fnv1a                  33.44 req/sec      -99.836%
Uzlopak commented 1 year ago

@kurtextrem Any recommendations?

kurtextrem commented 1 year ago

I'd go for:

Also, since fnv1a is very slow, I'd say we should remove it. Since sha1 isn't drastically slower (apart from the 2 MB benchmark) in Buffer scenarios, I think setting the default to sha1 is actually the most sensible action to take, as we don't need additional dependencies.

Uzlopak commented 1 year ago

@climba03003 @mcollina

Opinions?

climba03003 commented 1 year ago

It would be either one of them. Not conditionally choose based on type and size.

I still see murmur2 is a good choice. It is not the best, but it sit in the top in most case.

kurtextrem commented 1 year ago

Since murmur2 would introduce another dependency, I still see sha1 as the best option as sensible default (for real-world usage) as it's dependency free (available in the crypto module).

mcollina commented 1 year ago

I don't know how you are running those benchmarks, can we get a PR to verify/check the results?

These results make sense: for small payloads running a pure-JS implementation outperforms all the native ones. For large payloads, running a native implementation outperforms the pure JS ones.

My take is we should optimize for the 2KB - 256KB range, which usually is the outpuf of bundlers, looking at the results above, using sha1 seems the best option.

kurtextrem commented 1 year ago

@mcollina I can't make a PR because, 1) I fucked up the source code by having the prettier config from my company running 2) I did changes to the index in order to be able to benchmark the other algo's. However you can verify the results by running it from my fork: https://github.com/kurtextrem/fastify-etag. Here's the map for the algo's in my fork: https://github.com/kurtextrem/fastify-etag/blob/master/index.mjs#L37 I didn't change sha1, md5 and fnv1a though, so the benchmark from this repo here should return the same results.

But -- I'd definitely be happy to submit a correct PR for changing the default.

mcollina commented 1 year ago

Here are the result from my benchmark machine:

32 bytes
--------

  string
  ------
   1. No etag             26702.25 req/sec
   2. fnv1a               23425.76 req/sec      -12.270%
   3. sha1                19780.00 req/sec      -25.924%
   4. sha256              19779.81 req/sec      -25.925%
   5. sha224              19723.61 req/sec      -26.135%
   6. sha384              19657.08 req/sec      -26.384%
   7. sha512              19603.50 req/sec      -26.585%
   8. md5                 19555.71 req/sec      -26.764%

  buffer
  ------
   1. No etag             26291.71 req/sec
   2. fnv1a               23044.49 req/sec      -12.351%
   3. md5                 19804.98 req/sec      -24.672%
   4. sha384              19668.70 req/sec      -25.190%
   5. sha1                19615.71 req/sec      -25.392%
   6. sha224              19453.37 req/sec      -26.009%
   7. sha512              19308.00 req/sec      -26.562%
   8. sha256              19260.79 req/sec      -26.742%

2 kb
----

  string
  ------
   1. No etag             20744.98 req/sec
   2. fnv1a               16212.30 req/sec      -21.850%
   3. sha1                15991.81 req/sec      -22.912%
   4. md5                 15621.18 req/sec      -24.699%
   5. sha384              15367.13 req/sec      -25.924%
   6. sha512              15207.32 req/sec      -26.694%
   7. sha224              15172.40 req/sec      -26.862%
   8. sha256              14864.30 req/sec      -28.347%

  buffer
  ------
   1. No etag             25879.32 req/sec
   2. sha1                19172.69 req/sec      -25.915%
   3. sha384              18763.42 req/sec      -27.496%
   4. md5                 18624.88 req/sec      -28.032%
   5. sha512              18373.57 req/sec      -29.003%
   6. sha256              17935.32 req/sec      -30.696%
   7. sha224              17868.40 req/sec      -30.955%
   8. fnv1a               11239.37 req/sec      -56.570%

2 Mb
----

  string
  ------
   1. No etag               113.34 req/sec
   2. sha1                   57.50 req/sec      -49.268%
   3. md5                    53.90 req/sec      -52.444%
   4. sha512                 53.45 req/sec      -52.841%
   5. sha384                 53.20 req/sec      -53.062%
   6. sha224                 48.95 req/sec      -56.811%
   7. sha256                 48.40 req/sec      -57.297%
   8. fnv1a                  41.35 req/sec      -63.517%

  buffer
  ------
   1. md5                   135.96 req/sec
   2. sha1                  134.31 req/sec       -1.214%
   3. sha512                133.74 req/sec       -1.633%
   4. No etag               131.44 req/sec       -3.325%
   5. sha224                129.67 req/sec       -4.626%
   6. sha384                127.75 req/sec       -6.039%
   7. fnv1a                  58.74 req/sec      -56.796%

In Node v18 it seels that md5 or sha1 offers good results.

kurtextrem commented 1 year ago
2 kb
----

  string
  ------
   1. sha1                32769.81 req/sec
   2. md5                 28926.80 req/sec      -11.727%
   3. fnv1a               23341.60 req/sec      -28.771%

  buffer
  ------
   1. sha1                46000.80 req/sec
   2. md5                 38303.40 req/sec      -16.733%
   3. fnv1a               19560.80 req/sec      -57.477%

256 kb
----

  string
  ------
   1. sha1                  677.98 req/sec
   2. md5                   501.80 req/sec      -25.986%
   3. fnv1a                 291.78 req/sec      -56.963%

  buffer
  ------
   1. sha1                 1139.04 req/sec
   2. md5                  1133.30 req/sec       -0.504%
   3. fnv1a                 244.78 req/sec      -78.510%

I benchmarked 2KB and new 256KB on Node 18.13 on the same M1 Pro. I've just realized two things however: 1) I'm not sure if my M1 Pro reflects the real-world usage, no server runs Apple silicon 2) Node 18.13 added a fastpath for UTF-8 encoding, which might impact this benchmark on older nodejs versions

kurtextrem commented 1 year ago

Our benchmarks align with this benchmark: https://medium.com/@chris_72272/what-is-the-fastest-node-js-hashing-algorithm-c15c1a0e164e & https://blog.shevlyagin.com/2021/10/28/fastest-node-js-hashing-algorithm-for-large-strings/.

I took the benchmark to my home PC:

2 kb
----

  string
  ------
   1. sha1                21075.20 req/sec
   2. fnv1a               20305.20 req/sec       -3.654%
   3. md5                 19845.76 req/sec       -5.834%

  buffer
  ------
   1. sha1                24938.35 req/sec
   2. md5                 24402.54 req/sec       -2.149%
   3. fnv1a               21871.50 req/sec      -12.298%

256 kb
------

  string
  ------
   1. sha1                  589.40 req/sec
   2. md5                   505.48 req/sec      -14.238%
   3. fnv1a                 406.55 req/sec      -31.023%

  buffer
  ------
   1. md5                  5134.00 req/sec
   2. sha1                 4855.23 req/sec       -5.430%
   3. fnv1a                 527.85 req/sec      -89.719%

So I'd say sha1 is the overall winner (unless you use a bigger buffer).

Uzlopak commented 1 year ago

Just want to remark, that sha1 should always result in about 52 chars etag (160 bits = 40 bytes + 1/3 base64 overhead).

kurtextrem commented 1 year ago

I see 28 chars (withouit W/ and "): W/"3femZI5zoOdXITuJUkW25TZi0NM="

Uzlopak commented 1 year ago

Hmm. Well i have no objection tbh.

jimmiehansson commented 5 months ago

Higher performance at the cost of higher collision rate

Uzlopak commented 5 months ago

@jimmiehansson So? It is not like we want to hash passwords. If we have a collision in etag it just means that the server will send one more time the requested data

jimmiehansson commented 5 months ago

@Uzlopak You don't need any password for cache deception or smuggling. Either way, there are a number of well known surfaces of attacks prone against hash algorithms with higher collision rates, e.g. https://en.wikipedia.org/wiki/Preimage_attack Besides, who here can anticipate what the contents of the cache is used for beforehand?

kurtextrem commented 5 months ago

What is the attack vector? by guessing an etag, you don't get back the body, and crafting a collision also doesn't tell you anything else that might be security related

Uzlopak commented 5 months ago

Exactly: It is not like etag is used to request a cached request. The contrary, if etag matches you dont get more info. If the etag does not match you get the response. So not sending the etag at all, you get information.

There is no attack vector.

jimmiehansson commented 5 months ago

I do see your points, but I wasn't so much referring to the contents being exposed as sensitive, but the possibility of exploitation to the cache. But I'm sure it will be fine either way MD5, if the performance of the etag entropy makes the bigger difference.

jimmiehansson commented 5 months ago

What is the attack vector? by guessing an etag, you don't get back the body, and crafting a collision also doesn't tell you anything else that might be security related

Mainly the abuse of the request handlers state at the time of the cache-hit/miss by claiming higher CPU as part of a flood. Vector or not, its perfectly doable

jimmiehansson commented 5 months ago

Exactly: It is not like etag is used to request a cached request. The contrary, if etag matches you dont get more info. If the etag does not match you get the response. So not sending the etag at all, you get information.

There is no attack vector.

Ok, sounds good with me!

johaven commented 2 months ago

I'm curious, have you tried with the hash function since this patch: https://github.com/nodejs/node/blob/main/doc/changelogs/CHANGELOG_V20.md#crypto-implement-cryptohash?