multiformats / js-multihashing

Use all the functions in multihash.
MIT License
11 stars 18 forks source link

feat: webcrypto and sha3 #12

Closed dignifiedquire closed 7 years ago

dignifiedquire commented 7 years ago

In Node.js the underlying hashing functions have not changed, but the browser now uses webcrypto instead of JavaScript based methods for SHA1, SHA2-256 and SHA2-512.

Also SHA3 support was added in both Node.js and the browser.

BREAKING CHANGE:

The api was changed to be callback based, as webcrypto only exposes async methods.

Closes #10

Size Impact

non-minified  minified
before 953K  620K
after 328K  208K

Speed Impact

Node.js 6.6.0 Browser Chrome
before tiny
sha1 69,814 ops/sec 15,976 ops/sec
sha256 64,766 ops/sec 10,957 ops/sec
sha512 56,372 ops/sec 17,180 ops/sec
after tiny
sha1 52,057 ops/sec 5,701 ops/sec
sha256 51,105 ops/sec 5,870 ops/sec
sha512 48,676 ops/sec 5,846 ops/sec
before medium
sha1 27,808 ops/sec 2,908 ops/sec
sha256 17,605 ops/sec 2,962 ops/sec
sha512 22,641 ops/sec 852 ops/sec
after medium
sha1 19,619 ops/sec 3,862 ops/sec
sha256 14,980 ops/sec 3,186 ops/sec
sha512 17,420 ops/sec 3,256 ops/sec
before large
sha1 4,330 ops/sec 308 ops/sec
sha256 2,497 ops/sec 323 ops/sec
sha512 3,407 ops/sec 88 ops/sec
after large
sha1 4,192 ops/sec 1,145 ops/sec
sha256 2,472 ops/sec 899 ops/sec
sha512 3,382 ops/sec 1,070 ops/sec

Due to the overhead, on very small buffers the pure JavaScript performs better. When looking at larger inputs it comes apparent that WebCrypto is a good deal faster.

coveralls commented 7 years ago

Coverage Status

Changes Unknown when pulling dffc5907ad3f815fb450e84ea86ca3ed1a41227b on better-crypto into \ on master**.

coveralls commented 7 years ago

Coverage Status

Changes Unknown when pulling b0124726b707cdb86966d346ece077053277bcf2 on better-crypto into \ on master**.

daviddias commented 7 years ago

Cool!

Any good reason why to break the interface? We can support a callback API and a sync API at the same time.

Can you add benchmarking tests to justify/motivate this change

PS: Can we get that coveralls comments off?

dignifiedquire commented 7 years ago

The dependency size reduction is massive, I will post numbers later.

Speed wise I haven't tested it, but I expect good things.

We can't have a sync interface on the browser because of webcrypto as we have seen before with my work libp2p-crypto and it would be very confusing when exposing one interface in node and another in the browser.

I don't have the rights to turn the off on coveralls for some reason.

dignifiedquire commented 7 years ago

Another important reason for making the api async and moving to webcrypto is that those avoid blocking the main thread, the current sync implementations will block everthing else, while webcrypto always gets executed on a non render thread.

dignifiedquire commented 7 years ago

@diasdavid benchmarks are there, plus full coverage is back

coveralls commented 7 years ago

Coverage Status

Changes Unknown when pulling e6667fb18f53a12a35cbc29d3bdc4489b2b8524e on better-crypto into \ on master**.

daviddias commented 7 years ago

@dignifiedquire this is really, really great :D having a faster js-ipfs and libp2p (specially for testing, which means developer produtivity is awesome).

However, as I mentioned in IRC, I don't feel confident of breaking the multihashing API, instead, let's create a js-multihashing-async and reference it from this module as being a faster, smaller version.

For those of you reading this PR "Why create two modules and not have the two API in one?", because having the sync version would mean that we have to keep all the dependencies for the sync code to work, keeping the module really big. It is true that now with things like rollup, you can eliminate dead branches, but also means everyone has to understand and buy into that.

dignifiedquire commented 7 years ago

However, as I mentioned in IRC, I don't feel confident of breaking the multihashing API, instead, let's create a js-multihashing-async and reference it from this module as being a faster, smaller version.

I still don't understand why it is not enough to just bump the semver major version. If we create a second module we have to maintain it + backport fixes to this module including things like sha3. So this means a lot more work in the long time. In addition if others are still using this module and we use the async module in our code base others would have to bundle both versions into their code even increasing their bundles in size.

daviddias commented 7 years ago

I still don't understand why it is not enough to just bump the semver major version

multihashing and the other multiformats implementations should not be seen as a 'internal' of IPFS as they are used by a lot of other groups, that being said, we can't just break user space like that.

If we create a second module we have to maintain it + backport fixes to this module including things like sha3

multihashing is incredibly small and straight forward compared to other modules we maintain, checking up the contribution graph, we can see that most changes have been of code refactoring or dependencies update, it is fairly stable at this point.

In addition if others are still using this module and we use the async module in our code base others would have to bundle both versions into their code even increasing their bundles in size.

That is why documentation (examples, notes on the README) is king, informing our users that we've moved to multihashing-async because of X, Y and Z is more valuable than just flipping the API and implementation under the carpet.

dignifiedquire commented 7 years ago

Create a new repo https://github.com/multiformats/js-multihashing-async and published to npm: https://www.npmjs.com/package/multihashing-async