blakeembrey / free-style

Make CSS easier and more maintainable by using JavaScript
MIT License
707 stars 29 forks source link

Hash collision #21

Closed dmitrage closed 8 years ago

dmitrage commented 8 years ago

The example code is not from real application, but anyway

var styleSheet = require('free-style').create()
var a = styleSheet.registerStyle({ color: '#0008d0' })
var b = styleSheet.registerStyle({ color: '#000f82' })
console.log(a, b, styleSheet.getStyles())

//=> "f1pqsan1 f1pqsan1 .f1pqsan1{color:#0008d0}"

Are there any recommended workarounds? Thanks!

blakeembrey commented 8 years ago

Currently, no. I run some tests and it seemed unlikely enough that it would be good overall. This is certainly a valid concern though. I haven't thought about any way to implement this better, the hashes are used as the way to identify content. One of the core designs is that the styles can be loaded independently (E.g. server-side or client-side with whole application or page-only), so using any sort of incremental namespacing is inconsistent.

The only more complete solution would involve namespacing everywhere along the stack and using named styles. E.g. generating things more like .a-b-c-styleName. Otherwise, a better hash algorithm could also be used (though it doesn't solve the overall problem).

basarat commented 8 years ago

A potential workaround is to add a noop style (depending on your use case e.g. opacity or display or pointer or something else). e.g.:

var styleSheet = require('free-style').create()
var a = styleSheet.registerStyle({ color: '#0008d0' })
var b = styleSheet.registerStyle({ color: '#000f82', display:'block' })
console.log(a, b, styleSheet.getStyles())
ajoslin commented 8 years ago

A better hashing algorithm like murmurhash would solve nearly all collision cases.

This specific case is is gone if you use murmur: http://requirebin.com/?gist=e1b61705dbb4a6318dcea35c19955233

I would suggest just doing a straight JSON.stringify + murmurhash. Other tools like aphrodite do this.

I would open a PR right now, but the types are tripping me up.

blakeembrey commented 8 years ago

@ajoslin I'm pretty sure JSON.stringify would only be a negative impact because there's no reason you should be doing it - there's already a string representation you can hash called CSS. Aside from that, you can look in the benchmarks folder (https://github.com/blakeembrey/free-style/tree/master/benchmarks/hashes) - I actually have a test with murmurhash (and others) and found the collisions were higher. If you want to improve the benchmark to prove it's a substantial difference, that'd be great, but I'd need proof it's a better hash than "others are doing this" because I can't see why they would be using it (JSON.stringify and murmurhash are slower, and they're doing double stringification).

ajoslin commented 8 years ago

Yeah nevermind, I'm way off base here.

blakeembrey commented 8 years ago

Here's an example benchmark. The numbers fluctuate as the test is generating CSS dynamically, but murmurhash and the simple string hash seem to be fairly consistent - though the string hash implementation is much smaller (4 lines) and consistently faster.

Ready to run 2399152 tests!
Generated 2398370 hashes using fnv32a with 782 conflicts in 5s 99.46372ms...
Generated 2399152 hashes using fnv32h with 0 conflicts in 12s 360.664747ms...
Generated 2398552 hashes using murmurhash2 with 600 conflicts in 8s 598.311098ms...
Generated 2398465 hashes using murmurhash3 with 687 conflicts in 4s 400.952263ms...
Generated 2398479 hashes using stringHash with 673 conflicts in 3s 885.230015ms...
dmitrage commented 8 years ago

Reducing collision chance and working around existing is important part, but it is part 2. There is also a part 1 - detecting a collision, and I think it is even more important.

As hash collision is a rare exceptional situation, it would be nice if free-style could detect collisions without heavy impact on performance and throw error in this case. I didn't find solution well enough for PR yet.

As a possible way, it can be done in external library/script, used in dev/test environment. I created a gist with implementation draft to show the idea (it wraps free-style adding some checks)

https://gist.github.com/dmitrage/3e56a499b1cba65e383497df54b25fa7

blakeembrey commented 8 years ago

@dmitrage I like detecting the hash collision and logging it out, definitely a good idea for development. I'll make that change. Aside from a future hash change, I can enable it to be a passed in option (in case anyone wants to play with a different hash, for whatever reason) and we can look into 64 bit hashes too (since the key would only grow ~7 characters with each 32 bits).

blakeembrey commented 8 years ago

@dmitrage Want to check out https://github.com/blakeembrey/free-style/pull/25? I implemented hash collision detection and enabled the hash algorithm to be configurable.