amoffat / hash-n-slash

Proof of concept for converting any content into a domain name.

SHA1? #3

Closed lgarron closed 10 years ago

lgarron commented 10 years ago

When selecting a hash function for a new application, SHA1 is now considered a weak candidate. A well-established alternative would be to use a SHA2 function like SHA256 instead.
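For illustration, here is a minimal sketch of what swapping the hash function could look like. It is not the project's actual code: the function name `content_to_label` and the choice of base32 encoding are assumptions made for this example. One practical wrinkle is that a SHA256 hex digest is 64 characters, one more than the 63-character limit on a single DNS label, so the sketch encodes the digest in base32 instead.

```python
import base64
import hashlib

MAX_DNS_LABEL = 63  # maximum length of a single DNS label


def content_to_label(content: bytes, algorithm: str = "sha256") -> str:
    """Hash content and return a DNS-safe label (hypothetical helper)."""
    digest = hashlib.new(algorithm, content).digest()
    # Base32 uses only A-Z and 2-7; lowercase it and drop the '=' padding.
    label = base64.b32encode(digest).decode("ascii").rstrip("=").lower()
    if len(label) > MAX_DNS_LABEL:
        raise ValueError("digest too long for a single DNS label")
    return label


if __name__ == "__main__":
    text = b"any content at all"
    print(content_to_label(text, "sha1"))    # 32-character label
    print(content_to_label(text, "sha256"))  # 52-character label
```

Under these assumptions, switching from SHA1 to SHA256 is a one-argument change, at the cost of a longer label.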

michaelsproul commented 10 years ago

Does it matter?

I would err towards using a completely reversible function for this purpose. What if you only have the hashed version of a domain and want to know the original text?

You could instead encrypt the URL text with a trivial (and published) key to achieve reversibility. In lieu of that, I think a "weak" hash function is fine.

amoffat commented 10 years ago

@lgarron Thanks for the info. I was aware SHA1 was relatively weak, but I was OK using it because no collisions have yet been found. If this idea were ever popular in the real world, using a better hashing algorithm would definitely be a requirement, especially if document hashes were being used as URLs. Since this is just a proof of concept, I'll leave it with the weaker SHA1.

lgarron commented 10 years ago

Arguably, it's these early decisions when things don't appear to matter that result in issues. Rasmus Lerdorf's recent explanation of PHP function naming comes to mind.

SHA1 isn't completely broken, but SHA256 is a significantly better choice. Since your idea is decentralized (and would get rolling on its own), this is about the only chance to change the de facto function.

Up to you, but it's always sad to see people dismissive of simple security gains just because they don't appear to matter at the time.

@gnusouth: In that case, why not just use a UTF8 -> domain name encoding scheme?

amoffat commented 10 years ago

I don't disagree, @lgarron. SHA256 is a better choice. If this were a global movement that I was pushing for, I would certainly change it. But as it stands, it is simply a proof of concept to inspire others with interesting possibilities. I have no intent to support it further, so I see no point in changing it.

michaelsproul commented 10 years ago

@lgarron: I agree that a character encoding scheme would be ideal for strings. You could do string -> UTF-8 numerical representation -> base 36 (à la TinyURL).
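A rough sketch of that pipeline, assuming the UTF-8 bytes are read as one big-endian integer and written out in base 36. The names `encode_label` and `decode_label` are made up for this example, and the leading `0x01` sentinel byte is an added assumption so that leading zero bytes and empty strings survive the round trip.

```python
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyz"


def encode_label(text: str) -> str:
    """Reversibly encode text as a base-36 string (hypothetical helper)."""
    # Sentinel byte (an assumption of this sketch) keeps the integer
    # non-zero and preserves any leading zero bytes in the input.
    data = b"\x01" + text.encode("utf-8")
    number = int.from_bytes(data, "big")
    digits = []
    while number:
        number, remainder = divmod(number, 36)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


def decode_label(label: str) -> str:
    """Invert encode_label and recover the original text."""
    number = int(label, 36)
    data = number.to_bytes((number.bit_length() + 7) // 8, "big")
    return data[1:].decode("utf-8")  # drop the sentinel byte


if __name__ == "__main__":
    label = encode_label("hello world")
    print(label)
    print(decode_label(label))
```

Unlike a hash, this is fully reversible, but the output grows with the input, so only short strings fit in a single 63-character DNS label.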

Documents present a different problem. Due to their size, a direct encoding scheme is immediately infeasible. You also don't want people to be able to easily construct collisions; in this respect, hashing is a good candidate. My previous encryption suggestion is dumb, I agree, but perhaps encryption could be employed in some other way (identifiers and keys)?

lgarron commented 10 years ago

If you want to make data available based on content, why not just use a torrent network?