Open klonos opened 1 year ago
PR that updates any file that belongs to the Views module: https://github.com/backdrop/backdrop/pull/4208
Questions:
Should we change md5() to hash('sha256', ...) everywhere in core? ...and if so, should we do it here, in a single PR, or in a separate issue? 🤔
If md5() is not to be used, and sha256 is preferred in its place, should we include this in our linting (see #3213, #5296 etc.)?
md5() and hash('sha256') both return a cryptographic hash, so misuses of those functions should be easy for reviewers to catch. When I see md5() used in code, my thoughts are "It's broken." and "Is it really necessary?" (I personally find it harder to catch misuses of other functions.) I would add an automatic check for other cases, not to prevent md5() from being used.
I will give an example to explain my second point in the previous comment, using Views code.
// Allow hook_views_pre_view() to set the dom_id, then ensure it is set.
$this->dom_id = !empty($this->dom_id) ? $this->dom_id : md5($this->name . REQUEST_TIME . rand());
When I see that code, I think:
That code seems to just create an ID that is possibly unique. No other code tries to get the same ID starting from the same $this->name; if that were the case, rand() wouldn't be used.
I would get a unique ID with uniqid('', true), or by combining the output of uniqid('', true) with the output of other functions.
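A minimal sketch of that idea, assuming a standalone helper; the name example_unique_dom_id() is hypothetical and is not part of Views or Backdrop. It only aims at an ID that is very likely unique, not at anything secret.

```php
<?php
/**
 * Returns an ID that is very likely unique, without hashing anything.
 *
 * Hypothetical helper for illustration; not part of Views or Backdrop.
 */
function example_unique_dom_id($prefix = '') {
  // With the second argument set to TRUE, uniqid() appends extra entropy to
  // the time-based part, and the result contains a period.
  $id = uniqid($prefix, TRUE);
  // Drop the period so the value can be used in HTML IDs and CSS selectors.
  return str_replace('.', '', $id);
}

$first = example_unique_dom_id('view-');
$second = example_unique_dom_id('view-');
```

Two consecutive calls return different values, and the prefix is kept at the start of each one.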
In Views, md5() is never used to get a cryptographic hash; rather, it's used to turn a string into a 32-character value that doesn't contain spaces, parentheses, or other special characters. In one case, it's used to obtain a possibly unique value, for which the alternative is uniqid('', TRUE); in the other cases, backdrop_base64_encode() could be used instead.
In the Drupal issue, merlinofchaos says:
An update should also clear caches and other things where we're using these MD5s for later lookups to avoid confusion.
Also changing those will unexpectedly change people's block deltas. While it seems highly unlikely that anyone is using a silly md5 for their CSS on their view blocks...and yet, I've seen people do weirder things. So we could potentially hurt those sites.
The PR replaces md5() with hash(), but it doesn't remove the MD5 hashes already stored in the database (cache or other tables). I looked at the code, but I wasn't able to work out whether those values are actually stored in the database.
If a hash function is used because it always produces an output of the same length, independently of the input length, I would use a non-cryptographic hash.
If there isn't a non-cryptographic hash that could be used, I would use a cryptographic hash.
HAVAL-160,4 and HAVAL-160,5 are cryptographic hashes that are unbroken and produce an output of 40 hexadecimal characters (160 bits). They aren't vulnerable to length-extension attacks, contrary to SHA256. (That's what the Hash function security summary reports.)
If we just need to avoid MD5 and SHA1, HAVAL-128,4 and HAVAL-128,5 are an alternative. They produce an output of 32 hexadecimal characters (128 bits), like MD5.
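The output lengths mentioned above can be checked directly. hash() returns lowercase hexadecimal by default, so each byte of the digest becomes two characters. This is only a quick sketch; the algorithm names are the ones PHP's hash extension uses, and the input string is arbitrary.

```php
<?php
// Length, in hexadecimal characters, of the digest of each algorithm.
$lengths = array();
$algorithms = array('md5', 'haval128,4', 'haval128,5', 'haval160,4', 'haval160,5', 'sha256');
foreach ($algorithms as $algo) {
  $lengths[$algo] = strlen(hash($algo, 'example input'));
}

// Prints 32 for md5 and the HAVAL-128 variants, 40 for the HAVAL-160
// variants, and 64 for sha256.
print_r($lengths);
```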
As for non-cryptographic hashes, PHP 5.6 supports fnv1a64, which outputs 64 bits. For example, changing the following line would be simple.
$this->dom_id = !empty($this->dom_id) ? $this->dom_id : md5($this->name . REQUEST_TIME . rand());
The new code would be the following.
$this->dom_id = !empty($this->dom_id) ? $this->dom_id : hash('fnv1a64', $this->name) . hash('fnv1a64', uniqid('', TRUE));
The double call to hash() is only necessary to get a hash as long as the one returned by md5(). Otherwise, the code could simply be the following.
$this->dom_id = !empty($this->dom_id) ? $this->dom_id : hash('fnv1a64', uniqid($this->name, TRUE));
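To make the length difference between the two variants concrete, here is a small sketch outside of the view class. $name is a stand-in for $this->name, and its value is just an example.

```php
<?php
// Stand-in for $this->name; any view machine name would do.
$name = 'frontpage';

// Two 64-bit FNV-1a hashes, concatenated: 16 + 16 = 32 hexadecimal characters.
$long_id = hash('fnv1a64', $name) . hash('fnv1a64', uniqid('', TRUE));

// A single 64-bit FNV-1a hash: 16 hexadecimal characters.
$short_id = hash('fnv1a64', uniqid($name, TRUE));

// Prints "32 32 16".
printf("%d %d %d\n", strlen(md5($name)), strlen($long_id), strlen($short_id));
```

In the long variant, the first 16 characters are identical for every instance of the same view name; only the second half changes between calls.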
@kiamlaluno I would like to thank you for taking the time to provide feedback and elaborate. I wanted to acknowledge your responses here since it's been some time since you posted them, but I don't have any specific thoughts at the moment. I will need to rethink and carefully examine your points before I have anything useful to add.
In the meantime, feedback from others is welcome as well.
I will add what I found out after further research.
To generate a 128-bit hash, it is better to use SHA3-256 and truncate the hash to 128 bits.
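// hash() returns hexadecimal characters, each encoding 4 bits.
$hash = hash('sha3-256', $input);
// Keep the first 32 characters, which is 128 bits.
$truncated_hash = substr($hash, 0, 32);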
The truncated hash is still collision-resistant, contrary to non-cryptographic hashes (although the probability of collisions with FNV1a128 is low).
That code requires PHP 7.1 or later, though, as hash() does not support SHA3 in PHP 5.6.
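One way to cope with the PHP version difference would be to check which algorithms the running PHP offers before choosing one. The following is only a sketch: example_truncated_hash() is a hypothetical name, and falling back to SHA-256 when SHA3-256 is missing is an assumption of this example, not something decided in this issue. Since hash() returns hexadecimal, 128 bits correspond to 32 characters.

```php
<?php
/**
 * Returns a 128-bit hash of $input as 32 hexadecimal characters.
 *
 * Hypothetical helper; the SHA-256 fallback is an assumption of this sketch.
 */
function example_truncated_hash($input) {
  // hash_algos() lists the algorithms supported by the running PHP.
  $algo = in_array('sha3-256', hash_algos(), TRUE) ? 'sha3-256' : 'sha256';
  // Keep the first 32 hexadecimal characters, which is 128 bits.
  return substr(hash($algo, $input), 0, 32);
}

$truncated_hash = example_truncated_hash('example input');
```

Note that the value returned for a given input changes when the site moves to a PHP version that supports SHA3, so this approach would not suit values that are stored and looked up later.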
This issue corresponds to https://www.drupal.org/project/views/issues/1884828, which will be a backport of what has already been done in the core version of Views in Drupal.
The linked article states:
The article further links to https://www.drupal.org/project/drupal/issues/723802 which states: