TysonAndre / pecl-teds

Tentative Extra Data Structures for php
BSD 3-Clause "New" or "Revised" License
30 stars 4 forks source link

Change strict_hash implementation to handle array cyclic reference edge case #193

Closed TysonAndre closed 2 years ago

TysonAndre commented 2 years ago

If the top level value is an array, and the array contains reference cycles at any nesting depth, then give up on the current hash computation and only hash the parts of the top level that aren't arrays (keys, values)

This still meets the condition that if X === Y, then strict_hash(X) === strict_hash(Y) - collisions are already inevitable due to the https://en.wikipedia.org/wiki/Pigeonhole_principle

e.g. for

$x = [];
$x[1] = &$x;
$y = [1 => $x];

var_dump($x === $y); // true, so we expect the strict_hash to be the same. Only hash the key 1 but don't hash the top level array

// Expected: same hashes are computed
// Observed: Different hashes in Teds 1.2.6
var_dump(Teds\strict_hash($x), Teds\strict_hash($y));
int(-2238224086993845681)
int(8691057668021067674)
TysonAndre commented 2 years ago

@iluuu1994 after I'd mentioned strict_hash, I'd realized there was one last edge case I'd forgotten to unit test for Teds\strict_hash and unique_values/StrictHashMap, but after I implement this change, the condition (X === Y) === true (PHP's === operator) will imply that Teds\strict_hash(X) === Teds\strict_hash(Y) for all php values, including float/array edge cases