Closed jnvsor closed 8 years ago
I'm sorry, I don't understand the issue, if the marker is there, this might just be something PHP does weirdly, no surprise there :) Kint just represents what data PHP holds internally...
No, it's the marker kint put there to detect recursion - the marker at kintParser::$_marker
Oh god you found the polterbug I've been looking to reproduce for for ages! Thank you, I fixed it, but it will come with a batch of changes later this week!
The really, terribly, scary hard task I have for the past two years is to bring documentation up to date :) I really want to include a notable contributors section to give excellent developers like you due credit and immense thanks for all of your hard work!
A more scary note on recursion handling is that kint is doing something that's theoretically impossible to do correctly. For example:
<?php
$a = ["Array 1"];
$b = ["Array 2"];
$a[] = &$b;
$b[] = &$a;
$c = ["Array 1", &$b];
d($a);
d($b);
d($c);
Clearly $c
is not referenced by $b
, but since kint gets it's arguments through func_get_args
(By copy not reference) there is no way to correctly determine recursion in this case.
I think kint should (Like print_r
, var_dump
and other debug builtins which all take parameters by reference) stop trying to be clever.
To demonstrate why the "Clever" behaviour is probably a bad thing, it didn't take long to find a version that's full on completely wrong output:
<?php
$a = ["Array 1"];
$b = ["Array 2"];
$a[] = &$b;
$b[] = &$a;
$c = [&$b];
d($a);
d($b);
d($c);
There is definitely an edge-case bug in detecting circular references, but it has nothing to do with the way Kint gets its arguments. It's a side effect of a conscious effort to short-circuit
array (2) [
string (7) "Array 1"
array (2) [
string (7) "Array 2"
array (2) [
string (7) "Array 1"
*RECURSION*
]
]
]
into
array (2) [
string (7) "Array 1"
array (2) [
string (7) "Array 2"
array (2) *RECURSION*
]
]
Former is how var_dump behaves, by the way.
This situation can only be fixed one of two ways: render output the ugly var_dump way or rewrite the parsing engine to keep a record of all objects. Kint, as it works now, detects a reference to a higher level and assumes it's a reference to the currently inspected node's parent, and in the case you illustrated, the child node is a reference to its grandparent.
I just spent four hours in recursion detective mode and had to scrap my progress because the universe no longer makes sense.
As this is an edge-case I am not giving it high priority, but am leaving it open.
It does have to do with the way it gets it's arguments.
Because kint gets it's arguments by copy not reference the toplevel element is not the one being referenced but a different identical one...
If kint could take it's params by reference it would be easy, but all the helper functions and Kint::dump
would have to be modified to have a large but limited amount of &$arg4, &$arg5, &$arg6
etc so that's unworkable.
The var_dump way as you've pointed out is ugly - but it's correct.
Alternatively you could do a deep strict equals on the toplevel array - where if the first element passed is an array and you ever hit an array identical to it you presume it's recursion - however this too is inaccurate if there really is an identical array being passed like so:
<?php
$a = ["Array 1"];
$b = ["Array 2"];
$a[] = &$b;
$b[] = &$a;
$c = ["Array 1", &$b];
d($a);
d($b);
d($c);
It does have to do with the way it gets it's arguments.
Dude, I just spent four hours (and the past 6 years) working on this, are you seriously doubting I know my shit?
I just spent the last weekend plus a few hours rewriting the parser from scratch - not only do I know I'm right, I can prove it with a simple test case:
<?php
$a = ["Array 1"];
$b = ["Array 2", &$a];
$a[] = &$b;
echo dump_copy($a); // Parameter copied
echo dump($a); // Parameter referenced
print_r($a); // Parameter copied (Referenced in php7)
var_dump($a); // Parameter copied
function dump(&$val){
static $marker = null;
if ($marker === null)
$marker = uniqid();
if (empty($val))
return '';
if (isset($val[$marker]))
return "RECURSION";
$val[$marker] = true;
$output = "(\n";
foreach ($val as $key => &$item) {
if ($key === $marker) {
continue;
}
if (is_array($item)) {
$output .= dump($item)."\n";
}
else {
$output .= (string) $item."\n";
}
}
unset($val[$marker]);
return $output.")\n";
}
function dump_copy($val){
return dump($val);
}
I don't really see what are you trying to prove here, that references behave differently from values?
I already explained the problem in my previous post, and I'll open a separate issue for it, because the one in OP has been fixed.
@jnvsor have you considered recursion tracking by reference instead of using markers? In this way, you keep references to all of the arrays in the "recursion register", including all of the nested elements that are also arrays. Encountering the same reference for a second time during parsing will mean that you've hit a recursion.
@kktsvetkov This was a v1 bug, and was fixed in the v2 full rewrite.
Encountering the same reference for a second time during parsing will mean that you've hit a recursion.
As far as I'm aware there's no way to tell if two references are the same in PHP, since ===
will just compare array contents. (We can't access raw pointer values in PHP either)
I doubt recursively comparing array contents (Which is itself a recursive operation) will be anywhere near as performant as it is now either.
Also, if there's a bug in the current implementation (either in core parser or in a plugin), people will notice the strange kint\0...
keys in their arrays, but if we did it with references it would manifest as a nigh-untraceable memory leak.
If you know of a way to implement this that works and is performant enough to risk a memory leak, please submit a PR, we can always use more performance.
If you know of a way to implement this that works and is performant enough to risk a memory leak
@jnvsor Sure. How would you advise me to test this risk of memory leak? I've done basic stuff like the example from the first comment up top and does output the same structure as var_dump()
, but that's a rather basic.
I've also run ParserTest.php and the only error I got was about it missing Parser::$marker
which I removed. It does pass the testParseRecursion
case.
~/kt.kt/kint-php/🐱 php vendor/bin/phpunit tests/Parser/ParserTest.php
PHPUnit 7.5.20 by Sebastian Bergmann and contributors.
.E.......................................................... 60 / 60 (100%)
Time: 3.49 seconds, Memory: 12.00 MB
There was 1 error:
1) Kint\Test\Parser\ParserTest::testConstruct
ReflectionException: Property Kint\Parser\Parser::$marker does not exist
/Users/polina/kt.kt/kint-php/tests/Parser/ParserTest.php:59
ERRORS!
Tests: 60, Assertions: 181, Errors: 1.
As far as I'm aware there's no way to tell if two references are the same in PHP, since
===
will just compare array contents.
You are correct, ===
compares contents only and will produce false positives when arrays with the same contents are met again. Perhaps the arrays can be tainted with the markers only when such a match is found in order to detect if the two arrays are copies of one another, OR one is a reference of the other.
Also, I think that there can be fewer arrays tracked for recursion since if all the array elements are scalar (or resources), there is no way for recursion to occur: you actually need "nestable" data types such as arrays and objects to have a recursion.
I'll amend https://github.com/kint-php/kint/pull/343 later today to introduce these changes (and probably change the PR title since there will be markers and hashes, but just at a different place)
@jnvsor I was wondering symfony/var-dumper does this, and it seems they do not do it at all - instead, they are relying on the max depth to prevent going infinitely deeper:
if (++$cursor->hashIndex === $this->maxItemsPerDepth || $cursor->stop) { ...
https://github.com/symfony/var-dumper/blob/master/Cloner/Data.php
If I read this right, the $cursor->hashIndex
track how deep is your dump, and after reaching some limit, it just stops.
This is an interesting alternative, and perhaps something that can be considered: if the max_depth
is set, we can skip recursion detection altogether. The potential downside is that in some cases like dumping $GLOBALS the parser will dive into globals own reference for max_depth
levels deep :)
instead, they are relying on the max depth to prevent going infinitely deeper
Symfony's var-dumper doesn't have a max depth. $this->maxItemsPerDepth
serves a similar purpose to kint's new ArrayLimitPlugin
(Which I totally ripped off of var-dumper :P)
var-dumper dumps breadth first and drops any duplicates with an anchor link to the first (ie. highest found) copy of that variable.
I find this somewhat confusing: Duplicates in different trees will be dumped twice but duplicates that are children of any ancestor will be anchored. Also if the dump is at the end of the page and the anchor doesn't scroll properly and you are left guessing what this is a reference to.
(This is an educated guess mind you, var-dumper's source is even more impenetrable than kint and I'm no expert on it)
We dump depth first and don't try to elide duplicate values unless it's recursion.
The potential downside is that in some cases like dumping $GLOBALS the parser will dive into globals own reference for
max_depth
levels deep
It's worse than that for kint's depth-first approach: It will dive into all the global variables on each depth along with all their children. You're looking at a lot of extra work there.
Don't know why exactly but:
This shows the marker in $a in the second call.
I'm doing an experimental overhaul of the parser (It's looking good) so I'll probably come across more of these things.