mojolicious / mojo

:sparkles: Mojolicious - Perl real-time web framework
https://mojolicious.org
Artistic License 2.0
2.66k stars 576 forks source link

Endless loop in Mojo::DOM caused by root going out of scope #1924

Open jjatria opened 2 years ago

jjatria commented 2 years ago

Steps to reproduce the behavior

The following works as expected:

use Mojo::DOM;
my $dom = Mojo::DOM->new('<div><a></a><b></b></div>')->at('div')
    ->children('a')->map( sub { say $_->tag } );
# OUTPUT:
# a

However, when storing the DOM in an intermediate variable, the code instead goes into an endless loop when doing a string comparison against an undefined value:

use Mojo::DOM;
my $dom = Mojo::DOM->new('<div><a></a><b></b></div>')->at('div');
$dom->children('a')->map( sub { say $_->tag } );
# OUTPUT:
# Use of uninitialized value in string ne at .../Mojo/DOM58/CSS.pm line 259.
# ... endless loop ...

Note that I have only seen this affect calls to ->children with a selector. If no selector is given, then this error is not triggered:

use Mojo::DOM;
my $dom = Mojo::DOM->new('<div><a></a><b></b></div>')->at('div');
$dom->children->map( sub { say $_->tag } );
# OUTPUT:
# a
# b

Possible diagnosis

The warning points to the _root sub in Mojo::DOM::CSS

sub _root {
  my $tree = shift;
  $tree = $tree->[3] while $tree->[0] ne 'root'; # <-- BOOM
  return $tree;
}

The ne the warning complains about is the one trying to find the root of the tree, which has gone out of scope. This suggested the following workaround, which does allow the code to behave as expected:

use Mojo::DOM;
my $dom = Mojo::DOM->new('<div><a></a><b></b></div>');
my $root = $dom->root; # Keep it in scope, must be above next line
$dom = $dom->at('div');
$dom->children('a')->map( sub { say $_->tag } );
# OUTPUT:
# a

Expected behavior

I expected that storing a reference to the DOM after calling ->at would give me an instance that I could use in the same way as the full DOM, except at a different point in the tree.

Actual behavior

Some calls on the intermediate object triggered an endless loop.

skington commented 1 month ago

Came here to +1 this. I factored out "get an element I'm interested into" into a separate function, which returned a subset of the DOM, and calling ->following on the resulting object has either unexpectedly produced nothing or triggered an endless loop spewing out zillions of warning about undefined variables.