wasinger / htmlpagedom

jQuery-inspired DOM manipulation extension for Symfony's Crawler
MIT License

Method previousAll gives nodes in reverse order? #38

Open glensc opened 3 years ago

glensc commented 3 years ago

Given this code:

<?php

use Wa72\HtmlPageDom\HtmlPageCrawler;

require_once __DIR__ . '/vendor/autoload.php';

function repr1(string $html)
{
    $crawler = new HtmlPageCrawler($html);
    $tags = $crawler->filter('script[src="embed.js"]');

    $tags->each(function (HtmlPageCrawler $node) {
        $before = $node->previousAll();
        print_r($before->saveHTML());
    });
}

$html = '
<script type="text/javascript" src="embed.php" ></script>
<script type="text/javascript">_load1(42); </script>

<script type="text/javascript">var c = "SPORT";</script>
<script type="text/javascript" src="embed.js"></script>
<script>_load2({width: 517, height: 323, salt: "hDPwYH3j"}); </script>
';

repr1($html);

This outputs:

<script type="text/javascript">var c = "SPORT";</script>
<script type="text/javascript">_load1(42); </script>
<script type="text/javascript" src="embed.php"></script>

but it should output the nodes in DOM order:

<script type="text/javascript" src="embed.php"></script>
<script type="text/javascript">_load1(42); </script>
<script type="text/javascript">var c = "SPORT";</script>

It seems that previousAll returns the nodes ordered from closest to farthest from the current node, rather than from first to last as they appear in the DOM?

wasinger commented 2 years ago

The previousAll method is not implemented in HtmlPageCrawler; it is inherited from the base Symfony\Component\DomCrawler\Crawler class. So this is probably an upstream bug. I haven't found the time to investigate further yet...
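
The ordering reported above can be reproduced with PHP's built-in DOM extension alone. This is a minimal sketch, not the library's code: it assumes (not confirmed in this thread) that the preceding siblings are collected by repeatedly following previousSibling, which naturally yields them closest first. The helper previousElementSiblings and the sample markup are hypothetical and exist only for illustration. Reversing the collected list restores document order.

```php
<?php

// Hypothetical helper (not part of HtmlPageDom or Symfony's Crawler):
// collects the preceding element siblings of $node by walking
// previousSibling, so the closest sibling comes first.
function previousElementSiblings(DOMNode $node): array
{
    $siblings = [];
    for ($n = $node->previousSibling; $n !== null; $n = $n->previousSibling) {
        // Skip text and comment nodes, keep elements only.
        if ($n->nodeType === XML_ELEMENT_NODE) {
            $siblings[] = $n;
        }
    }
    return $siblings;
}

// Illustrative markup, standing in for the script tags in the report.
$doc = new DOMDocument();
$doc->loadXML('<root><a/><b/><c/><target/><d/></root>');
$target = $doc->getElementsByTagName('target')->item(0);

// Closest-first order, as produced by the sibling walk.
$closestFirst = previousElementSiblings($target);

// Document order, obtained by reversing the collected list.
$documentOrder = array_reverse($closestFirst);

// Joins the node names of a list of DOM nodes for display.
$names = fn(array $nodes): string =>
    implode(',', array_map(fn(DOMNode $n) => $n->nodeName, $nodes));

echo $names($closestFirst), "\n";   // c,b,a
echo $names($documentOrder), "\n";  // a,b,c
```

If the crawler returned by previousAll() can be iterated as a list of DOM nodes, the same array_reverse step could presumably be applied to it as a workaround until the ordering question is settled upstream.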