Masterminds / html5-php

An HTML5 parser and serializer for PHP.
http://masterminds.github.io/html5-php/

loadHTML drops first TextNode of HTML fragment string #208

Open biziclop opened 3 years ago

biziclop commented 3 years ago

Hi, thanks for the great project. I'm experimenting with parsing HTML strings. I probably should use loadHTMLFragment() instead of loadHTML(), but I think this is still a bug: if I call loadHTML() on a string that starts with plain text rather than a tag, the first text segment is silently dropped. In the example below, the leading text Aaa is missing from the regenerated HTML string:

require 'vendor/autoload.php';
$html5 = new Masterminds\HTML5();
$html = 'Aaa<br>Bbb<b>Ccc</b>Ddd';
$dom = $html5->loadHTML( $html );

echo $html5->saveHTML( $dom );
/* Result:
<!DOCTYPE html>
<html><br>Bbb<b>Ccc</b>Ddd</html>
*/

echo $dom->saveXML( $dom );
/* Result:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml"><br/>Bbb<b>Ccc</b>Ddd</html>
*/
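
For comparison, here is a minimal sketch of the loadHTMLFragment() route mentioned above, assuming the same Composer install and input string. loadHTMLFragment() returns a DOMDocumentFragment rather than a full document, so no html wrapper is created, and the leading text node should be kept:

```php
<?php
require 'vendor/autoload.php';

$html5 = new Masterminds\HTML5();
$html  = 'Aaa<br>Bbb<b>Ccc</b>Ddd';

// Parse the string as a fragment instead of a complete document.
$fragment = $html5->loadHTMLFragment($html);

// Serialize the fragment back to an HTML string and print it.
$out = $html5->saveHTML($fragment);
echo $out;
```

This only sidesteps the problem for fragment input; it does not change how loadHTML() treats text that appears before the first tag.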

alecpl commented 3 years ago

Duplicate of #166.