napoler / ganon

Automatically exported from code.google.com/p/ganon

Does not recognize <!DOCTYPE html> as open HTML tag #28

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What will reproduce the problem?
Trying to select nodes inside the html element when the document uses HTML5.

If 'file.html' starts with the HTML5 doctype and has no opening <html> tag:
<!DOCTYPE html>
...
</html>

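$html = file_get_dom('file.html');   // load the document with Ganon
$html_node = $html('html', 0);       // select the first <html> element
echo gettype($html_node);            // prints "NULL": no html node was found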

However, if the document has an explicit opening tag:

<html>
...
</html>

it works as intended.

What is the expected output? What do you see instead?

Which version are you using?

Please provide any additional information below.

Original issue reported on code.google.com by bruc...@gmail.com on 5 Dec 2012 at 8:56

GoogleCodeExporter commented 9 years ago
Are you sure the first example is valid HTML?

http://www.w3schools.com/tags/tag_doctype.asp
http://dev.w3.org/html5/spec/single-page.html#the-doctype

"The <!DOCTYPE> declaration is not an HTML tag; it is an instruction to the web 
browser about what version of HTML the page is written in."

Do you want Ganon to try to recover the html node from the closing tag?

Original comment by niels....@gmail.com on 7 Dec 2012 at 5:53

GoogleCodeExporter commented 9 years ago
Yes, my mistake. It's not an HTML tag per se.
However, a document that omits the opening <html> tag can still be valid. On these validators:

http://validator.w3.org/nu/
http://validator.w3.org/check

The following validates:
<!DOCTYPE html>
<head>
<title></title>
</head>
<body>
</body>
</html> 

So perhaps it would be nice for Ganon to parse "<!DOCTYPE html>" as an opening 
HTML tag and make it the root node?

Original comment by bruc...@gmail.com on 8 Dec 2012 at 1:52
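
Until the parser handles an omitted start tag, one possible workaround is to normalize the markup before handing it to Ganon. The sketch below is not part of Ganon: `ensure_html_root` is a hypothetical helper that inserts an explicit `<html>` start tag after the doctype when none is present. It uses simple pattern matching and does not implement the full HTML5 parsing rules.

```php
<?php
// Hypothetical helper (not part of Ganon): add an explicit <html> start tag
// when the markup omits it, so that a parser which does not apply HTML5's
// optional-tag rules still sees a root element.
function ensure_html_root(string $markup): string
{
    // Leave the markup alone if an opening <html> tag is already present.
    if (preg_match('/<html[\s>]/i', $markup)) {
        return $markup;
    }

    // Insert <html> right after the first doctype declaration, if there is one.
    $patched = preg_replace('/(<!DOCTYPE[^>]*>)/i', "$1\n<html>", $markup, 1, $count);
    if ($count === 0) {
        // No doctype either: prepend the start tag.
        $patched = "<html>\n" . $markup;
    }

    // Add the end tag as well if it was omitted.
    if (stripos($patched, '</html>') === false) {
        $patched .= "\n</html>";
    }

    return $patched;
}
```

Assuming Ganon's `str_get_dom` is available, the result could then be parsed with something like `$html = str_get_dom(ensure_html_root(file_get_contents('file.html')));`, after which `$html('html', 0)` should have an html element to match.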