htacg / tidy-html5

The granddaddy of HTML tools, with support for modern standards
http://www.html-tidy.org

Redefined xmlns in output XHTML #1128

Open mikeshaw opened 2 weeks ago

mikeshaw commented 2 weeks ago

I'm not sure whether this is a bug, and I was unable to find any prior reports.

Source HTML - note the duplicated xmlns:xlink in the use elements:

<!DOCTYPE html>
<html>
    <head>
        <title>title</title>
    </head>
    <body>
        <svg>
            <use xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="#gel-icon-search" href="#gel-icon-search" role="presentation"/>
        </svg>
        <svg>
            <use xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="#gel-icon-no" href="#gel-icon-no" role="presentation"/>
        </svg>
    </body>
</html>

When passed into tidy --output-xhtml 1, the output is:

No warnings or errors were found.

<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta name="generator" content=
"HTML Tidy for HTML5 for Apple macOS version 5.8.0" />
<title>title</title>
</head>
<body>
<svg>
<use xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xlink=
"http://www.w3.org/1999/xlink" xlink:href="#gel-icon-search" href=
"#gel-icon-search" role="presentation"></use>
</svg> <svg>
<use xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xlink=
"http://www.w3.org/1999/xlink" xlink:href="#gel-icon-no" href=
"#gel-icon-no" role="presentation"></use>
</svg>
</body>
</html>

The duplicated namespaces go unremarked and are reproduced. Are duplicated xmlns legal in XHTML? Piping that same output through xmllint (tidy --output-xhtml 1 | xmllint -) creates fatal errors:

"http://www.w3.org/1999/xlink" xlink:href="#gel-icon-search" href=
                              ^
-:15: parser error : Attribute xmlns:xlink redefined
"http://www.w3.org/1999/xlink" xlink:href="#gel-icon-no" href=
                              ^
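The xmllint failure is not specific to xmllint. As a minimal sketch, assuming only Python's standard library (the helper name `well_formedness_error` is made up for illustration), the same `<use>` element can be fed to another XML parser to see whether it is accepted:

```python
import xml.etree.ElementTree as ET

# The <use> element from the report, wrapped in a minimal <svg> parent.
DUPLICATED = (
    '<svg><use xmlns:xlink="http://www.w3.org/1999/xlink" '
    'xmlns:xlink="http://www.w3.org/1999/xlink" '
    'xlink:href="#gel-icon-search" href="#gel-icon-search" '
    'role="presentation"></use></svg>'
)

# The same element with the namespace declared only once.
SINGLE = (
    '<svg><use xmlns:xlink="http://www.w3.org/1999/xlink" '
    'xlink:href="#gel-icon-search" href="#gel-icon-search" '
    'role="presentation"></use></svg>'
)


def well_formedness_error(markup):
    """Return the parser's error message, or None if the markup parses."""
    try:
        ET.fromstring(markup)
    except ET.ParseError as exc:
        return str(exc)
    return None


print("duplicated:", well_formedness_error(DUPLICATED))
print("single:", well_formedness_error(SINGLE))
```

If the duplicated form is rejected here as well, that suggests the XHTML output above is not well-formed XML, independent of which parser consumes it.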

I would expect Tidy to drop excess duplicate xmlns:* attributes from the output, as it does when the HTML example above is passed into tidy --input-xml 1 which outputs:

line 8 column 13 - Warning: <use> dropping value "http://www.w3.org/1999/xlink" for repeated attribute "xmlns:xlink"
line 11 column 13 - Warning: <use> dropping value "http://www.w3.org/1999/xlink" for repeated attribute "xmlns:xlink"
Tidy found 2 warnings and 0 errors!

<!DOCTYPE html>
<html>
<head>
<title>title</title>
</head>
<body>
<svg>
<use xmlns:xlink="http://www.w3.org/1999/xlink"
xlink:href="#gel-icon-search" href="#gel-icon-search"
role="presentation" />
</svg>
<svg>
<use xmlns:xlink="http://www.w3.org/1999/xlink"
xlink:href="#gel-icon-no" href="#gel-icon-no"
role="presentation" />
</svg>
</body>
</html>

The duplicated xmlns are noted and dealt with.
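For anyone who wants to catch such repeats before the markup reaches an XML parser, here is a rough sketch of a pre-check. It uses Python's `html.parser` rather than Tidy itself, and the class and function names are hypothetical; it only reports repeated attributes and does not rewrite the document:

```python
from collections import Counter
from html.parser import HTMLParser


class RepeatedAttributeFinder(HTMLParser):
    """Collect (tag, attribute) pairs where an attribute appears more than once."""

    def __init__(self):
        super().__init__()
        self.repeated = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) tuples, so repeats are preserved.
        counts = Counter(name for name, _ in attrs)
        for name, count in counts.items():
            if count > 1:
                self.repeated.append((tag, name))


def find_repeated_attributes(html):
    """Return every (tag, attribute) pair with a repeated attribute."""
    finder = RepeatedAttributeFinder()
    finder.feed(html)
    finder.close()
    return finder.repeated


sample = (
    '<svg><use xmlns:xlink="http://www.w3.org/1999/xlink" '
    'xmlns:xlink="http://www.w3.org/1999/xlink" '
    'xlink:href="#gel-icon-search" href="#gel-icon-search" '
    'role="presentation"/></svg>'
)
print(find_repeated_attributes(sample))
```

Self-closing tags such as `<use ... />` are routed through `handle_starttag` by the parser's default `handle_startendtag`, so they are covered without extra code.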

Tidy version: "HTML Tidy for Apple macOS version 5.8.0"

funkyfuture commented 3 days ago

Are duplicated xmlns legal in XHTML?

they are legal in XML. but they're not tidy.

my definition of tidy is that all namespaces are declared once in the root node.
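To illustrate that layout, here is a small sketch (an illustration only, not something Tidy produces today) in which the xlink prefix is declared once on the root element and is still resolved on a nested `<use>`. The document string is a trimmed-down version of the example in the report, and the SVG namespace is left out to mirror it:

```python
import xml.etree.ElementTree as ET

XLINK = "http://www.w3.org/1999/xlink"

# xmlns:xlink is declared once, on the root, and inherited by descendants.
DOC = (
    '<html xmlns="http://www.w3.org/1999/xhtml" '
    'xmlns:xlink="http://www.w3.org/1999/xlink">'
    '<body><svg>'
    '<use xlink:href="#gel-icon-search" href="#gel-icon-search"/>'
    '</svg></body></html>'
)

root = ET.fromstring(DOC)

# ElementTree exposes namespaced attributes as "{namespace-uri}local-name".
xlink_href = "{%s}href" % XLINK
hrefs = [el.get(xlink_href) for el in root.iter() if el.get(xlink_href)]
print(hrefs)
```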