AngleSharp / AngleSharp.Xml

Library to add XML and DTD parsing capabilities to AngleSharp.
https://anglesharp.github.io
MIT License
Xml prefixed attributes do not appropriately find namespace #20

Open jbrayfaithlife opened 1 year ago

jbrayfaithlife commented 1 year ago

Bug Report

During Xml parsing, attributes with an xml prefix ought to be associated with the xml namespace, even if such a namespace is not explicitly declared. According to https://www.w3.org/TR/REC-xml-names/#ns-decl :

The prefix xml is by definition bound to the namespace name http://www.w3.org/XML/1998/namespace. It MAY, but need not, be declared, and MUST NOT be bound to any other namespace name. Other prefixes MUST NOT be bound to this namespace name, and it MUST NOT be declared as the default namespace.
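For comparison, here is a minimal sketch of how this rule plays out in .NET's built-in LINQ to XML parser, which binds the xml prefix without any declaration. The class and method names (`XmlPrefixDemo`, `FirstAttributeNamespace`) are made up for illustration and are not part of AngleSharp.

```csharp
using System;
using System.Xml.Linq;

// Illustrative helper (hypothetical name), not part of AngleSharp.
static class XmlPrefixDemo
{
    // Parses the markup with LINQ to XML and returns the namespace URI
    // assigned to the first attribute of the root element.
    public static string FirstAttributeNamespace(string xml)
    {
        XElement root = XDocument.Parse(xml).Root;
        return root.FirstAttribute.Name.NamespaceName;
    }

    // True when the first attribute of the root element is bound to the
    // predefined XML namespace (exposed by the BCL as XNamespace.Xml).
    public static bool IsInXmlNamespace(string xml)
    {
        return FirstAttributeNamespace(xml) == XNamespace.Xml.NamespaceName;
    }
}
```

Calling `XmlPrefixDemo.IsInXmlNamespace("<root xml:lang=\"en\">Test</root>")` shows whether an undeclared xml prefix was resolved to the XML namespace, which is the behavior this report asks of AngleSharp.Xml.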

Description

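Attributes with the xml prefix should be associated with the XML namespace (http://www.w3.org/XML/1998/namespace) even when that namespace has not been explicitly declared. Currently the prefix is lost when the document is serialized.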

Steps to Reproduce

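// LINQPad script; requires the AngleSharp and AngleSharp.Xml packages.
using System.IO;
using AngleSharp;            // ToHtml extension method
using AngleSharp.Xhtml;      // XhtmlMarkupFormatter
using AngleSharp.Xml.Parser; // XmlParser

var xmlParser = new XmlParser();
// The xml prefix is used without declaring the xml namespace.
var doc = xmlParser.ParseDocument("<xml xml:lang=\"en\">Test</xml>");
using (var stringWriter = new StringWriter())
{
    doc.ToHtml(stringWriter, new XhtmlMarkupFormatter());
    stringWriter.ToString().Dump(); // Dump() is LINQPad-specific
}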

Expected behavior: Output should be <xml xml:lang="en">Test</xml>

Actual behavior: Output is <xml lang="en">Test</xml>

Compare this with the output of the following LINQPad script, which declares the xml namespace explicitly:

var xmlParser = new XmlParser();
var doc = xmlParser.ParseDocument("<xml xmlns:xml=\"http://www.w3.org/XML/1998/namespace\" xml:lang=\"en\">Test</xml>");
using (var stringWriter = new StringWriter()){
    doc.ToHtml(stringWriter, new XhtmlMarkupFormatter());
    stringWriter.ToString().Dump();
}

Output: <xml xmlns:="http://www.w3.org/XML/1998/namespace" xml:lang="en">Test</xml>

Environment details: Win 10; .NET 6.0.15

Possible Solution

In the XmlDomBuilder, the code that resolves an attribute's namespace from its prefix needs to be replaced with this:

if (prefix.Is(NamespaceNames.XmlPrefix))
{
    ns = NamespaceNames.XmlUri;
}
else if (!prefix.Is(NamespaceNames.XmlNsPrefix))
{
    ns = CurrentNode.LookupNamespaceUri(prefix);
}
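The replacement logic above can be sketched in isolation as follows. This is only an illustration under stated assumptions: the dictionary stands in for `CurrentNode.LookupNamespaceUri`, the constants mirror what `NamespaceNames.XmlPrefix`, `NamespaceNames.XmlNsPrefix`, and `NamespaceNames.XmlUri` are presumed to hold, and returning null for the xmlns prefix is a guess at the builder's default, since the surrounding code is not shown in this report.

```csharp
using System;
using System.Collections.Generic;

// Standalone sketch (hypothetical class), not the actual XmlDomBuilder code.
static class PrefixResolution
{
    // Assumed values of the AngleSharp NamespaceNames constants.
    public const string XmlPrefix = "xml";
    public const string XmlNsPrefix = "xmlns";
    public const string XmlUri = "http://www.w3.org/XML/1998/namespace";

    // inScope is a stand-in for CurrentNode.LookupNamespaceUri(prefix).
    public static string Resolve(string prefix, IReadOnlyDictionary<string, string> inScope)
    {
        // The xml prefix is bound by definition; no declaration is needed.
        if (prefix == XmlPrefix)
        {
            return XmlUri;
        }

        // Any other prefix except xmlns is looked up among the declarations in scope.
        if (prefix != XmlNsPrefix)
        {
            string uri;
            return inScope.TryGetValue(prefix, out uri) ? uri : null;
        }

        // Assumption: xmlns attributes keep the builder's default (null here).
        return null;
    }
}
```

With an empty set of in-scope declarations, `PrefixResolution.Resolve("xml", ...)` still yields the XML namespace URI, while an undeclared prefix such as "foo" yields null.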

I can open a PR to this effect, including the test with which I confirmed both the bug and the fix.

FlorianRappl commented 1 year ago

Sounds fine to me!