Open JoakimSoderberg opened 6 years ago
Prefix was not the correct term
Seems like the reason is this code: https://github.com/subchen/go-xmldom/blob/e1029cd9087cb9c9c8941b155414b8b8e7ce293a/dom.go#L50-L55
More specifically this line: https://github.com/subchen/go-xmldom/blob/e1029cd9087cb9c9c8941b155414b8b8e7ce293a/dom.go#L52
You are only using the Local part and just dropping the Space part.
https://golang.org/pkg/encoding/xml/#Name
A Name represents an XML name (Local) annotated with a name space identifier (Space). In tokens returned by Decoder.Token, the Space identifier is given as a canonical URL, not the short prefix used in the document being parsed.
type Name struct { Space, Local string }
OK, so simply adding Space does not help, since that is apparently always a full URL, which breaks things just as badly:
<http://www.w3.org/2000/svg:svg xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:cc="http://creativecommons.org/ns#" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:svg="http://www.w3.org/2000/svg" xmlns="http://www.w3.org/2000/svg" xmlns:sodipodi="http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd" xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape" id="Logo" width="2139" height="2139" viewBox="0 0 2139 2139" version="1.1" http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd:docname="siknas07-alt03.svg" http://www.inkscape.org/namespaces/inkscape:version="0.92.1 r15371" http://www.inkscape.org/namespaces/inkscape:label="Cirkel">
This page contains the following errors:
error on line 2 at column 7: Failed to parse QName 'http:'
Below is a rendering of the page up to the first error.
Not sure if this is just something inherent in the golang API when parsing XML, or if it's controllable.
What happens is that this is read:
xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape"
As well as these attributes from the original:
inkscape:version="0.92.1 r15371"
inkscape:label="Cirkel"
When parsing this, the fully expanded Space is saved. So instead of Space being inkscape for version, it is the full URL http://www.inkscape.org/namespaces/inkscape, which is what Chrome fails on.
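To make this concrete, here is a minimal sketch using only encoding/xml. The sample string is a cut-down stand-in for the SVG root element above, and firstStart is a helper name I made up for illustration. It prints the Space and Local that Decoder.Token reports for the element and each attribute; the prefixed inkscape:version attribute comes back with the full namespace URL as its Space, while the unprefixed version attribute has an empty Space.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// sample is a cut-down stand-in for the SVG root element in this issue.
const sample = `<svg xmlns="http://www.w3.org/2000/svg" ` +
	`xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape" ` +
	`version="1.1" inkscape:version="0.92.1 r15371"/>`

// firstStart (hypothetical helper) returns the first start element
// produced by Decoder.Token for the given document.
func firstStart(src string) (xml.StartElement, error) {
	dec := xml.NewDecoder(strings.NewReader(src))
	for {
		tok, err := dec.Token()
		if err != nil {
			return xml.StartElement{}, err
		}
		if se, ok := tok.(xml.StartElement); ok {
			// Copy so the result does not share data with the decoder.
			return se.Copy(), nil
		}
	}
}

func main() {
	se, err := firstStart(sample)
	if err != nil {
		panic(err)
	}
	fmt.Printf("element: Space=%q Local=%q\n", se.Name.Space, se.Name.Local)
	for _, a := range se.Attr {
		fmt.Printf("attr:    Space=%q Local=%q Value=%q\n",
			a.Name.Space, a.Name.Local, a.Value)
	}
}
```

Concatenating Space and Local with a colon therefore produces names like http://www.inkscape.org/namespaces/inkscape:version, which is the kind of name the parser error above complains about.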
Seems like this is just how it works:
https://golang.org/pkg/encoding/xml/#Decoder.Token
Token implements XML name spaces as described by http://www.w3.org/TR/REC-xml-names/. Each of the Name structures contained in the Token has the Space set to the URL identifying its name space when known. If Token encounters an unrecognized name space prefix, it uses the prefix as the Space rather than report an error.
I see it was also mentioned in https://golang.org/pkg/encoding/xml/#Name
A Name represents an XML name (Local) annotated with a name space identifier (Space). In tokens returned by Decoder.Token, the Space identifier is given as a canonical URL, not the short prefix used in the document being parsed.
But then I don't get why Chrome thinks it's an invalid XML file. The wonderful world of overly complex standards 🤕
OK, so the marshalling implementation does this properly. It takes the expanded Space and reverses the process when writing the XML:
https://github.com/golang/go/blob/165ebe65585ec7ae63695fab9e7aabaaad1af57c/src/encoding/xml/marshal.go#L700-L715
That is, instead of simply outputting:
http://www.inkscape.org/namespaces/inkscape:version="0.92.1 r15371"
it first creates:
xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape"
And then uses it properly:
inkscape:version="0.92.1 r15371"
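A small sketch of that behaviour, feeding xml.Encoder a start element whose attribute carries the expanded URL in Space. The helper names (sampleStart, encodeStart, hasPrefixedAttr) are made up for illustration. As far as I can tell from the linked marshal.go, the encoder derives the prefix from the last path segment of the URL, so treat the exact prefix it picks as an assumption about the current implementation rather than a documented guarantee.

```go
package main

import (
	"bytes"
	"encoding/xml"
	"fmt"
	"strings"
)

const inkscapeNS = "http://www.inkscape.org/namespaces/inkscape"

// sampleStart (hypothetical helper) builds an element whose attribute
// name holds the full namespace URL, as Decoder.Token would report it.
func sampleStart() xml.StartElement {
	return xml.StartElement{
		Name: xml.Name{Local: "svg"},
		Attr: []xml.Attr{{
			Name:  xml.Name{Space: inkscapeNS, Local: "version"},
			Value: "0.92.1 r15371",
		}},
	}
}

// encodeStart writes the element (start and end token) through
// xml.Encoder and returns the resulting markup.
func encodeStart(start xml.StartElement) (string, error) {
	var buf bytes.Buffer
	enc := xml.NewEncoder(&buf)
	if err := enc.EncodeToken(start); err != nil {
		return "", err
	}
	if err := enc.EncodeToken(start.End()); err != nil {
		return "", err
	}
	if err := enc.Flush(); err != nil {
		return "", err
	}
	return buf.String(), nil
}

// hasPrefixedAttr reports whether the markup declares the inkscape
// prefix and uses it on the version attribute.
func hasPrefixedAttr(out string) bool {
	return strings.Contains(out, `xmlns:inkscape="`+inkscapeNS+`"`) &&
		strings.Contains(out, `inkscape:version="0.92.1 r15371"`)
}

func main() {
	out, err := encodeStart(sampleStart())
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
	fmt.Println("declared and used prefix:", hasPrefixedAttr(out))
}
```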
So instead of using your own Attribute struct here:
https://github.com/subchen/go-xmldom/blob/e1029cd9087cb9c9c8941b155414b8b8e7ce293a/node.go#L14-L17
you should use the encoding/xml Name type, which keeps the Space in the name:
type Name struct {
Space, Local string
}
Or something similar to that.
Unfortunately that breaks backwards compatibility.
So the correct way to do this is to keep the Space, so it can be used when marshaling.
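A rough sketch of what that could look like, assuming the xmlns:* declarations are kept alongside the attributes. This is not go-xmldom's code, and prefixTable, qualifiedName and rootAttrNames are names I made up. It relies on Decoder.Token reporting an xmlns:foo declaration as an attribute with Space "xmlns" and Local "foo", which lets us build a URL-to-prefix table and turn each decoded Name back into its prefix:local form.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

const sample = `<svg xmlns="http://www.w3.org/2000/svg" ` +
	`xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape" ` +
	`version="1.1" inkscape:version="0.92.1 r15371"/>`

// prefixTable (hypothetical helper) maps namespace URL -> prefix using
// the xmlns:* declarations found among the decoded attributes.
func prefixTable(attrs []xml.Attr) map[string]string {
	table := make(map[string]string)
	for _, a := range attrs {
		if a.Name.Space == "xmlns" {
			table[a.Value] = a.Name.Local
		}
	}
	return table
}

// qualifiedName (hypothetical helper) converts a decoded xml.Name back
// into the form that may legally appear in a document.
func qualifiedName(n xml.Name, table map[string]string) string {
	switch {
	case n.Space == "":
		return n.Local // unprefixed attribute, or the default xmlns declaration
	case n.Space == "xmlns":
		return "xmlns:" + n.Local // namespace declaration, keep as-is
	}
	if prefix, ok := table[n.Space]; ok {
		return prefix + ":" + n.Local
	}
	// No declaration seen for this URL; a real implementation would
	// have to generate one. The sketch falls back to the local name.
	return n.Local
}

// rootAttrNames decodes the first start element and returns its
// attribute names in writable prefix:local form.
func rootAttrNames(src string) ([]string, error) {
	dec := xml.NewDecoder(strings.NewReader(src))
	for {
		tok, err := dec.Token()
		if err != nil {
			return nil, err
		}
		se, ok := tok.(xml.StartElement)
		if !ok {
			continue
		}
		table := prefixTable(se.Attr)
		names := make([]string, 0, len(se.Attr))
		for _, a := range se.Attr {
			names = append(names, qualifiedName(a.Name, table))
		}
		return names, nil
	}
}

func main() {
	names, err := rootAttrNames(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(strings.Join(names, " "))
}
```

The sketch only looks at declarations on the element itself and ignores element names, inherited scopes, and several prefixes bound to the same URL (the last one wins), so a real fix would need a scope stack on top of this.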
Sorry for the spamming
I could not find a good way to support XPath with namespaces, so I dropped namespaces in the implementation.
I am trying to add full support for namespaces in a new branch, namespace.
https://github.com/mantyr/go-xmldom/commit/6bd17e593fab442cebac05814edb84c6994135bc
added namespaces prefix
Reading this XML file (an SVG) and then simply resaving it will write invalid characters in the CSS part.
The error that Chrome gives when trying to read this SVG:
Line 2: original
rewritten file
Seems like all prefixes are dropped from the attribute names, resulting in a duplicate version attribute: inkscape:version becomes version, which collides with the existing version attribute.
Small program to reproduce this: