python / cpython

The Python programming language
https://www.python.org/

etree: Defining the default namespace prefix (`""`) with register_namespace has issues #118416

Open danifus opened 3 weeks ago

danifus commented 3 weeks ago

Bug report

Bug description:

Granted, register_namespace() is meant to be used for 'well known namespace prefixes', but if the default namespace prefix ("") is registered, the following issues can occur:

Duplicate default namespace attributes (two xmlns="..." declarations on the same element):

import xml.etree.ElementTree as ET
ET.register_namespace("", "default")
e = ET.Element("{default}elem")
print(ET.tostring(e, default_namespace="otherdefault"))
# b'<elem xmlns="otherdefault" xmlns="default" />'
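
The duplicated declaration is more than cosmetic: a start tag may not repeat an attribute name, so this output should not parse back. A minimal check, using the byte string from the output above verbatim rather than re-running the serializer (`is_well_formed` is a hypothetical helper name, not part of ElementTree):

```python
import xml.etree.ElementTree as ET

# Output copied verbatim from the example above.
serialized = b'<elem xmlns="otherdefault" xmlns="default" />'


def is_well_formed(data: bytes) -> bool:
    """Return True if ElementTree's parser accepts `data` (hypothetical helper)."""
    try:
        ET.fromstring(data)
    except ET.ParseError:
        return False
    return True


print(is_well_formed(serialized))                   # two xmlns declarations
print(is_well_formed(b'<elem xmlns="default" />'))  # a single declaration
```

The first call is expected to report a parse failure and the second to succeed, which suggests the serializer is emitting a document its own parser rejects.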

Incorrect serialisation (serialising the unqualified noPrefixElem should raise an error, but instead it is written out as if it were in the default namespace):

import xml.etree.ElementTree as ET
ET.register_namespace("", "default")
e = ET.Element("{default}elem")
ET.SubElement(e, "noPrefixElem")
print(ET.tostring(e))
# b'<elem xmlns="default"><noPrefixElem /></elem>'
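
To make the consequence concrete, here is a small sketch that parses the output shown above (copied verbatim, so it does not depend on the serializer's behaviour in any particular version) and compares the child's tag with the one the tree was built with:

```python
import xml.etree.ElementTree as ET

# The child was created WITHOUT a namespace in the example above.
original_child_tag = "noPrefixElem"

# Output copied verbatim from the example above.
serialized = b'<elem xmlns="default"><noPrefixElem /></elem>'

reparsed = ET.fromstring(serialized)
reparsed_child_tag = reparsed[0].tag

print(reparsed.tag)
print(reparsed_child_tag)
print(reparsed_child_tag == original_child_tag)
```

Because the xmlns declaration on the parent is inherited by the child, the reparsed child carries the `default` namespace, so the tree does not survive a round trip.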

The two approaches to address this that I can think of are:

I'm happy to give the second option a try (including any changes required for #61290), since I've also been looking into #57587, which likewise needs to handle potentially multiple definitions of the default namespace. My current thinking is that if default_namespace is provided to _namespaces(), it takes precedence even if a default namespace is also defined in the global registry.
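
A rough sketch of that precedence rule, kept separate from the real `_namespaces()` code: `choose_prefix` is a hypothetical helper, `registry` stands in for the global table filled by register_namespace(), and giving the displaced URI a generated `ns%d` prefix is an assumption of this sketch rather than something decided above.

```python
from typing import Dict, Optional


def choose_prefix(
    uri: str,
    namespaces: Dict[str, str],
    registry: Dict[str, str],
    default_namespace: Optional[str] = None,
) -> str:
    """Pick a prefix for `uri` and record it in `namespaces` (hypothetical helper).

    `namespaces` and `registry` both map URI -> prefix.
    """
    prefix = namespaces.get(uri)
    if prefix is None:
        prefix = registry.get(uri)
        # A registered "" prefix is ignored when an explicit default is given,
        # so at most one URI is ever bound to the default namespace.
        if prefix is None or (prefix == "" and default_namespace):
            prefix = "ns%d" % len(namespaces)  # assumption: generated fallback
        namespaces[uri] = prefix
    return prefix


registry = {"default": ""}          # as after register_namespace("", "default")
namespaces = {"otherdefault": ""}   # seeded from default_namespace="otherdefault"
print(choose_prefix("default", namespaces, registry, "otherdefault"))
print(namespaces)
```

With these inputs only `otherdefault` stays bound to the empty prefix, which would avoid the duplicate xmlns attribute from the first example.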

CPython versions tested on:

3.12, CPython main branch

Operating systems tested on:

macOS
