Granted, register_namespace() is intended for 'well known namespace prefixes', but if the default namespace prefix ("") is registered, the following issues can occur:
Duplicate default namespace attributes (xmlns=''):
import xml.etree.ElementTree as ET
ET.register_namespace("", "default")
e = ET.Element("{default}elem")
print(ET.tostring(e, default_namespace="otherdefault"))
# b'<elem xmlns="otherdefault" xmlns="default" />'
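A quick way to check that this output is not well-formed is to feed it back to the parser. This is a minimal sketch that only uses the calls from the snippet above plus ET.fromstring(); the `round_trips` flag is an illustrative name. On the affected versions the parser is expected to reject the bytes because of the repeated xmlns attribute.

```python
import xml.etree.ElementTree as ET

# Register the empty prefix, as in the report (this mutates a global registry).
ET.register_namespace("", "default")

e = ET.Element("{default}elem")
serialized = ET.tostring(e, default_namespace="otherdefault")

# Parsing the serialized bytes back shows whether the output is well-formed XML.
try:
    ET.fromstring(serialized)
    round_trips = True
except ET.ParseError as exc:
    round_trips = False
    print("not well-formed:", exc)
```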
Incorrect serialisation (serialising noPrefixElem should raise an error, but instead it now appears to be in the default namespace):
import xml.etree.ElementTree as ET
ET.register_namespace("", "default")
e = ET.Element("{default}elem")
ET.SubElement(e, "noPrefixElem")
print(ET.tostring(e))
# b'<elem xmlns="default"><noPrefixElem /></elem>'
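The effect is easiest to see with a round trip. The sketch below serialises the same tree, parses it back, and prints the child's tag before and after; `child` and `reparsed` are illustrative names. On the affected versions the reparsed child is expected to have picked up the default namespace, although it was created without one.

```python
import xml.etree.ElementTree as ET

ET.register_namespace("", "default")

e = ET.Element("{default}elem")
child = ET.SubElement(e, "noPrefixElem")  # deliberately created in no namespace

# Serialise and parse back, then compare the child's tag before and after.
reparsed = ET.fromstring(ET.tostring(e))
print(child.tag)        # tag as built in memory
print(reparsed[0].tag)  # tag after the round trip
```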
The two approaches to address this that I can think of are:
Raise an error if the default namespace is passed to register_namespace()
Lucky users who register "" but either don't use the default prefix's URI anywhere or have every element qualified currently have working code with no issues. This option would break that working code.
Handle it properly in ElementTree._namespaces() by setting the default_namespace var in that function from the global registry if the default_namespace argument is None
Existing code that 'works' may start raising ValueError: cannot use non-qualified names with default_namespace option, but at least erroneous XML would no longer be emitted
It may be possible to make it so code that was luckily unaffected still doesn't raise an error
May result in #61290
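For comparison, the first option (rejecting the empty prefix at registration time) could look roughly like the sketch below. `strict_register_namespace` is a hypothetical wrapper written for illustration, not part of ElementTree; a real change would live inside register_namespace() itself.

```python
import xml.etree.ElementTree as ET


def strict_register_namespace(prefix: str, uri: str) -> None:
    """Hypothetical wrapper: refuse to register the default (empty) prefix."""
    if prefix == "":
        raise ValueError("the default namespace prefix cannot be registered globally")
    ET.register_namespace(prefix, uri)


# A non-empty prefix is passed through to the real function.
strict_register_namespace("ex", "http://example.com/ns")

# The empty prefix is rejected before it reaches the global registry.
try:
    strict_register_namespace("", "default")
    rejected = False
except ValueError:
    rejected = True
```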
I'm happy to give the second option a try (including any changes required for #61290), as I've also been looking into #57587, which also needs to handle potentially multiple definitions of the default_namespace. My current thinking is that if default_namespace is provided to _namespaces(), it takes precedence even if the default namespace is also defined in the global registry.
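The precedence rule described above can be sketched outside the library as follows. `effective_default_namespace` is a hypothetical helper, and it reads the private ET._namespace_map dict (assumed here to map URI to prefix), so treat it as an illustration of the idea rather than the eventual patch.

```python
import xml.etree.ElementTree as ET


def effective_default_namespace(default_namespace=None):
    """Hypothetical helper: choose the default namespace URI to serialise with.

    An explicit argument wins; otherwise fall back to the URI registered
    under the empty prefix, if there is one.
    """
    if default_namespace is not None:
        return default_namespace
    # Assumption: _namespace_map is a private dict mapping URI -> prefix.
    for uri, prefix in ET._namespace_map.items():
        if prefix == "":
            return uri
    return None


ET.register_namespace("", "default")
e = ET.Element("{default}elem")
ET.SubElement(e, "noPrefixElem")

# With the registry value passed explicitly, the unqualified child is rejected
# instead of being silently written into the default namespace.
try:
    ET.tostring(e, default_namespace=effective_default_namespace())
    error = None
except ValueError as exc:
    error = exc
print(error)
```

This only shows how the effective URI would be chosen. The duplicate xmlns case, where an explicit default_namespace differs from the registered one, would still need handling inside _namespaces().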
CPython versions tested on:
3.12, CPython main branch
Operating systems tested on:
macOS