Open rhettinger opened 9 years ago
Currently, ElementTree doesn't support comments and processing instructions in the prolog. That is the typical place to put style-sheets and document type definitions.
It would be used like this:
from xml.etree.ElementTree import ElementTree, Element, Comment, ProcessingInstruction
r = Element('current_observation', version='1.0')
r.text = 'Nothing to see here. Move along.'
t = ElementTree(r)
t.append(ProcessingInstruction('xml-stylesheet', 'href="latest_ob.xsl" type="text/xsl"'))
t.append(Comment('Published at: http://w1.weather.gov/xml/current_obs/KSJC.xml'))
That creates output like this:
<?xml version='1.0' encoding='utf-8'?>
<?xml-stylesheet href="latest_ob.xsl" type="text/xsl"?>
<!--Published at: http://w1.weather.gov/xml/current_obs/KSJC.xml-->
<current_observation version="1.0">
Nothing to see here. Move along.
</current_observation>
The ElementTree class imitates or wraps many methods of the Element class. Since Element.append() and remove() already exist and act on children of the element, I think the new ElementTree methods should be named differently. Maybe something like prolog_append() and prolog_remove()? Or prologue_append() depending on your spelling preferences :P
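To make the naming idea concrete, here is a minimal sketch of what separately named methods could look like. The `PrologTree` subclass and its `_prolog` list are hypothetical and exist only for illustration; nothing like this is in the standard library, and the sketch stores the nodes without changing what `write()` emits.

```python
from xml.etree.ElementTree import (
    Comment, Element, ElementTree, ProcessingInstruction,
)


class PrologTree(ElementTree):
    """Hypothetical subclass for illustration; not part of the stdlib."""

    def __init__(self, element=None, file=None):
        super().__init__(element, file)
        self._prolog = []  # nodes meant to be written before the root element

    def prolog_append(self, node):
        # Distinct name, so it cannot be confused with Element.append(),
        # which adds a child to an element.
        self._prolog.append(node)

    def prolog_remove(self, node):
        self._prolog.remove(node)


tree = PrologTree(Element('current_observation', version='1.0'))
pi = ProcessingInstruction('xml-stylesheet', 'href="latest_ob.xsl" type="text/xsl"')
note = Comment('Published at: http://w1.weather.gov/xml/current_obs/KSJC.xml')
tree.prolog_append(pi)
tree.prolog_append(note)
tree.prolog_remove(note)  # only the processing instruction remains
```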
Also, maybe the new write() calls should add newlines.
FTR, lxml's Element class has addnext() and addprevious() methods which are commonly used for this purpose. But ET can't adopt those due to its different tree model.
I second Martin's comment that ET.append() is a misleading name. It suggests adding stuff to the end, whereas things are actually being inserted before the root element here.
I do agree, however, that this is a helpful feature and that the ElementTree class is a good place to expose it. I propose providing a "prolog" (that's the spec's spelling) property holding a list that users can fill and modify any way they wish. The serialiser would then validate that every item is valid XML prolog content, and would serialise the items in order.
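As a rough sketch of that idea, the prolog can be modelled as a plain list that is only checked at serialisation time. The helper name `serialize_with_prolog` is an assumption made for this example, not an existing ElementTree API; it relies only on `ET.tostring()`, `ET.iselement()`, `ET.Comment` and `ET.ProcessingInstruction`, which do exist.

```python
import xml.etree.ElementTree as ET


def serialize_with_prolog(root, prolog):
    """Hypothetical helper: serialise prolog items in order, then the root."""
    allowed = (ET.Comment, ET.ProcessingInstruction)
    parts = []
    for node in prolog:
        # Comment and ProcessingInstruction are factory functions; the
        # elements they create carry the factory itself as their tag.
        if not ET.iselement(node) or node.tag not in allowed:
            raise TypeError("not valid prolog content: %r" % (node,))
        parts.append(ET.tostring(node, encoding="unicode"))
    parts.append(ET.tostring(root, encoding="unicode"))
    return "".join(parts)


root = ET.Element("current_observation", version="1.0")
prolog = [
    ET.ProcessingInstruction("xml-stylesheet", 'href="latest_ob.xsl" type="text/xsl"'),
    ET.Comment("Published at: http://w1.weather.gov/xml/current_obs/KSJC.xml"),
]
xml_text = serialize_with_prolog(root, prolog)
```

The list itself stays unvalidated, so users can fill it with anything; an ordinary element or a non-element object is only rejected when the helper runs.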
My guess is that lxml would eventually use a MutableSequence here that maps changes directly to the underlying tree (and would thus validate them during modification), but ET can be more lenient, just like it allows arbitrary objects in the text and tail properties which only the serialiser rejects.
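The leniency mentioned here can be seen with today's ElementTree. The short example below uses only existing API: assigning a non-string to `text` succeeds, and the error only appears when the element is serialised.

```python
import xml.etree.ElementTree as ET

elem = ET.Element("value")
elem.text = 42  # accepted: assignment does not check the type

try:
    ET.tostring(elem)
except TypeError as exc:
    error = exc  # the serialiser refuses the non-string text
else:
    error = None
```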
Note that newlines can easily be generated on the user side by setting the tail of a PI/Comment to "\n". (The serialiser must also validate that the tail text contains only permitted whitespace.)
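A small sketch of the tail approach, using existing API for the nodes and serialisation. The `check_prolog_tail` function is hypothetical and only illustrates the kind of whitespace check a serialiser might perform; it assumes the permitted characters are space, tab, carriage return and line feed.

```python
import xml.etree.ElementTree as ET

pi = ET.ProcessingInstruction("xml-stylesheet", 'href="latest_ob.xsl" type="text/xsl"')
pi.tail = "\n"
comment = ET.Comment("Published at: http://w1.weather.gov/xml/current_obs/KSJC.xml")
comment.tail = "\n"


def check_prolog_tail(node):
    """Hypothetical check: a prolog node's tail may only hold XML whitespace."""
    tail = node.tail or ""
    if tail.strip(" \t\r\n"):
        raise ValueError("non-whitespace tail in prolog: %r" % (tail,))


for node in (pi, comment):
    check_prolog_tail(node)

# ET.tostring() writes the tail of the node it is given, so each item
# ends up on its own line.
text = "".join(ET.tostring(node, encoding="unicode") for node in (pi, comment))
```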
For reference:
This is a duplicate of 9521, but it's difficult to say which ticket is better.
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
GitHub fields:
```python
assignee = None
closed_at = None
created_at =
labels = ['expert-XML', 'type-feature']
title = 'Let ElementTree prolog include comments and processing instructions'
updated_at =
user = 'https://github.com/rhettinger'
```
bugs.python.org fields:
```python
activity =
actor = 'scoder'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['XML']
creation =
creator = 'rhettinger'
dependencies = []
files = ['39498']
hgrepos = []
issue_num = 24287
keywords = ['patch']
message_count = 4.0
messages = ['244069', '244072', '244085', '340993']
nosy_count = 4.0
nosy_names = ['rhettinger', 'scoder', 'eli.bendersky', 'martin.panter']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = '9521'
type = 'enhancement'
url = 'https://bugs.python.org/issue24287'
versions = ['Python 3.6']
```