Open martinthomson opened 2 years ago
I did some digging on this and it seems like this is going to be HARD. The lxml library manages HTML serialization and when you enable the pretty_print
option (as xml2rfc does, and should do), something in the creation of the updates/obsoletes element causes lxml to serialize the content of the <dd>
element on the next line:
<dd class="updates">
<a href="https://www.rfc-editor.org/rfc/rfc2119" class="eref">2119</a> (if approved)</dd>
I couldn't work out how to suppress this. It seems to be caused by there being text content in the element. A single <a> element in updates/obsoletes will render properly once you remove the line that sets a.tail = ' ', but as soon as you have two or it is a draft (where the tail is set to " (if approved)"), you have text content and lxml serializes on a new line as shown.
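For anyone not familiar with the element model involved, here is a minimal sketch of what "tail" means. It uses the standard library's xml.etree.ElementTree, whose text/tail model lxml follows, so it runs without lxml installed. It does not reproduce lxml's pretty_print line-breaking; it only shows that text following a child element is stored on that child as its tail, which is what gives the <dd> text content. The build_dd helper is hypothetical and is not xml2rfc code.

```python
import xml.etree.ElementTree as ET


def build_dd(tail):
    """Hypothetical helper: build <dd class="updates"> holding one <a>.

    `tail` is the text placed after the closing </a>, or None for no text.
    """
    dd = ET.Element("dd", {"class": "updates"})
    a = ET.SubElement(
        dd, "a",
        {"href": "https://www.rfc-editor.org/rfc/rfc2119", "class": "eref"},
    )
    a.text = "2119"
    a.tail = tail  # text after </a> belongs to <a>, not to <dd>
    return dd


draft_dd = build_dd(" (if approved)")
plain_dd = build_dd(None)

# Serialize without any pretty-printing to see the raw structure.
draft_html = ET.tostring(draft_dd, encoding="unicode")
plain_html = ET.tostring(plain_dd, encoding="unicode")

# All text inside <dd>, including the tail of its child.
draft_text = "".join(draft_dd.itertext())
plain_text = "".join(plain_dd.itertext())
```

With the tail set, the <dd> holds mixed content (an element plus loose text); with the tail cleared, it holds only the <a>. That distinction appears to be what changes lxml's behaviour in the case described above.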
I did manage to suppress the leading space on the "published" element by removing the tail on the <time> element. This turns out to be added if the original <date> element from which it was created also included trailing text, which is usually just a newline. That's counter-intuitive, but a consequence of how the conversion works, so that can be tweaked:
# Publication date
date = x.find('date')
# Drop any trailing text (usually a newline) so it is not carried over to <time>
date.tail = None
pubdate = self.render_date(None, date)
entry(dl, 'Published', pubdate)
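To illustrate why clearing the tail on <date> matters, here is a rough sketch using the standard library's xml.etree.ElementTree (same text/tail model as lxml). The to_time converter is hypothetical: it stands in for a conversion step that copies the source element's tail, which is an assumption about how render_date behaves and not its actual code.

```python
import xml.etree.ElementTree as ET


def to_time(date):
    """Hypothetical converter: build <time> from <date>, copying its tail."""
    time = ET.Element("time", {"datetime": date.get("year")})
    time.text = date.get("year")
    time.tail = date.tail  # trailing text after <date> travels along
    return time


def render(front_xml, strip_tail):
    """Render <dd class="published">, optionally clearing the tail first."""
    front = ET.fromstring(front_xml)
    date = front.find("date")
    if strip_tail:
        date.tail = None  # same tweak as in the snippet above
    dd = ET.Element("dd", {"class": "published"})
    dd.append(to_time(date))
    return ET.tostring(dd, encoding="unicode")


# The newline after <date/> is typical of hand-formatted source XML.
SOURCE = '<front><date year="2023"/>\n</front>'

with_tail = render(SOURCE, strip_tail=False)
without_tail = render(SOURCE, strip_tail=True)
```

In this sketch the newline that followed <date> in the source ends up inside the <dd> unless the tail is cleared before conversion.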
I now see id="identifiers", which clashes with document IDs:
https://github.com/ietf-wg-jsonpath/draft-ietf-jsonpath-base/issues/291
That id="identifiers" thing seems pretty serious and might be worth a different issue.
(On this issue, I've a workaround for this in styling. It is an abomination, but it does work well enough, assuming that you have CSS grid and flexbox and a few other things that shouldn't be necessary but end up being essential.)
Describe the issue
The HTML rendering of the identifiers block (<dl class="identifiers">) includes a number of plain textual items, plus a few items that use nested elements. Some of the generated <dd> elements include additional whitespace before an initial, inline child element, which is hard (or maybe impossible) to remove with styling. This leads to misalignment in rendering.

Items that include this extra whitespace are:

- <dd class="published">, which includes a <time> element as a child. (Though not <dd class="expires"> for some reason.)
- <dd class="obsoletes"> and <dd class="updates">, which include <a> elements and text content.

Can this extra space be removed?