ietf-tools / xml2rfc

Generate RFCs and IETF drafts from document source in XML according to the IETF xml2rfc v2 and v3 vocabularies
https://ietf-tools.github.io/xml2rfc/
BSD 3-Clause "New" or "Revised" License

xml2rfc v3.18.1: Even with <dl newline="false">, get new line #1045

Closed lbartholomew-rpc closed 12 months ago

lbartholomew-rpc commented 1 year ago

Describe the issue

xml2rfc v3.18.1: Even with an explicit

, get new line after the
entry. This was not an issue until 3.18.1.


lbartholomew-rpc commented 1 year ago

Sorry for whacked-out entry; shouldn't have used brackets. Should read xml2rfc v3.18.1: Even with an explicit newline="false" for the "dl" entry, you get a new line after the "dt" entry. This was not an issue until 3.18.1.

lbartholomew-rpc commented 1 year ago

Update: Issue is consistent in text output but inconsistent in HTML and PDF outputs.

cabo commented 1 year ago

Can't reproduce this.


1.1.  Terminology and Conventions

   Thing:  A physical item that is also available for interaction over a
      network.

   sdfThing:  A grouping of sdfObjects (Objects) and/or sdfThings.

   Affordance:  An element of an interface offered for interaction, for
      which information is available (directly or indirectly) that

(From a locally formatted draft. The submission interface appears to be 3.18.0, still.)

cabo commented 1 year ago

Source for this was:


<dl newline="false">
  <dt>Thing:</dt>
  <dd>
    <t>A physical item that is also available for interaction over a network.</t>
  </dd>
  <dt>sdfThing:</dt>
  <dd>
    <t>A grouping of sdfObjects (Objects) and/or sdfThings.</t>
  </dd>
  <dt>Affordance:</dt>
  <dd>
    <t>An element of an interface offered for interaction, for which
information is available (directly or indirectly) that indicates how

(I manually put in the newline="false"; also works with the default.)

Do you have source for an example where that fails?

lbartholomew-rpc commented 1 year ago

Hello.

https://www.rfc-editor.org/v3test/draft-schanzen-gns-28.xml is provided for your reference.

Please also compare (for example) Sections 2, 3, and 4.2 of https://www.rfc-editor.org/v3test/draft-schanzen-gns-28.original with:

https://www.rfc-editor.org/v3test/draft-schanzen-gns-28.txt
https://www.rfc-editor.org/v3test/draft-schanzen-gns-28.html
https://www.rfc-editor.org/v3test/draft-schanzen-gns-28.pdf

There shouldn't be any line breaks in the list items in Section 3.

Thank you.

cabo commented 1 year ago

First, I need to congratulate xml2rfc for its taste -- looks much better with the newlines (now if the right arrows in 4 could be fixed as well...).

But to the bug. I first thought the HT characters (which we don't want to have in the XML) had something to do with the problem, but the problem persisted after I removed them.

Weirdly, the newline=false is honored in the TXT for Label: and Name: -- these differ from the other ones in that they have bcp14 elements. Removing these two dt/dd doesn't fix the others. But I think this is a hint that there is some interaction (which also doesn't hit the sdf example I pointed to above). Looking a bit further...

cabo commented 1 year ago

Well, Label: and Name: are also shorter than 8 characters (but so is Zone:). The code in render_dl in text.py does not make sense to me. I think it was changed between 3.18.0 and 3.18.1. I also see changes in the test output that would mesh with the bug. The change is in commit 02253d8c, "fix(text): Preserve NBSP in dd (#1023)":

-                 and (width - len('  ') - len(text)) < len(c.text.split(None, 1)[0]))
+                 and (width - len('  ') - len(text)) < len(c.text.split(stripspace, 1)[0]))

I think this should be undone for now.

cabo commented 1 year ago

Indeed, stripspace is not a useful argument to split.

The bug that a revert would reintroduce is easy to work around (<dd>&nbsp;</dd><dd><t/></dd>).
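To illustrate the point about split, here is a minimal Python sketch. The value of STRIPSPACE is an assumption (a plausible string of ASCII whitespace characters); the actual definition of stripspace in xml2rfc is not shown in this thread. The sketch shows that str.strip() treats its argument as a set of characters, while str.split() treats it as one literal separator string.

```python
# Assumption: a stand-in for xml2rfc's stripspace; the real value is not
# shown in this thread.
STRIPSPACE = " \t\n\r\f\v"

dd_text = "The public info, an opaque byte string."

# split(None, 1) splits on any run of whitespace, so [0] is the first word.
first_word = dd_text.split(None, 1)[0]

# split(STRIPSPACE, 1) looks for the literal six-character sequence
# " \t\n\r\f\v". It never occurs, so [0] is the entire string.
unsplit = dd_text.split(STRIPSPACE, 1)[0]

# strip(STRIPSPACE) removes any of the listed characters from both ends,
# which is why the same argument is meaningful for strip but not for split.
stripped = ("  " + dd_text + "\n").strip(STRIPSPACE)

# A bare strip() also removes NBSP, while strip(STRIPSPACE) keeps it.
nbsp = "\u00a0"
bare_strip = nbsp.strip()
set_strip = nbsp.strip(STRIPSPACE)

print(repr(first_word))
print(unsplit == dd_text)
print(stripped == dd_text)
print(repr(bare_strip), repr(set_strip))
```

Under this assumption, the length of "the first word" becomes the length of the whole text, which would fit the observation that the outcome depends on how long the dd text is.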

lbartholomew-rpc commented 1 year ago

Re. "Weirdly, the newline=false is honored in the TXT for Label: and Name:" -- I get the opposite effect in the text output for draft-ietf-lamps-caa-issuemail-07 (although "label:" is lowercase and in quotes; not sure if that makes a difference). Also, shorter and longer strings don't have the newline; only "label:" has it:

https://www.rfc-editor.org/v3test/draft-ietf-lamps-caa-issuemail-07.form.txt
https://www.rfc-editor.org/v3test/draft-ietf-lamps-caa-issuemail-07.form.xml
https://www.rfc-editor.org/v3test/draft-ietf-lamps-caa-issuemail-07.form.html
https://www.rfc-editor.org/v3test/draft-ietf-lamps-caa-issuemail-07.form.pdf

cabo commented 1 year ago

Yes, what exactly happens appears to depend on how long the dd text is. Again, the current code doesn't make sense (and I'm not sure it made a lot of sense in 3.18.0, but false negatives are not a big problem and therefore not very noticeable), so a simple revert of #1023 should be all the medicine xml2rfc needs for now.
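A rough sketch of that length dependence follows. It models only the single comparison quoted in the diff above, not the rest of the condition in render_dl. The function name, the width of 69, the sample strings, and the value of STRIPSPACE are all hypothetical, and mapping `text` to the dt text and `c.text` to the dd text is likewise an assumption.

```python
# Assumption: a stand-in for xml2rfc's stripspace; the real value is not
# shown in this thread.
STRIPSPACE = " \t\n\r\f\v"


def comparison_is_true(term, dd_text, width, sep):
    """Evaluate only the comparison quoted in the diff.

    term stands in for `text` and dd_text for `c.text`; both mappings
    are assumptions made for illustration.
    """
    remaining = width - len("  ") - len(term)
    return remaining < len(dd_text.split(sep, 1)[0])


WIDTH = 69  # hypothetical available line width

short_dd = "The private client input."
long_dd = ("The public info, an opaque byte string.  "
           "Only present for POPRF test vectors.")

# 3.18.0 form: split(None, 1) compares against the first word only.
old_short = comparison_is_true('"Input":', short_dd, WIDTH, None)
old_long = comparison_is_true('"Info":', long_dd, WIDTH, None)

# 3.18.1 form: split(STRIPSPACE, 1) compares against the whole text.
new_short = comparison_is_true('"Input":', short_dd, WIDTH, STRIPSPACE)
new_long = comparison_is_true('"Info":', long_dd, WIDTH, STRIPSPACE)

print(old_short, old_long, new_short, new_long)
```

With these made-up inputs, the comparison is true only for the long text under the 3.18.1 form, which is the kind of entry that ends up with a line break after the dt.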

lbartholomew-rpc commented 1 year ago

@cabo, thank you for looking at it and for the info!

ajeanmahoney commented 12 months ago

Please escalate a fix for this. The spacing is inconsistent even in the same list, and newlines are only used with longer entries. This is appearing in the txt output only (the html and pdf are fine).

An example (only "Info" has a newline):

   "Input":  The private client input, an opaque byte string.

   "Info":
      The public info, an opaque byte string.  Only present for POPRF
      test vectors.

   "Blind":  The blind value output by Blind(), a serialized scalar of
      Ns bytes long.

   "BlindedElement":  The blinded value output by Blind(), a serialized
      element of Ne bytes long.