metanorma / asciidoctor-rfc

AsciiRFC: an AsciiDoc/asciidoctor backend to produce RFC XML v3 (RFC 7991) and v2 (RFC 7749)
BSD 2-Clause "Simplified" License

Fix V2 code to allow PIs and minimize diffs with davies-template-bare-06 #59

Closed ronaldtse closed 6 years ago

ronaldtse commented 6 years ago

I was going to propose our workflow to the RFC editor mailing list, and wanted to give them a good example such as the davies-template-bare-06 file.

However, xml2rfc didn't work with the generated file because it lacked XML processing instructions, and some of the formatting was different.

Therefore I made the fixes here to minimize the resulting diff.

Currently, the remaining diffs (below) are due to these issues:

We should fix all of these in separate issue tickets and include this file in the actual specs, to test against the XML/TXT outputs.

I'm using this command to generate this diff.

bundle exec bin/asciidoctor-rfc2 spec/examples/davies-template-bare-06.adoc --trace; \
  xml2rfc spec/examples/davies-template-bare-06.xml -o new.txt; \
  xml2rfc spec/examples/davies-template-bare-06.xml.orig -o old.txt; \
  diff old.txt new.txt 

Diff:

Parsing file spec/examples/davies-template-bare-06.xml
WARNING: No DTD given, defaulting to /usr/local/lib/python3.6/site-packages/xml2rfc/templates/rfc2629.dtd
Created file new.txt
Parsing file spec/examples/davies-template-bare-06.xml.orig
Created file old.txt
81c81
<    Author's Address  . . . . . . . . . . . . . . . . . . . . . . . .   8
---
>    Author's Address  . . . . . . . . . . . . . . . . . . . . . . . .   7
149,150c149,150
<      Tables use ttcol to define column headers and widths.  Every cell
<                   then has a "c" element for its content.
---
>    Tables use ttcol to define column headers and widths.  Every cell
>    then has a "c" element for its content.
152,160c152,160
<                           +----------+----------+
<                           | ttcol #1 | ttcol #2 |
<                           +----------+----------+
<                           |   c #1   |   c #2   |
<                           |   c #3   |   c #4   |
<                           |   c #5   |   c #6   |
<                           +----------+----------+
< 
<                       which is a very simple example.
---
>    +---------------------------------+---------------------------------+
>    |             ttcol #1            |             ttcol #2            |
>    +---------------------------------+---------------------------------+
>    |               c #1              |               c #2              |
>    +---------------------------------+---------------------------------+
>    |               c #3              |               c #4              |
>    +---------------------------------+---------------------------------+
>    |               c #5              |               c #6              |
>    +---------------------------------+---------------------------------+
164c164
< 
---
>    which is a very simple example.
190d189
< 
195c194
<       The quick, brown fox jumped over the lazy dog and lived to fool
---
>    o  The quick, brown fox jumped over the lazy dog and lived to fool
223a223
> 
298d297
< 
318d316
< 
332a331,332
>    [RFC5226] for a guide).  If the draft does not require IANA to do
>    anything, the section contains an explicit statement that this is the
341,342d340
<    [RFC5226] for a guide).  If the draft does not require IANA to do
<    anything, the section contains an explicit statement that this is the
385a384,385
> Author's Address
> 
397,398d396
< Author's Address
< 
447a446,447
> 
> 
ronaldtse commented 6 years ago

An additional issue is the table frame formatting.

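Asciidoctor supports the following table attributes: `frame` (`all`, `topbot`, `sides`, `none`) and `grid` (`all`, `rows`, `cols`, `none`).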

I found that this configuration works:

[cols="2*^", frame="sides", grid="cols"]
|===
|ttcol #1 |ttcol #2

|c #1 |c #2
|c #3 |c #4
|c #5 |c #6
|===

This gives the following, which is very close, except that the table spans the full page width:

+---------------------------------+---------------------------------+
|             ttcol #1            |             ttcol #2            |
+---------------------------------+---------------------------------+
|               c #1              |               c #2              |
|               c #3              |               c #4              |
|               c #5              |               c #6              |
+---------------------------------+---------------------------------+
opoudjis commented 6 years ago

I'll end up breaking this ticket up, I think, but yes, I was ignoring PIs on the conversion. Empty list format is supported, so this surprises me. I'll go through and work out what's going on.

The table preamble/postamble isn't going to be supported: there just isn't a slot for it in the AsciiDoc document model.

ronaldtse commented 6 years ago

Yes @opoudjis please do post separate issues from this PR. It would be great if you could help fix this PR and merge 😉

I don't think the table preamble/postamble is that important to support; let's ignore it for now.

opoudjis commented 6 years ago

I wanted to leave the HTML entities in the doc to prove that they are recognised, but they are in the rspec anyway, so no big deal.

ronaldtse commented 6 years ago

Hehe @opoudjis thanks for approving, but the tests are failing 😉

opoudjis commented 6 years ago

The indentation and centering of the table preamble and postamble are different because they are no longer a preamble/postamble, so they are not at the same level of embedding any more.

Tables use ttcol to define column headers and widths.  Every cell
                   then has a "c" element for its content.
....
which is a very simple example.
opoudjis commented 6 years ago

The table width discrepancy has been dealt with. The empty list discrepancy has been dealt with. The table preamble and postamble will not be dealt with. The remaining diffs are:

81c81
<    Author's Address  . . . . . . . . . . . . . . . . . . . . . . . .   8
---
>    Author's Address  . . . . . . . . . . . . . . . . . . . . . . . .   7
190d189
< 
223a223
> 
298d297
< 
318d316
< 
332a331,332
>    [RFC5226] for a guide).  If the draft does not require IANA to do
>    anything, the section contains an explicit statement that this is the
341,342d340
<    [RFC5226] for a guide).  If the draft does not require IANA to do
<    anything, the section contains an explicit statement that this is the
385a384,385
> Author's Address
> 
397,398d396
< Author's Address
< 
447a446,447
> 
> 
opoudjis commented 6 years ago

The concatenation of a paragraph to a list item was written as:

. Second, a longer list item.
+
And something that looks like a separate pararaph..

But that + only creates a line break. To generate an actual new paragraph, we need an outright empty line:

. Second, a longer list item.
+
{blank}
+
And something that looks like a separate pararaph..
opoudjis commented 6 years ago
<![CDATA[

/**** an example C program */

Here the CDATA opener precedes the comment by two line breaks, so it corresponds to this AsciiDoc:

----

/**** an example C program */
ronaldtse commented 6 years ago

Nice work @opoudjis! The text diffs are actually due to the code block, where two newlines are compressed into one.

You beat me to posting it 👍
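
For anyone re-running the comparison, here is a minimal sketch of how blank-line-only hunks like these can be separated from substantive ones. It assumes a `diff` that supports `-B` (`--ignore-blank-lines`), as GNU diff does. The two sample files are hypothetical stand-ins for old.txt and new.txt, not the real xml2rfc output.

```shell
# Hypothetical sample files standing in for old.txt / new.txt.
tmp=$(mktemp -d)
printf 'Section 1\n\nBody text\n' > "$tmp/old.txt"
printf 'Section 1\nBody text\n'   > "$tmp/new.txt"

# Plain diff reports the blank-line hunk and exits non-zero.
plain_status=0
diff "$tmp/old.txt" "$tmp/new.txt" || plain_status=$?

# -B ignores hunks that only insert or delete blank lines.
blank_status=0
diff -B "$tmp/old.txt" "$tmp/new.txt" || blank_status=$?

echo "plain=$plain_status ignore_blank=$blank_status"
rm -rf "$tmp"
```

Running `diff -B old.txt new.txt` on the real outputs in the same way should leave only the hunks that change actual text, which makes the page-break and table differences easier to spot.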

opoudjis commented 6 years ago

The only remaining discrepancies are content spilling over a page break by one line:

<      Tables use ttcol to define column headers and widths.  Every cell
<                   then has a "c" element for its content.
---
>    Tables use ttcol to define column headers and widths.  Every cell
>    then has a "c" element for its content.
160,161d159
<                       which is a very simple example.
< 
163a162,163
>    which is a very simple example.
> 
318d317
< 
332a332
>    [RFC5226] for a guide).  If the draft does not require IANA to do
341d340
<    [RFC5226] for a guide).  If the draft does not require IANA to do
391a391
>