ietf-tools / xml2rfc

Generate RFCs and IETF drafts from document source in XML according to the IETF xml2rfc v2 and v3 vocabularies
https://ietf-tools.github.io/xml2rfc/
BSD 3-Clause "New" or "Revised" License

fix: Avoid running v2v3 conversion and preptool on prepped documents #1014

Closed kesara closed 1 year ago

kesara commented 1 year ago

Fixes #1013

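This fix prevents xml2rfc from running the v2v3 conversion and preptool stages on documents that are already prepped. A prepped document is identified by the presence of the prepTime attribute on the rfc element. Skipping these stages gives a considerable performance gain: in the timings below, processing the prepped rfc9427.xml drops from about 2m16s to under 3s, while the unprepped copy takes roughly the same time as before.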
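The detection rule described above can be sketched as follows. This is a minimal illustration, not xml2rfc's actual code: it uses the standard-library `xml.etree.ElementTree` parser, the helper name `is_prepped` is hypothetical, and the sample documents and the `prepTime` value are made up for the example.

```python
import xml.etree.ElementTree as ET


def is_prepped(xml_source: str) -> bool:
    """Return True if the root <rfc> element carries a prepTime attribute.

    Hypothetical helper for illustration; the attribute value is not validated.
    """
    root = ET.fromstring(xml_source)
    # Only the root element is inspected, matching the rule in the description.
    return root.tag == "rfc" and root.get("prepTime") is not None


# Made-up sample inputs: one with prepTime, one without.
prepped = '<rfc version="3" prepTime="2023-06-01T00:00:00Z"><front/></rfc>'
unprepped = '<rfc version="3"><front/></rfc>'

print(is_prepped(prepped))
print(is_prepped(unprepped))
```

A caller could use a check like this to decide whether the conversion and preparation stages need to run at all before rendering.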

xml2rfc 3.17.4 (without fix)

root@8bc787553164:~/xml2rfc# time xml2rfc --html --text --pdf rfc9427.xml
 Created file rfc9427.txt
 Created file rfc9427.html
 Created file rfc9427.pdf

real    2m16.207s
user    2m14.799s
sys     0m0.572s
root@8bc787553164:~/xml2rfc# time xml2rfc --html --text --pdf rfc9427.notprepped.xml
 Created file rfc9427.notprepped.txt
 Created file rfc9427.notprepped.html
 Created file rfc9427.notprepped.pdf

real    0m3.785s
user    0m2.611s
sys     0m0.337s

xml2rfc (with the fix)

root@8bc787553164:~/xml2rfc# time xml2rfc --html --text --pdf rfc9427.xml
 Created file rfc9427.txt
 Created file rfc9427.html
 Created file rfc9427.pdf

real    0m2.756s
user    0m1.730s
sys     0m0.208s
root@8bc787553164:~/xml2rfc# time xml2rfc --html --text --pdf rfc9427.notprepped.xml
 Created file rfc9427.notprepped.txt
 Created file rfc9427.notprepped.html
 Created file rfc9427.notprepped.pdf

real    0m3.529s
user    0m2.533s
sys     0m0.194s

rjsparks commented 1 year ago

why did the date formats change?

kesara commented 1 year ago

> why did the date formats change?

Because I removed the --legacy-date-format option from the tests in fad6bbbdb3b751c4cc94b4117f93ce679fde2174. With the code change, the test suite had issues with the date format in HTML produced from v3 conversions. I haven't investigated this further.

kesara commented 1 year ago

> Changes look fine.
>
> Curious why this made such a huge difference in time - was there an implicit unprep step when a prepped document came in?

There are a lot of factors. My main suspicion is re-evaluating anchors and ids.