ietf-tools / xml2rfc

Generate RFCs and IETF drafts from document source in XML according to the IETF xml2rfc v2 and v3 vocabularies
https://ietf-tools.github.io/xml2rfc/
BSD 3-Clause "New" or "Revised" License

hangs when creating text output (very long string within <li>) #945

Open alicerusso opened 1 year ago

alicerusso commented 1 year ago

Describe the issue

Input file: https://www.ietf.org/archive/id/draft-irtf-cfrg-vrf-15.xml. Running xml2rfc --text on it hangs. I hit Ctrl-C, and here is the end of the traceback:

  File "/usr/local/lib/python3.9/site-packages/xml2rfc/utils.py", line 180, in fill
    return "\u2028".join(self.wrap(*args, **kwargs))
  File "/usr/local/lib/python3.9/site-packages/xml2rfc/utils.py", line 161, in wrap
    chunks += self._split(chunk3)
  File "/usr/lib64/python3.9/textwrap.py", line 176, in _split
    chunks = self.wordsep_re.split(text)

Side note: the appendixes have <ul> elements whose <li> items contain very long unbroken strings. In a test file (containing one of the <ul>s), changing each <li> to <li><t> avoids the hang, and the output file is generated. (I am reporting the issue even though <li> might not end up being used for this exact content in the input file.)


kesara commented 1 year ago

The algorithm needs to be improved here. xml2rfc does eventually produce the text output, but it takes a long time.
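
To put numbers on "a long time", a rough timing sketch along these lines might help. It only exercises the standard library's `textwrap.TextWrapper.wordsep_re.split` call that appears at the bottom of the traceback, not the wrapper in `xml2rfc/utils.py`, so it may well not reproduce the slowdown. The helper name `time_wordsep_split` and the input sizes are made up for illustration.

```python
import textwrap
import time


def time_wordsep_split(lengths, char="a"):
    """Time wordsep_re.split on unbroken strings of the given lengths.

    Hypothetical helper: uses the stock TextWrapper, not xml2rfc's subclass.
    Returns a list of (length, seconds, non_empty_chunk_count) tuples.
    """
    wrapper = textwrap.TextWrapper()
    results = []
    for n in lengths:
        text = char * n  # one long token with no whitespace or hyphens
        start = time.perf_counter()
        chunks = wrapper.wordsep_re.split(text)
        elapsed = time.perf_counter() - start
        # split() with a capturing group also yields empty strings; skip them
        results.append((n, elapsed, len([c for c in chunks if c])))
    return results


if __name__ == "__main__":
    # Doubling the input size makes superlinear growth easy to spot.
    for n, elapsed, n_chunks in time_wordsep_split([1_000, 2_000, 4_000, 8_000]):
        print(f"{n:>6} chars: {elapsed:.4f}s, {n_chunks} chunk(s)")
```

If the time roughly doubles with the input size, the stock regex is probably not the bottleneck, and the cost is more likely in how the xml2rfc wrapper calls `_split`. Running the same loop against the actual strings from the draft's appendixes would be the more telling test.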