ietf-tools / xml2rfc

Generate RFCs and IETF drafts from document source in XML according to the IETF xml2rfc v2 and v3 vocabularies
https://ietf-tools.github.io/xml2rfc/
BSD 3-Clause "New" or "Revised" License

lxml==5.0.0 generates an error for xml:space in spanx with rfc2629.dtd #1070

Open kesara opened 9 months ago

kesara commented 9 months ago

Describe the issue

With lxml release 5.0.0, xml2rfc generates an error when the xml:space attribute is used in spanx (deprecated) and <!DOCTYPE rfc SYSTEM "rfc2629.dtd" [ ]> is declared. This causes the xml2rfc test suite to fail with lxml==5.0.0. The xml:space attribute in artwork, however, does not generate an error.

Example:

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd" [
]>
<rfc ipr="trust200902" docName="draft-rathnayake-xmlspace-tests-00" category="exp" tocInclude="true" sortRefs="true" symRefs="true" submissionType="independent">
  <front>
    <title abbrev="xmlspace-tests">xml:space tests</title>

    <author initials="K." surname="Nanayakkara Rathnayake">
      <organization></organization>
      <address>
      </address>
    </author>

    <date year="2024" month="January" day="5"/>

    <abstract>
      <t>Test rendering xml:space attribute</t>
    </abstract>

  </front>

  <middle>
    <section anchor="spanx">
      <name>spanx (deprecated)</name>
      <t>
        <spanx xml:space="preserve">spanx</spanx>
      </t>
    </section>
    <section anchor="artwork">
      <name>artwork</name>
      <t>
        <artwork xml:space="preserve">
                        ¯\_(ツ)_/¯
        </artwork>
      </t>
    </section>
  </middle>

  <back>
  </back>
</rfc>

xml2rfc output:

...
Warning: /usr/local/lib/python3.10/dist-packages/xml2rfc/templates/rfc2629-other.ent is no longer needed as the special processing of non-ASCII characters has been superseded by direct support for non-ASCII characters in RFCXML.
draft-rathnayake-xmlspace-00.xml(30): Error: Invalid attribute space for element em, at /rfc/middle/section[1]/t/em
/root/xml2rfc/draft-rathnayake-xmlspace-00.xml(4): Error: Invalid document before running preptool.
Unable to complete processing draft-rathnayake-xmlspace-00.xml

kesara commented 9 months ago

The root cause appears to be that, when a DTD is present, lxml adds duplicate namespace attribute entries.

lxml bug: https://bugs.launchpad.net/lxml/+bug/2048693
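
One way to check a given environment for this is to parse the same small document with the standard library parser and with lxml, then compare the attributes reported on spanx. The sketch below is only an illustration: the inline DTD is a hypothetical stand-in for rfc2629.dtd (it is assumed, not confirmed here, that declaring xml:space on the element is what matters), and `spanx_attributes` is a helper name made up for this example. It makes no claim about what a particular lxml version will print.

```python
import xml.etree.ElementTree as stdlib_et

XML_NS = "http://www.w3.org/XML/1998/namespace"

# Hypothetical minimal document: the inline DTD stands in for rfc2629.dtd
# and declares xml:space on spanx (an assumption made for this sketch).
DOC = b"""<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE t [
  <!ELEMENT t (spanx)>
  <!ELEMENT spanx (#PCDATA)>
  <!ATTLIST spanx xml:space (default|preserve) "preserve">
]>
<t><spanx xml:space="preserve">spanx</spanx></t>
"""


def spanx_attributes(etree_module, **parser_kwargs):
    """Parse DOC with the given ElementTree-compatible module and
    return the (name, value) attribute pairs found on <spanx>."""
    parser = etree_module.XMLParser(**parser_kwargs)
    root = etree_module.fromstring(DOC, parser)
    return list(root.find("spanx").items())


# The stdlib parser serves as the reference result.
baseline = spanx_attributes(stdlib_et)
print("stdlib:", baseline)

# lxml is optional here; compare against the reference when it is installed.
try:
    from lxml import etree as lxml_et
except ImportError:
    lxml_et = None

if lxml_et is not None:
    observed = spanx_attributes(lxml_et, load_dtd=True)
    print("lxml", lxml_et.LXML_VERSION, observed)
    print("matches stdlib:", sorted(observed) == sorted(baseline))
```

Running this under both lxml 4.x and 5.0.0 and comparing the "matches stdlib" line should show whether the attribute list on spanx differs between the two releases.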