w3c / feedvalidator

W3C-customized version of the feedvalidator (forked from https://github.com/rubys/feedvalidator/)

Validator inappropriately constrains foreign markup in Atom [DuplicateElement and UndefinedElement] #73

Open pa-tna opened 2 years ago

pa-tna commented 2 years ago

Describe the bug

According to section 4.1.2 of the Atom Syndication Format RFC (RFC 4287), the atom:entry element allows zero or more "extension elements". The RFC defines "extension elements" in section 6.4 as "foreign markup", which it defines in section 6.1 as "markup from other vocabularies" (i.e. not in the Atom namespace). In section 6.4, the RFC says in full:

Atom allows foreign markup anywhere in an Atom document, except where it is explicitly forbidden. Child elements of atom:entry, atom:feed, atom:source, and Person constructs are considered Metadata elements and are described below. Child elements of Person constructs are considered to apply to the construct. The role of other foreign markup is undefined by this specification.

However, when providing an Atom feed containing an atom:entry with more than one dc:language child element, the validator reports the error:

entry contains more than one dc:language [help]

with a link to the DuplicateElement help page.

If instead the Atom feed provided contains more than one dct:language child element, the validator reports the error:

Undefined entry element: dcterms:language (2 occurrences) [help]

with a link to the UndefinedElement help page.

To Reproduce

To trigger the DuplicateElement error, paste the following feed into the validator's direct input box:

<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <id>http://example.com/data.feed</id>
  <title>Test feed</title>
  <updated>2022-02-11T15:21:00Z</updated>
  <link rel="self" href="http://example.com/data.feed"/>
  <author><name>example.com</name></author>
  <entry>
    <id>http://example.com/item/1</id>
    <title type="xhtml">
      <div xmlns="http://www.w3.org/1999/xhtml">
        <span xml:lang="en">Test title</span> / 
        <span xml:lang="cy">Teitl enghreifftiol</span>
      </div>
    </title>
    <summary type="xhtml">
      <div xmlns="http://www.w3.org/1999/xhtml">
        <span xml:lang="en">Summary</span> / 
        <span xml:lang="cy">Crynodeb</span>
      </div>
    </summary>
    <link rel="alternate" href="http://example.com/item/1.json" type="application/json"/>
    <link rel="alternate" href="http://example.com/item/1.xml" type="application/xml"/>
    <updated>2022-02-11T15:22:00Z</updated>
    <dc:language>en</dc:language>
    <dc:language>cy</dc:language>
  </entry>
</feed>

To trigger the UndefinedElement error, delete both instances of dc:language in the above XML and substitute the XML below:

    <dct:language>en</dct:language>
    <dct:language>cy</dct:language>

Expected behavior

The feed validates if an atom:entry contains two or more dc:language or dct:language elements.
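To make the namespace argument concrete, here is a minimal sketch (not the validator's code) that lists the children of atom:entry that fall outside the Atom namespace, which is how sections 6.1 and 6.4 of the RFC describe foreign markup. The feed is an abbreviated version of the one in this report with both prefixes declared; the names `foreign_children`, `FEED` and `ATOM_NS` are hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"

# Abbreviated version of the feed from this report, with both the dc and
# dct prefixes declared so that all four language elements are well-formed.
FEED = """<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:dc="http://purl.org/dc/elements/1.1/"
      xmlns:dct="http://purl.org/dc/terms/">
  <id>http://example.com/data.feed</id>
  <title>Test feed</title>
  <updated>2022-02-11T15:21:00Z</updated>
  <entry>
    <id>http://example.com/item/1</id>
    <title>Test title</title>
    <updated>2022-02-11T15:22:00Z</updated>
    <dc:language>en</dc:language>
    <dc:language>cy</dc:language>
    <dct:language>en</dct:language>
    <dct:language>cy</dct:language>
  </entry>
</feed>
"""


def foreign_children(element):
    """Return (namespace, local name, text) for each child outside the Atom namespace."""
    found = []
    for child in element:
        # ElementTree spells qualified names as "{namespace}local".
        if child.tag.startswith("{"):
            namespace, local = child.tag[1:].split("}", 1)
        else:
            namespace, local = "", child.tag
        if namespace != ATOM_NS:
            found.append((namespace, local, child.text))
    return found


root = ET.fromstring(FEED.encode("utf-8"))
entry = root.find("{%s}entry" % ATOM_NS)
foreign = foreign_children(entry)

for namespace, local, text in foreign:
    print(namespace, local, text)
```

All four language elements are reported as foreign, repeated or not. Under the reading of the RFC given above, none of them should cause a validation error on the grounds of being duplicated or undefined.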

pa-tna commented 2 years ago

(I missed out the xmlns:dct="http://purl.org/dc/terms/" namespace declaration in the above example, but with it present the problem above still occurs)
