scripting / opml.org

A repository to back up the opml.org website.
MIT License

Validator reports wrong encoding errors on official OPML examples #17

Closed: Alkarex closed this issue 5 months ago

Alkarex commented 5 months ago

Hello, the validator wrongly reports encoding errors on examples given by the OPML specification, such as http://hosting.opml.org/dave/spec/subscriptionList.opml


Example of a supposedly wrong line (which is perfectly fine, XML-compliant, and OPML-compliant):

<outline text="NYT &gt; Business"
  description="Find breaking news &amp; business news on Wall Street, media &amp; advertising, international business, banking, interest rates, the stock market, currencies &amp; funds."
  htmlUrl="http://www.nytimes.com/pages/business/index.html?partner=rssnyt"
  language="unknown" title="NYT &gt; Business" type="rss" version="RSS2"
  xmlUrl="http://www.nytimes.com/services/xml/rss/nyt/Business.xml"/>

According to the OPML specification http://opml.org/spec2.opml#1629042276000 :

Text attributes may contain encoded HTML markup

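At a quick glance, the validator's logic seems flawed: it parses the XML (which is good) but then manually checks the XML entities, forgetting that the XML parser has already decoded them. After parsing, an attribute written as `&amp;` or `&gt;` legitimately contains a raw `&` or `>`, so checking the parsed value for unescaped characters reports errors that are not there.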
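To illustrate the suspected failure mode, here is a minimal Python sketch. It is not the validator's actual code: `naive_entity_check` is a hypothetical stand-in for whatever manual check runs after parsing, and the outline is a shortened version of the example line above.

```python
import xml.etree.ElementTree as ET

# Shortened version of the outline quoted above; entities are correctly escaped.
OPML_LINE = (
    '<outline text="NYT &gt; Business" '
    'description="Find breaking news &amp; business news" type="rss"/>'
)


def parsed_attributes(xml_text):
    """Parse one element and return its attributes as the XML parser reports them."""
    return dict(ET.fromstring(xml_text).attrib)


def naive_entity_check(value):
    """Hypothetical post-parse check: flag any raw '&', '<' or '>' as an error."""
    return [ch for ch in "&<>" if ch in value]


attrs = parsed_attributes(OPML_LINE)

# The parser has already decoded the entities, so the values hold raw characters.
print(attrs["text"])         # NYT > Business
print(attrs["description"])  # Find breaking news & business news

# Running the manual check on parsed values yields false positives.
print(naive_entity_check(attrs["text"]))         # ['>']
print(naive_entity_check(attrs["description"]))  # ['&']
```

If the source were really malformed (for example a bare `&` in an attribute), `ET.fromstring` would raise `ParseError` by itself, so a second entity check on the parsed values adds nothing.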

Downstream bug report: https://github.com/FreshRSS/FreshRSS/issues/6140#issuecomment-2054202910 (it shows additional bogus reports from the validator, which I am leaving for other tickets)

scripting commented 5 months ago

@Alkarex -- Thanks for the report.

The problem should be fixed.