spdx / tools

SPDX Tools

XHTML declaring itself text/html #123

Closed wking closed 6 years ago

wking commented 6 years ago

Spun off from here. We're serving the license pages as HTML:

$ curl -sI https://spdx.org/licenses/preview/ISC.html | grep Content-Type
Content-Type: text/html; charset=UTF-8

and doubling down on that with a meta http-equiv in the template (and a few sibling templates under htmlTemplate). This conflicts with the W3C recommendation that XHTML served as text/html not contain an XML declaration (which our templates do contain). I see two potential solutions:

a. Serve our XHTML as application/xhtml+xml. This works for clients that ask for XHTML, but the W3C recommends only serving it to clients who “explicitly indicate they support this media type”, so we'd need HTML versions to serve to everyone else.
b. Drop XHTML and just serve vanilla HTML5.
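
To make option (a) concrete, here is a minimal sketch of the negotiation it implies. The function name `choose_media_type` is hypothetical and not part of the SPDX tools. It treats only a literal `application/xhtml+xml` entry in the Accept header as explicit support, so wildcards such as `*/*` fall back to `text/html`, and it ignores q-values for simplicity:

```shell
#!/bin/sh
# Hypothetical helper: pick a response media type from an Accept header value.
# Assumption: "explicit support" means application/xhtml+xml is listed by name.
# Simplification: q-values (including q=0) are ignored.
choose_media_type() {
    # Normalise: lowercase, drop whitespace, and wrap in commas so that
    # every entry is delimited on both sides.
    accept=",$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | tr -d ' \t'),"
    case $accept in
        *",application/xhtml+xml,"*|*",application/xhtml+xml;"*)
            echo "application/xhtml+xml" ;;
        *)
            # No explicit XHTML entry (or no Accept header at all)
            echo "text/html" ;;
    esac
}
```

For example, `choose_media_type 'text/html,*/*;q=0.8'` selects `text/html`, while an Accept value that names `application/xhtml+xml` selects the XHTML variant. Either way, an HTML variant has to exist for the fallback branch.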

(a) is a superset of (b), since we'll need a vanilla HTML option to comply with the W3C recommendations for clients who don't explicitly indicate they support XHTML. The question is whether we want to continue to support XHTML in parallel or not. With XML data already available from license-list-XML, I don't see a point to continuing to maintain XHTML output. If that seems like a reasonable position, I can start filing PRs to transition the templates to HTML5.
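
While transitioning the templates, a quick check like the following could flag pages that still start with an XML declaration while being served as text/html. This is only a sketch: `has_xml_declaration` and `check_page` are hypothetical helper names, the check looks at the first line only, and it does not account for a leading byte-order mark.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the file begins with an XML declaration.
has_xml_declaration() {
    # The declaration, if present, must be at the very start of the document,
    # so only the first line is inspected.
    head -n 1 "$1" | grep -q '^<?xml[[:space:]]'
}

# Hypothetical helper: print "conflict" (and return 1) when a page served as
# text/html starts with an XML declaration; otherwise print "ok".
check_page() {
    content_type=$1   # value of the Content-Type response header
    file=$2           # path to the saved response body
    case $content_type in
        text/html*)
            if has_xml_declaration "$file"; then
                echo "conflict"
                return 1
            fi ;;
    esac
    echo "ok"
}
```

A page could be saved first with `curl -s -o ISC.html https://spdx.org/licenses/preview/ISC.html` and then checked with `check_page 'text/html; charset=UTF-8' ISC.html`.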

goneall commented 6 years ago

> The question is whether we want to continue to support XHTML in parallel or not.

To be honest, I'm not sure why we are using XML. I picked up support for this from someone much more knowledgeable about HTML than I am. He implemented RDFa and, I believe, added the XML. I thought the XML was required for RDFa, but a quick web search suggests that XHTML is supported but not required (see https://www.w3.org/TR/html-rdfa/).

If it turns out RDFa requires XML, we probably need to keep it for compatibility.

The RDFa was introduced years ago (before my time), and I know of a few places where it is still in use, including the SPDX tools. If we were to start over, I would prefer to move to JSON, as it is much cleaner than the embedded RDFa, but that would mean a migration of unknown impact.

If we can do (b) and support RDFa, I would vote for that option.

goneall commented 6 years ago

Moved to https://github.com/spdx/LicenseListPublisher/issues/16