steelbrain / linter

A Base Linter with Cow Powers http://steelbrain.me/linter/
MIT License

When validating an XML file with a linked schema, Linter does not follow HTTP redirects #1707

Open Conal-Tuohy opened 4 years ago

Conal-Tuohy commented 4 years ago

For example, the TEI Consortium publishes HTTP URIs for the standard TEI schemas, but resolving those URIs redirects you to an HTTPS URI, which then responds with the actual schema. The response to the HTTP URI is a "301" redirect whose message body contains HTML (saying "301 Moved Permanently").

What should happen is that the Linter should follow the 301 redirect and retrieve the actual schema.

What currently happens is that the Linter ignores the 301 response code and attempts to interpret the HTML response body as a RelaxNG schema, which naturally fails.
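
The expected behaviour described above can be illustrated with a minimal Python sketch. It is not the code of Linter or of any linter-* package. The names `fetch_schema`, `Fetch`, and `fake_fetch` are hypothetical, and the transport is assumed to be a function that returns a `(status, headers, body)` tuple with a `Location` key in the headers of a redirect. The point is only that the status code is checked before the body is handed to a schema parser.

```python
from typing import Callable, Mapping, Tuple
from urllib.parse import urljoin

# Hypothetical transport signature (an assumption, not a real Linter API):
# takes a URL and returns (status code, response headers, response body).
Response = Tuple[int, Mapping[str, str], bytes]
Fetch = Callable[[str], Response]

REDIRECT_CODES = {301, 302, 303, 307, 308}


def fetch_schema(url: str, fetch: Fetch, max_redirects: int = 5) -> bytes:
    """Return the schema body, following redirects instead of parsing their bodies."""
    for _ in range(max_redirects + 1):
        status, headers, body = fetch(url)
        if status in REDIRECT_CODES:
            location = headers.get("Location")
            if not location:
                raise ValueError(f"redirect from {url} has no Location header")
            # Location may be relative, so resolve it against the current URL.
            url = urljoin(url, location)
            continue
        if status != 200:
            raise ValueError(f"unexpected status {status} for {url}")
        return body
    raise ValueError(f"more than {max_redirects} redirects")


def fake_fetch(url: str) -> Response:
    """Stand-in transport that mimics the http -> https redirect from the report."""
    if url.startswith("http://"):
        target = "https://" + url[len("http://"):]
        return 301, {"Location": target}, b"<html>301 Moved Permanently</html>"
    return 200, {}, b'<grammar xmlns="http://relaxng.org/ns/structure/1.0"/>'


schema = fetch_schema(
    "http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_xinclude.rng",
    fake_fetch,
)
```

With the stand-in transport, `schema` holds the RelaxNG bytes from the HTTPS response rather than the HTML body of the 301. The redirect cap is a design choice to avoid looping forever on a misconfigured server.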

Example XML document (which should validate according to the linked schemas):

<?xml version="1.0" encoding="UTF-8"?>
<?xml-model href="http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_xinclude.rng" type="application/xml" schematypens="http://relaxng.org/ns/structure/1.0"?>
<?xml-model href="http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_xinclude.rng" type="application/xml"
    schematypens="http://purl.oclc.org/dsdl/schematron"?>
<TEI xmlns:xi="http://www.w3.org/2001/XInclude" xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
      <fileDesc>
         <titleStmt>
            <title>Title</title>
         </titleStmt>
         <publicationStmt>
            <p>Publication Information</p>
         </publicationStmt>
         <sourceDesc>
            <p>Information about the source</p>
         </sourceDesc>
      </fileDesc>
  </teiHeader>
  <text>
      <body>
         <p>Some text here</p>
      </body>
  </text>
</TEI>

NB: if you replace the http scheme with https in those schema URIs, validation does work.

aminya commented 3 years ago

@Conal-Tuohy This should be done in the linter-* package that you use.