mojohaus / xml-maven-plugin

XML Maven Plugin
https://www.mojohaus.org/xml-maven-plugin/
Apache License 2.0

Does not follow redirects when resolving schema for catalog.xml #59

Open runeflobakk opened 3 years ago

runeflobakk commented 3 years ago

I suddenly got the following error when executing the validate goal in a build that has worked for many years:

[ERROR] Failed to execute goal org.codehaus.mojo:xml-maven-plugin:1.0.2:validate (default) on project X: While parsing projectroot/catalog.xml, at http://www.oasis-open.org/committees/entity/release/1.1/catalog.xsd, line 2,  column 35: error: s4s-elt-character: Non-whitespace characters are not allowed in schema elements other than 'xs:appinfo' and 'xs:documentation'. Saw '301 Moved Permanently'.
[ERROR] While parsing projectroot/catalog.xml, at http://www.oasis-open.org/committees/entity/release/1.1/catalog.xsd, line 4,  column 36: error: s4s-elt-character: Non-whitespace characters are not allowed in schema elements other than 'xs:appinfo' and 'xs:documentation'. Saw '301 Moved Permanently'.
[ERROR] While parsing projectroot/catalog.xml, at http://www.oasis-open.org/committees/entity/release/1.1/catalog.xsd, line 5,  column 20: error: s4s-elt-character: Non-whitespace characters are not allowed in schema elements other than 'xs:appinfo' and 'xs:documentation'. Saw 'nginx'.
[ERROR] While parsing projectroot/catalog.xml, at http://www.oasis-open.org/committees/entity/release/1.1/catalog.xsd, line 6,  column 3: fatal error: The element type "hr" must be terminated by the matching end-tag "</hr>".
[ERROR] While parsing projectroot/catalog.xml, at http://www.oasis-open.org/committees/entity/release/1.1/catalog.xsd, line 6,  column 3: fatal error: The element type "hr" must be terminated by the matching end-tag "</hr>".

This hints at an error related to redirects: "Saw '301 Moved Permanently'". Executing curl -sLD - http://www.oasis-open.org/committees/entity/release/1.1/catalog.xsd confirms that the schema URL has indeed been moved:

$ curl -sLD - http://www.oasis-open.org/committees/entity/release/1.1/catalog.xsd
HTTP/1.1 301 Moved Permanently
Server: nginx
Date: Wed, 11 Nov 2020 18:31:18 GMT
Content-Type: text/html
Content-Length: 162
Connection: keep-alive
Location: https://www.oasis-open.org/committees/entity/release/1.1/catalog.xsd

HTTP/1.1 200 OK
Server: nginx
Date: Wed, 11 Nov 2020 18:31:19 GMT
Content-Type: application/xml
Content-Length: 8744
Connection: keep-alive
Last-Modified: Wed, 11 Nov 2020 03:21:33 GMT
ETag: "17c9e0-2228-5b3cc4ac9e940"
Accept-Ranges: bytes
Strict-Transport-Security: "max-age=31536000; includeSubDomains" always
...

I have changed the URL in my catalog.xml to https, which solves the problem, but perhaps the HTTP client used internally by xml-maven-plugin should be set up to follow redirects?

jochenw commented 3 years ago

Just for the record: it appears that the http URL works again, so we don't have anything reproducible.

pzygielo commented 2 years ago

Seeing this during the test announced in "Redirecting to https on all of www.w3.org".

jochenw commented 2 years ago

@runeflobakk Could you provide the relevant snippet from your POM? Catalog files? Whatever else you are using?

runeflobakk commented 2 years ago

The error I encountered about two years ago was when building the digipost/signature-api-specification repository, which runs xml-maven-plugin:validate in the schema sub-module. The pom.xml is not big, and mainly configures the xml-maven-plugin. I was able to work around the problem by changing the schemaLocation URL, as shown in commit 69fb9a769 (where I reference this issue), to the URL that the http one previously redirected to, so that the plugin requests the new URL directly instead of relying on being redirected by the server. As you say, the URL is now working again, so reverting the commit does not break the build.

I don't think it would be controversial to configure any underlying HTTP client to follow redirects automatically. As far as I know, this is the default behavior of e.g. Apache HTTP client. An HTTP server should be allowed to use redirects without worrying about breaking clients. I am not familiar with how HTTP communication is performed internally by xml-maven-plugin, but I think following redirects would solve any similar issues, and that was my original intention when reporting this.
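To illustrate what "following redirects" could mean in practice, here is a minimal sketch using only the JDK's `java.net.HttpURLConnection`. It is an assumption that the schema is fetched through the JDK's URL handling at all; I have not checked how xml-maven-plugin does it. The names `RedirectFollower`, `open` and `MAX_REDIRECTS` are hypothetical. One relevant detail: `HttpURLConnection` follows redirects by default, but it does not follow a redirect that changes protocol (such as http to https, which is what the curl output above shows), so the sketch resolves the `Location` header itself.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;

public class RedirectFollower {
    // Hypothetical limit to avoid looping forever on a redirect cycle.
    private static final int MAX_REDIRECTS = 5;

    /** Opens the resource at the given absolute URL, following redirects manually. */
    public static InputStream open(String spec) throws IOException {
        URI uri = URI.create(spec);
        for (int i = 0; i <= MAX_REDIRECTS; i++) {
            HttpURLConnection conn = (HttpURLConnection) uri.toURL().openConnection();
            // Handle redirects ourselves so that http -> https is followed as well.
            conn.setInstanceFollowRedirects(false);
            int status = conn.getResponseCode();
            boolean redirect = status == 301 || status == 302 || status == 303
                    || status == 307 || status == 308;
            if (!redirect) {
                return conn.getInputStream();
            }
            String location = conn.getHeaderField("Location");
            conn.disconnect();
            if (location == null) {
                throw new IOException("Redirect without Location header from " + uri);
            }
            // The Location header may be relative, so resolve it against the current URL.
            uri = uri.resolve(location);
        }
        throw new IOException("Too many redirects for " + spec);
    }
}
```

A caller would use `RedirectFollower.open(url)` wherever the schema stream is needed. Whether something like this fits into the plugin's resolver is a separate question.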

I understand you prefer to reproduce the issue to verify a fix, but as you say, the old URL which originally triggered the issue for me now works, so I am not sure how I can provide an example. I guess one way to make a reproducing example would be to set up a minimal configuration where a catalog.xml is parsed and its schemaLocation is resolved against a local URL. One could set up a local web server that responds with a redirect on one path and actually serves the xsd file on another path.
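A rough sketch of such a local server, using the `com.sun.net.httpserver.HttpServer` that ships with the JDK. The class name `RedirectingSchemaServer` and the paths `/old/catalog.xsd` and `/new/catalog.xsd` are made up for illustration. Note the caveat: this redirects from http to http on the same host, which some clients (including the JDK's `HttpURLConnection` with default settings) follow automatically, so reproducing the original http to https case may additionally require a TLS endpoint.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

public class RedirectingSchemaServer {

    /** Starts a server on a free local port; the caller stops it with server.stop(0). */
    public static HttpServer start(byte[] xsd) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);

        // Old location: answers 301 with a small HTML body, similar to what nginx sends.
        server.createContext("/old/catalog.xsd", exchange -> {
            byte[] html = "<html><body><h1>301 Moved Permanently</h1></body></html>"
                    .getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().add("Location", "/new/catalog.xsd");
            exchange.getResponseHeaders().add("Content-Type", "text/html");
            exchange.sendResponseHeaders(301, html.length);
            try (OutputStream body = exchange.getResponseBody()) {
                body.write(html);
            }
        });

        // New location: actually serves the schema bytes.
        server.createContext("/new/catalog.xsd", exchange -> {
            exchange.getResponseHeaders().add("Content-Type", "application/xml");
            exchange.sendResponseHeaders(200, xsd.length);
            try (OutputStream body = exchange.getResponseBody()) {
                body.write(xsd);
            }
        });

        server.start();
        return server;
    }
}
```

With the server running, the schemaLocation in a test catalog.xml could point at `http://127.0.0.1:<port>/old/catalog.xsd`, where the port comes from `server.getAddress().getPort()`, before running the validate goal.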