willemdj / erlsom

XML parser for Erlang
GNU Lesser General Public License v3.0

make it work for schemas from www.w3.org #52

Closed srenatus closed 8 years ago

srenatus commented 8 years ago

While trying to use erlsom with this schema, I noticed that it doesn't support fetching a schema over http://.

After adding a clause for http://, I found that this schema (referenced from the first one) makes it choke because the server returns HTTP 500.

Reproducing the request of httpc:request() with curl:

$ curl -v http://www.w3.org/TR/2002/REC-xmldsig-core-20020212/xmldsig-core-schema.xsd -H "content-length: 0" -H "te;" -H "connection: keep-alive" -H "host: www.w3.org" -H "user-agent:"
*   Trying 128.30.52.100...
* Connected to www.w3.org (128.30.52.100) port 80 (#0)
> GET /TR/2002/REC-xmldsig-core-20020212/xmldsig-core-schema.xsd HTTP/1.1
> host: www.w3.org
> Accept: */*
> content-length: 0
> te:
> connection: keep-alive
>
* HTTP 1.0, assume close after body
< HTTP/1.0 500 Server Error
< Cache-Control: no-cache
< Connection: close
< Content-Type: text/html
<
<html><body><h1>500 Server Error</h1>
An internal server error occured.
</body></html>
* Closing connection 0
$

The relevant bit seems to be the missing User-Agent header: the server answers with HTTP 500 when the request carries none. I've thus added a user agent to the request issued by erlsom_lib:getFile/2.
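
For illustration, here is a minimal sketch of what sending a User-Agent with httpc could look like. It is not the actual change to erlsom_lib:getFile/2; the module name fetch_sketch, the function names, and the "erlsom" header value are all assumptions made up for this example.

```erlang
-module(fetch_sketch).
-export([build_request/1, fetch/1]).

%% Hypothetical User-Agent value; the issue does not say which string was used.
-define(USER_AGENT, "erlsom").

%% Build the {Url, Headers} tuple that httpc:request/4 expects for a GET.
build_request(Url) ->
    {Url, [{"user-agent", ?USER_AGENT}]}.

%% Fetch Url over HTTP and return {ok, Body} or {error, Reason}.
fetch(Url) ->
    %% httpc is part of inets, which must be running before the first request.
    {ok, _} = application:ensure_all_started(inets),
    case httpc:request(get, build_request(Url), [], [{body_format, binary}]) of
        {ok, {{_Version, 200, _Phrase}, _Headers, Body}} ->
            {ok, Body};
        {ok, {{_Version, Status, Phrase}, _Headers, _Body}} ->
            %% Surface non-200 responses (such as the 500 above) as errors.
            {error, {http_status, Status, Phrase}};
        {error, Reason} ->
            {error, Reason}
    end.
```

Calling fetch_sketch:fetch("http://www.w3.org/TR/2002/REC-xmldsig-core-20020212/xmldsig-core-schema.xsd") from the shell would then issue the same GET as the curl command above, but with a non-empty User-Agent.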

It might seem weird to add this special measure just for one server out there, but it's the W3C, and they host many XSDs.

willemdj commented 8 years ago

Hi Stephan,

Yes, that looks like a very reasonable suggestion. I'll have a closer look, one of these days, and most likely I will then accept this request.

Thanks, Willem