dajobe / raptor

Redland Raptor RDF syntax library
https://librdf.org/raptor/

Redirect URLs are not followed #65

Closed hoijui closed 9 months ago

hoijui commented 9 months ago

Many ontologies use PURLs (usually purl.org or w3id.org). In those cases, the public IRI of the ontology merely redirects to wherever the ontology content is actually hosted. It seems that raptor cannot handle that.

See, for example, curl's -L, --location option:

       -L, --location
              (HTTP) If the server reports that the requested page has moved to a different location (indicated with a
              Location: header and a 3XX response code), this option makes curl redo the request on the new place. If used
              together with -i, --include or -I, --head, headers from all requested pages are shown.

So this fails (public IRI):

rapper -i guess https://w3id.org/valueflows/ont/vf#

While this works (the actual hosting location of the IRI above):

rapper -i guess https://lab.allmende.io/valueflows/valueflows/-/raw/master/release-doc-in-process/all_vf.TTL
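
The redirect behaviour quoted from the curl man page above can be sketched roughly as follows. This is only an illustration of the general idea, not how curl or raptor implement it: `follow_redirects`, `fetch`, and `max_hops` are hypothetical names, and `fetch` stands in for whatever actually performs the HTTP request.

```python
from urllib.parse import urljoin

# Status codes treated as redirects in this sketch.
REDIRECT_CODES = {301, 302, 303, 307, 308}


def follow_redirects(url, fetch, max_hops=10):
    """Follow Location headers and return the list of (url, status) hops.

    `fetch` is a hypothetical callable: fetch(url) -> (status, headers_dict).
    """
    hops = []
    for _ in range(max_hops):
        status, headers = fetch(url)
        hops.append((url, status))
        location = headers.get("Location")
        if status not in REDIRECT_CODES or not location:
            # Final response (or a redirect without a target): stop here.
            return hops
        # A Location value may be relative; resolve it against the current URL.
        url = urljoin(url, location)
    raise RuntimeError("too many redirects")
```

Passing a fake `fetch` backed by a dictionary makes it possible to try the loop without any network access.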

dajobe commented 9 months ago

I think that website is broken:

$ curl -IL https://w3id.org/valueflows/ont/vf#
HTTP/1.1 302 Found
Date: Sun, 18 Feb 2024 00:58:03 GMT
Server: Apache/2.4.29 (Ubuntu)
Access-Control-Allow-Origin: *
Location: https://w3id.org/var/www/w3id.org/valueflows/1https://lab.allmende.io/valueflows/valueflows/-/raw/master/release-doc-in-process/all_vf.
Content-Type: text/html; charset=iso-8859-1

HTTP/1.1 404 Not Found
Date: Sun, 18 Feb 2024 00:58:03 GMT
Server: Apache/2.4.29 (Ubuntu)
Access-Control-Allow-Origin: *
Content-Type: text/html; charset=iso-8859-1
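
The two oddities in the Location header above (a second absolute URL glued into the path, and a path ending in a bare dot) can be checked mechanically. The following is a minimal sketch; `diagnose_location` is a hypothetical helper, and its two checks are heuristics chosen to match this particular output, not a general validity test.

```python
from urllib.parse import urlsplit


def diagnose_location(location):
    """Return a list of suspicious features found in a Location header value."""
    problems = []
    path = urlsplit(location).path
    # A scheme inside the path suggests two URLs were concatenated.
    if "http://" in path or "https://" in path:
        problems.append("embedded absolute URL in path")
    # A trailing dot suggests the file extension was never appended.
    if path.endswith("."):
        problems.append("path ends with '.', extension missing")
    return problems


# The Location value from the curl output above, exactly as it was returned.
BROKEN = ("https://w3id.org/var/www/w3id.org/valueflows/1"
          "https://lab.allmende.io/valueflows/valueflows/-/raw/master/"
          "release-doc-in-process/all_vf.")

print(diagnose_location(BROKEN))
```

For the working hosting URL from the first comment, the same function returns an empty list.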

dajobe commented 9 months ago

Also, raptor (via curl or whatever) does follow 30x redirects for http(s).

hoijui commented 9 months ago

Aha! Thank you @dajobe, that was gross negligence on my part, my apologies! Going over it again, I found two errors in our w3id configuration:

  1. No default format was assumed when the request has no Accept header -> the file extension was missing from the redirect target
  2. Our "default proxy" was not correctly set to "" -> the redirect target got the prefix https://w3id.org/var/www/w3id.org/valueflows/1, which should not be there
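
The two fixes listed above can be illustrated with a small sketch of the intended redirect logic. This is not the actual w3id configuration (which is not shown in this thread). `redirect_target`, `EXTENSIONS`, `DEFAULT_EXTENSION`, and the `proxy` parameter are hypothetical, and the only extension known from the thread is TTL. Quality values in the Accept header are ignored for brevity.

```python
# Hypothetical media-type-to-extension table; only "TTL" appears in this thread.
EXTENSIONS = {"text/turtle": "TTL"}
# Fix 1: fall back to a default format when no usable Accept header is sent.
DEFAULT_EXTENSION = "TTL"

# Extension-less hosting location, taken from the working URL in the first comment.
BASE = ("https://lab.allmende.io/valueflows/valueflows/-/raw/master/"
        "release-doc-in-process/all_vf")


def redirect_target(base, accept=None, proxy=""):
    """Build the redirect target for a request.

    `proxy` models the "default proxy" prefix; fix 2 is that it defaults to "".
    """
    extension = DEFAULT_EXTENSION
    for item in (accept or "").split(","):
        # Drop parameters such as ";q=0.9" and normalise the media type.
        media_type = item.split(";")[0].strip().lower()
        if media_type in EXTENSIONS:
            extension = EXTENSIONS[media_type]
            break
    return f"{proxy}{base}.{extension}"


print(redirect_target(BASE))
```

With no Accept header and an empty prefix, the result is the hosting URL that rapper could already parse directly.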