epimorphics / elda

Epimorphics implementation of the Linked Data API

Bad request returns 500 server error not 400 bad request #183

Closed ijdickinson closed 7 years ago

ijdickinson commented 7 years ago

I'm looking through our Elda log files, and seeing a lot of requests to the API that are malformed, e.g.:

/elda/doc/bathing-water?
_pageSize=600&
_properties=latestSampleAssessment.sampleDateTime.inXSDDateTime
&district=http%3A%2F%2Fdata.ordnancesurvey.co.uk%2Fid%2F7000000000038340%3E%3Dtrue

This contains an ill-formed URL parameter:

&district=http://data.ordnancesurvey.co.uk/id/7000000000038340>=true

This should return status code 400 (Bad Request), but it actually returns 500 (Internal Server Error), which is also what goes in the log file. This matters because I'd like to set up some monitoring on the log file to catch actual server errors, or at least be able to grep through the logs for diagnostic purposes, but these invalid requests are false positives that just create noise.

ehedgehog commented 7 years ago

This turns out to be a problem in ValTranslate, the class that is responsible for converting query parameter value strings to RDF/SPARQL terms. In the case where that value is supposed to be a URI, it did not check that the given string was a legal URI, and the later conversion of that pseudo-URI to a SPARQL term would generate a SPARQL syntax error.
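To illustrate why the missing check surfaced as a 500, here is a minimal sketch of the status mapping being asked for. It is not Elda's code: `BadParameterException` and `StatusMapping.statusFor` are hypothetical names, and the assumption is simply that a malformed value is rejected with a dedicated exception before it can reach the query builder, so it maps to 400, while anything unexpected still maps to 500.

```java
/** Hypothetical exception for malformed client input; the name is illustrative, not Elda's. */
final class BadParameterException extends RuntimeException {
    BadParameterException(String message) {
        super(message);
    }
}

/** Hypothetical helper that turns the outcome of handling a request into an HTTP status code. */
final class StatusMapping {
    private StatusMapping() {}

    static int statusFor(Runnable requestWork) {
        try {
            requestWork.run();
            return 200;
        } catch (BadParameterException e) {
            // The client sent something malformed: report it as a client error.
            return 400;
        } catch (RuntimeException e) {
            // Anything else is a genuine server-side failure worth alerting on.
            return 500;
        }
    }
}
```

With a split like this, grepping the log for 500s would only turn up failures that were not caused by the caller's input.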

I couldn't find an easy way to check whether a string is legal as a SPARQL URI, and the Jena IRIFactory is rather stricter than SPARQL (for example, it disallows terms with unassigned schemes, e.g. eh:/something), so I took the brute force approach of running the SPARQL parser over the query "SELECT (<URI> AS ?x) WHERE {}" where URI is the URI string. This is heavyweight, but is invoked at most once per property chain in the query.
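For comparison, a lighter-weight alternative to the parser-based check is a purely lexical test against the IRIREF production in the SPARQL 1.1 grammar, which forbids the characters `<>"{}|^` plus backtick and backslash, and anything in the range U+0000 to U+0020. The sketch below is not what the fix does; `SparqlIriCheck` is a hypothetical helper, and it may disagree with the real parser in edge cases, since it only looks at characters.

```java
/** Hypothetical helper, not part of Elda: lexical check against SPARQL's IRIREF character rules. */
final class SparqlIriCheck {
    // Characters the SPARQL 1.1 IRIREF production excludes, besides U+0000..U+0020.
    private static final String FORBIDDEN = "<>\"{}|^`\\";

    private SparqlIriCheck() {}

    /**
     * Returns true if the candidate could appear between '<' and '>' in a SPARQL query.
     * Note that the grammar also allows the empty string; callers may want to reject it.
     */
    static boolean isLegalIriRef(String candidate) {
        if (candidate == null) {
            return false;
        }
        for (int i = 0; i < candidate.length(); i++) {
            char c = candidate.charAt(i);
            if (c <= 0x20 || FORBIDDEN.indexOf(c) >= 0) {
                return false;
            }
        }
        return true;
    }
}
```

Under this check `eh:/something` is accepted, while the decoded `district` value from the report is rejected because it contains `>`.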

ehedgehog commented 7 years ago

Didn't mean to close it yet. Apparently #fixes is more powerful than I realised.

ehedgehog commented 7 years ago

Closed in Autumn 2017