krischer / instaseis

Instant high-frequency seismograms from an AxiSEM database
http://instaseis.net
Other
49 stars 23 forks source link

Use HTTP status code other than 400 for "impossible" request combintations #29

Closed CTrabant closed 8 years ago

CTrabant commented 8 years ago

The scenario: bulk requests where, for example, many source-receivers are requested using a phase-relative time specification; some of the requests maybe be appropriate and some of them may not. This scenario is likely to be common, requesting synthetics for GSN stations using phase-relative times.

The IRIS middleware sends each source-receiver to an Instaseis backend independently. As the IRIS middleware does not check the phase-relatively times for validity (e.g. shadow zone), if any but the very first request to Instaseis fails there is no clean way to indicate this failure to the user client (because the HTTP header and some content has already been sent). What the user will get is truncated miniSEED or a truncated/broken ZIP archive.

There are a couple of ways to address this, such as having Instaseis handle the entire request and validate everything first (that would be a big change for IRIS middleware). Another option is to assume any 400 returned by Instaseis are this "no data for this combination" error because the IRIS middleware has already made sure the request is valid. An easier, more explicit solution would be to have Instaseis return different status codes for the two cases: 1) Bad user input, like bogus values = 400 2) Valid individual input values, but impossible combination = some other code

Currently, these are both returned as HTTP 400 and the IRIS middleware would have to parse the error text to understand which category of error this is. If differentiated the IRIS middleware could cleanly skip the 2nd category requests and continue to deliver the ones that work.

As far as an HTTP code goes the FDSN services use a 204 for the case of "valid input, but no data". Unfortunately that code does not allow returning an error message to clarify why the request was denied, which would be nice. In this case I think a 403 (Forbidden) could be used.

HTTP 403:

The server understood the request, but is refusing to fulfill it. Authorization will not help and the request SHOULD NOT be repeated. If the request method was not HEAD and the server wishes to make public why the request has not been fulfilled, it SHOULD describe the reason for the refusal in the entity. If the server does not wish to make this information available to the client, the status code 404 (Not Found) can be used instead.

Illustrative examples of the current Instaseis behavior below, both return HTTP 400.

This request:

http://dpstage:8765/seismograms?sourcelatitude=-31.030&sourcelongitude=-71.720&sourcedepthinmeters=30000&sourceforce=1e22,1e22,1e22&receiverlatitude=18.814100&receiverlongitude=98.944300&**starttime=BOGUS**

returns:

400 Parameter 'starttime' must be formatted as: 'Datetime String/Float/Phase+-Offset'

This request:

http://dpstage:8765/seismograms?sourcelatitude=-31.030&sourcelongitude=-71.720&sourcedepthinmeters=30000&sourceforce=1e22,1e22,1e22&receiverlatitude=18.814100&receiverlongitude=98.944300&**starttime=P-10**

returns:

400 No seismograms found for the given phase relative offsets. This could either be due to the chosen phase not existing for the specific source-receiver geometry or arriving too late/with too large offsets if the database is not long enough.

CTrabant commented 8 years ago

A number of small corrections were made to the original bug text, probably worth reading this in the tracker instead of the original notification.

krischer commented 8 years ago

The quickest solution is to just parse the status message - I think the only possibility for the scenario you describe to occur is with the phase relative offsets. So that is the only condition you would need to check for. Let me know if that is not the case.

The cleanest and easiest to maintain solution in my opinion would be to let Instaseis do all the error handling as you already propose. It already has a couple of thousands of lines of test code that mainly checks the various error conditions. But I understand that this is not feasible for the initial release.

What you could also do it just keep looping and collect the error messages. Instaseis should never actually extract a waveform for some invalid/impossible combination so all the error messages come (at most) at the cost of two travel time calculations which should be acceptable. If its always the same error, just return that. Otherwise return the first.

If really necessary and if it greatly ease your life I will add other HTTP error codes but I would like to avoid that as it feels a bit like a hack and the HTTP error codes are not really a fit for non-REST APIs in any case. Let me know if you still want it and I'll add it.

409 - Conflict may be an even "better" fit than 403 (ignoring the piece about the "resource"):

The request could not be completed due to a conflict with the current state of the
resource. This code is only allowed in situations where it is expected that the user
might be able to resolve the conflict and resubmit the request. The response body
SHOULD include enough information for the user to recognize the source of the
conflict. Ideally, the response entity would include enough information for the
user or user agent to fix the problem; however, that might not be possible and is
not required. 
krischer commented 8 years ago

Is this still an issue?

CTrabant commented 8 years ago

Not needed. We ended up skipping any source-receiver combinations and logging them for the user (only SACZIP allows providing the log to the user). It is still not possible to tell the difference between completely invalid input values and correct-but-not-possible, but for now the workaround is fine.