GIScience / ohsome-py

Python bindings for the ohsome API
GNU General Public License v3.0

returns malformed json for elements/(geometry_type) requests #173

Closed Chwiggy closed 14 hours ago

Chwiggy commented 1 week ago

Bug Description

The elements extraction endpoint seems to return malformed JSON that triggers JSON decoding exceptions in the ohsome-py wrapper. Specifically, the JSON seems to be missing an expected delimiter deep into the response.

Expected Behaviour

The response should be valid JSON that contains all necessary delimiters.

Further Information

Error Messages, Logs, Screenshots

With the following example request:

request_ohsome = functools.partial(<bound method _OhsomePostClient.post of <OhsomeClient: https://api.ohsome.org/v1/>>, bpolys=0    POL...   POLYGON ((8.40355 49.47305, 8.40355 49.47384, ...
dtype: geometry, properties='tags', time='2024-01-01', timeout=60)
id_filter_smoothness = 'id:(way/282619573,way/32277580,way/1132948390,way/27342468,way/32277268,way/28660531,way/27753943,way/175444345)'

The first error message from ohsome-py is:

json.decoder.JSONDecodeError: Expecting ',' delimiter: line 249 column 4 (char 4536)

During handling of that exception the following exception gets raised:

ohsome.exceptions.OhsomeException: OhsomeException (413): A broken response has been received: The given query is too large in respect to the given timeout. Please use a smaller region and/or coarser time period.; "timestamp" : "2024-11-11T11:45:10.307672109"; "requestUrl" : "https://api.ohsome.org/v1/elements/geometry"; "status" : 413

tyrasd commented 3 days ago

As the filter used in the provided query is rather simple, the timeout was probably caused by exceptionally high load on the server at that time. Do you still get it when trying the query today?

malformed json

That is unfortunately a consequence of how the data extraction endpoints work: they start to "stream" the data of the elements to be extracted as soon as they are computed (because the amount of data of an extraction operation can potentially be very large). When a request then runs into a timeout, some GeoJSON data has already been sent to the user and cannot be taken back. This results in the malformed JSON. See also https://docs.ohsome.org/ohsome-api/v1/http-response-status.html#xx-success
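To illustrate the mechanism, here is a minimal sketch of what a client might see in that situation. The response body is hypothetical, modeled on the fields visible in the error message above, and `extract_trailing_error` is an illustrative helper, not ohsome-py's actual implementation.

```python
import json

# Hypothetical body: GeoJSON that was cut off mid-feature, followed by the
# error object that gets appended once the timeout is hit.
broken_body = '''{
  "type" : "FeatureCollection",
  "features" : [{
    "type" : "Feature",
    "properties" : {"@osmId" : "way/1"}
  }, {
    "type" : "Feature"
{
  "timestamp" : "2024-11-11T11:45:10.307672109",
  "status" : 413,
  "message" : "The given query is too large in respect to the given timeout.",
  "requestUrl" : "https://api.ohsome.org/v1/elements/geometry"
}'''


def extract_trailing_error(body):
    """Return the last JSON object in body that has a "status" key, or None."""
    decoder = json.JSONDecoder()
    start = body.rfind("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting exactly at `start`.
            obj, _ = decoder.raw_decode(body, start)
            if isinstance(obj, dict) and "status" in obj:
                return obj
        except json.JSONDecodeError:
            pass  # not a complete object from here, try an earlier brace
        # Move on to the previous opening brace, if any.
        start = body.rfind("{", 0, start)
    return None


try:
    json.loads(broken_body)
except json.JSONDecodeError as err:
    # Parsing the whole body fails, but the trailing error object is intact.
    error = extract_trailing_error(broken_body)
    print("decode failed:", err.msg)
    print("server reported:", error["status"], error["message"])
```

Parsing the full body fails because the truncated feature is never closed, while the appended error object can still be read on its own. This is only meant to show why both a decoding error and a meaningful status code can appear for the same response.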

That said, it's strange that you got both an unspecific JSON decoding error and a properly formatted OhsomeException. :thinking:

Which version of ohsome-py are you using? I saw that there have been some improvements regarding error handling in #164 somewhat recently in v3.3.0, maybe that already fixes this case?!

//cc @SlowMo24

SlowMo24 commented 14 hours ago

I think a proper OhsomeException was thrown (it even managed to retrieve the status code for the failure). I think the confusion arises because the error message transparently states that the error was detected "based on" the invalid JSON, and then goes on to state the correct error cause. That is the expected way to catch any issues during data streaming as described here.

If you make sure to catch and handle all OhsomeExceptions in your code, it will be fine. For more information, you can also look at the ohsome_log directory, which contains the returned data and information on how to reproduce the query.
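Since the timeout was probably caused by temporary server load, one way to handle the exception is to retry the request a few times. The following is a minimal sketch; `call_with_retry` and its parameters are hypothetical names, not part of ohsome-py, and the exception type is passed in so the helper does not depend on the library.

```python
import time


def call_with_retry(request, exc_type, retries=3, wait_seconds=5.0, **kwargs):
    """Call request(**kwargs); if exc_type is raised, wait and try again.

    Re-raises the last exception once `retries` attempts have failed.
    """
    if retries < 1:
        raise ValueError("retries must be at least 1")
    last_error = None
    for attempt in range(1, retries + 1):
        try:
            return request(**kwargs)
        except exc_type as err:
            last_error = err
            if attempt < retries:
                # Give the server some time before the next attempt.
                time.sleep(wait_seconds)
    raise last_error
```

With the objects from the report above, this could be used roughly as `call_with_retry(request_ohsome, OhsomeException, filter=id_filter_smoothness)` after `from ohsome.exceptions import OhsomeException`. Passing the ID filter through the `filter` keyword is an assumption here, as the original call is not shown in the report.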