ICRAR / ijson

Iterative JSON parser with Pythonic interfaces
http://pypi.python.org/pypi/ijson/

Iterate over more than one prefix? #109

Open hotaru355 opened 5 months ago

hotaru355 commented 5 months ago

Description: Using the higher-level interfaces, is there a way to iterate over more than one prefix?

Detailed description: First of all, thank you very much for this great library!

I am using ijson to transform a long JSON response from a GraphQL server into Python objects. Unfortunately, if the server encounters an error, the response status code is still 200, with the response body containing a JSON error message. So the response can be in one of two formats:

# good response
{"data": [...]}

# bad response
{"errors": [...]}

  1. Is there a way to transform the data for a good response and raise an error in case of a bad response?
  2. I know I can achieve the desired behaviour using the low-level parse function. The issues I have with this approach are:

    1. It feels like I have to reverse-engineer the higher-level interface. Since I would like to process dicts, just as the high-level interfaces provide them, iterating over the parse results would require building dictionaries out of token events.
    2. I am concerned that I lose the performance benefit of running C instead of Python code.

    How bad would you say the performance loss is when building dictionaries from token events, compared to the C implementation? Is it even worth using ijson in this case?

  3. If not already requested, I would like to follow up this question with a feature request that enhances the high-level interfaces to something like:
    for prefix, item in ijson.items(f, {'earth.europe.item', 'earth.america.item'}):
        if prefix == 'earth.europe.item':
            do_something_with_european_country(item)
        elif prefix == 'earth.america.item':
            raise ValueError(f"Did not expect American country: {item}")

Why is this not clear from the documentation: The use case is not mentioned.

rtobar commented 5 months ago

@hotaru355 I'm currently just messaging to acknowledge that I've seen this message and I'm aware of it. I'm however unable to properly reply for a couple of weeks, so don't expect any action on this immediately. Thanks!

rtobar commented 4 months ago

An update, a month later: Yes, the idea of having a high-level function allowing users to obtain items from more than one prefix is something that has long been asked for, one way or another. Yesterday I released ijson 3.3.0 with the latest master, and I have a few janitorial tasks I'd like to undertake before tackling this feature, which is something I think would be slightly popular indeed.

I'll keep updating this issue whenever I have something to share, but once again don't hold your breath.
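
For reference, the low-level workaround described in the original post (building objects from parse events for several prefixes at once) could look roughly like the sketch below. This is not part of ijson: `items_multi` and `transform` are hypothetical helpers, and the sample events are hand-written in the `(prefix, event, value)` shape that `ijson.parse` yields, so the snippet runs without ijson installed. If wanted prefixes are nested inside each other, only the outermost value is yielded.

```python
SCALAR_EVENTS = {"null", "boolean", "integer", "double", "number", "string"}


def items_multi(events, prefixes):
    """Yield (prefix, value) for each complete value found at a wanted prefix.

    `events` is any iterable of (prefix, event, value) tuples shaped like the
    ones ijson.parse() produces. Hypothetical helper, not an ijson function.
    """
    wanted = frozenset(prefixes)
    current = None  # wanted prefix whose value is being assembled, if any
    stack = []      # open containers, innermost last
    keys = []       # latest map key seen for each open container

    for prefix, event, value in events:
        if current is None:
            if prefix not in wanted:
                continue
            if event in SCALAR_EVENTS:
                yield prefix, value
                continue
            current = prefix  # a map or array starts at a wanted prefix

        if event == "map_key":
            keys[-1] = value
        elif event in ("end_map", "end_array"):
            finished = stack.pop()
            keys.pop()
            if not stack:  # the outermost container is complete
                yield current, finished
                current = None
        else:
            if event == "start_map":
                new = {}
            elif event == "start_array":
                new = []
            else:
                new = value
            if stack:  # attach the new value to its enclosing container
                parent = stack[-1]
                if isinstance(parent, dict):
                    parent[keys[-1]] = new
                else:
                    parent.append(new)
            if event in ("start_map", "start_array"):
                stack.append(new)
                keys.append(None)


def transform(events):
    """Yield items under "data"; raise once an item under "errors" appears."""
    for prefix, item in items_multi(events, {"data.item", "errors.item"}):
        if prefix == "errors.item":
            raise ValueError(f"Server reported an error: {item}")
        yield item


# Hand-written events for {"data": [{"name": "France"}, {"name": "Spain"}]}.
# With ijson installed, ijson.parse(f) would supply these for a file object f.
GOOD_EVENTS = [
    ("", "start_map", None),
    ("", "map_key", "data"),
    ("data", "start_array", None),
    ("data.item", "start_map", None),
    ("data.item", "map_key", "name"),
    ("data.item.name", "string", "France"),
    ("data.item", "end_map", None),
    ("data.item", "start_map", None),
    ("data.item", "map_key", "name"),
    ("data.item.name", "string", "Spain"),
    ("data.item", "end_map", None),
    ("data", "end_array", None),
    ("", "end_map", None),
]

# Hand-written events for {"errors": [{"message": "boom"}]}.
BAD_EVENTS = [
    ("", "start_map", None),
    ("", "map_key", "errors"),
    ("errors", "start_array", None),
    ("errors.item", "start_map", None),
    ("errors.item", "map_key", "message"),
    ("errors.item.message", "string", "boom"),
    ("errors.item", "end_map", None),
    ("errors", "end_array", None),
    ("", "end_map", None),
]
```

Calling `list(transform(GOOD_EVENTS))` collects the two country dicts, while iterating `transform(BAD_EVENTS)` raises `ValueError` on the first error item. The object assembly here runs in pure Python, so the sketch does not settle the performance question raised above. ijson's README shows a comparable pattern built on `ObjectBuilder` from `ijson.common`, which would spare writing the container bookkeeping by hand.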