johnezang / JSONKit

Objective-C JSON
6.21k stars 1.66k forks

Parsing JSON streams #30

Open scelis opened 13 years ago

scelis commented 13 years ago

What are your thoughts on adding support for parsing streams of JSON-encoded data? Would there be enough of a speed gain to warrant this type of parsing for webservices that return large amounts of data?

Thank you so much for this project. So far it has been both easy to use and incredibly fast.

johnezang commented 13 years ago

For many, possibly the majority, of use cases, I don't really see that adding streaming decoding would improve either speed (in terms of total time spent decoding), or memory consumption.

There is one use case where streaming can potentially be a benefit, though:

In other words, some kind of UI display of the JSON that is being received, where the time it would take for "all of the JSON" to be received exceeds some user-detectable threshold, and the JSON data is in a form where you can immediately begin displaying something to the user with the first bits of data received, which would hopefully give the user something to look at / use while continuing to load the rest of the data in the background.

Off the top of my head, I'm not aware of too many JSON web services that really fit that criteria, though I would be very interested in hearing about any that do, and it would most likely motivate me to actually implement it. But since most JSON web services I'm aware of typically generate < 50K-100K (uncompressed) of JSON, the time it takes to download it for "most users" is a relatively small percentage of where the app spends its time. And, if you're smart, the web server on the far end will gzip the data before it sends it over the network, and JSON tends to be highly redundant, so it compresses well- I commonly see a 5x-6x reduction, which means that only 16K-20K needs to be sent over the network to transfer 100K of JSON. Even for clients who are on "low bandwidth" connections, such as some 3G wireless network users, it doesn't take long to transfer 16K-20K.

So while I do think that such a feature would be genuinely useful to some users of JSONKit, I also think that they are in the minority.

There is also the issue of how to structure the API that makes the partially parsed JSON results available- I happen to really dislike "event driven" or "SAX" style parsers- you get called for every little thing like "I just parsed an Integer!" Every time I've been stuck with one of those types of APIs and needed to make use of / display the results immediately, it has required me to keep track of a lot of state about "where things were at", where "this thing" fit in to that current picture of the parsing state, and how "this thing" changed the current state so the "next thing" could be properly understood. All of this so I could stitch each tiny partial result back into its whole, and once I had stitched together a "useable chunk", then I could actually do something with it like update the GUI. Not only is doing things that way a total PITA, it's extremely error prone.
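To make the pain concrete, here is roughly the shape of callbacks a SAX-style parser hands you, and the bookkeeping you're left to do yourself- the protocol and method names below are invented for illustration, not JSONKit API or any particular parser:

#import <Foundation/Foundation.h>

@protocol HypotheticalSAXJSONParserDelegate <NSObject>
// To turn these piecemeal events back into something displayable, the delegate
// has to maintain its own stack of partially built containers, plus a matching
// stack of "which key does the value I'm currently building belong to".
- (void)parserDidStartObject;            // push a new NSMutableDictionary
- (void)parserDidStartArray;             // push a new NSMutableArray
- (void)parserFoundKey:(NSString *)key;  // remember the key until its value arrives
- (void)parserFoundValue:(id)value;      // work out which container (and which key) this belongs to
- (void)parserDidEndContainer;           // pop, stitch the finished container back into its parent,
                                         // and only then decide whether a "useable chunk" exists for the GUI
@end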

scelis commented 13 years ago

Thank you for your thoughtful reply. I think it might make most sense if I describe my current situation.

We are using JSONKit in the Expedia Hotels iPhone application (check out the "about this app" screen; we throw you some thanks). We have chosen to forgo traditional search result pagination for an enhanced user experience once the results are completely returned. This means being able to sort and filter the results completely in-memory, as well as being able to jump from the most expensive to the least expensive hotels without going back to the server fifteen times.

Unfortunately, this means that it can take up to 15 horrendous seconds (over 3G) to perform a search focused around a dense area. For example, take the worst-case scenario of New York City. When I perform a search for 1 guest looking for a room for tonight, I get the following results:

372 hotels
256 KB (compressed)
3422 KB (uncompressed)

We have been evaluating how we can improve the user experience with respect to slow searches. One possibility is to use a streaming parsing solution. Now, I definitely think a large part of this particular problem is our particular web server, as we only start getting the data back 3-ish seconds before the data finishes transferring. But that does mean we could start to display results to the user three seconds earlier if we stick with the one-request route.

I can also see similar scenarios playing out with other applications in our space. For example, in our FlightBoard application we are retrieving up to a few hundred flights at once. We will most likely have similar concerns when and if we do flight and car rental bookings.

I definitely agree with your implementation concerns. SAX-style parsers are a pain to use. I also agree with you that array-style results are the ones that will benefit most from streamed decoding (with an occasional solitary value here and there). The problem is that the array might not be the top-level object, depending on the web service. For example, the top-level object might be a dictionary containing a few key/value pairs, one of which is the array. Or it might be three levels deep in some one-element arrays (yeah, contrived, but I have seen really poor JSON schemas in the past). One idea I had would be to specify which object(s) you want using keyPath constructions. For example, any of the following would be acceptable keyPaths:

@"someDictionary.sizeAsInteger" (an integer, notify caller when it has parsed)
@"objects[0].list" (an array, notify caller for each item in the array)
@"[0].someArray[1].humongousArray" (an array, notify caller for each item in the array)

Ideally, the caller would be able to specify multiple keyPaths and then be notified when an object in any of the keyPaths has finished parsing (or when parsing has totally finished). This would be much less error-prone and require a lot less state-management. It would also seem to me to be a cleaner, simpler API, though harder to implement.

johnezang commented 13 years ago

Well, you definitely fall into the class of JSONKit users who would benefit from being able to get the parsed and decoded results while the stream is still being received. :)

As for JSON keyPaths, the thought has occurred to me. That's one of the reasons why I haven't implemented Call backs for Deserializing yet- ideally I would like to have some way where you could specify the JSON keyPaths that you'd like to be able to do "deserialization substitution and swizzling magic" on. But how to do that part "right" isn't obvious, though JSONPath seems to be a good starting point.

For getting streaming results, though, I don't think anything too terribly complicated is required. I think the following covers most of the cases where being able to get streaming results is the most useful:

  • The results that you are interested in are in an [ ... ] array somewhere in the JSON. That is to say, there's one particular keyPath that ends in an array, and it represents the bulk of the information you're interested in, and the results come one right after another.
  • You're really only interested in the results in that one particular array, and you're only interested in getting a parsed result when that item in the array has been completely parsed. In other words, you're interested in getting the result as soon as the parser hits the , separating that result from the next one.

Does that seem reasonable? If so, I'm sort of thinking along the lines of:

  1. You instantiate a JSONDecoder object.
  2. You set a streaming delegate for the decoder.
  3. You "do some stuff" to wire up the decoder so it gets the JSON data in chunks as it's received.

So far, so good. What are you using to get the JSON data from the server asynchronously? The stock NSURL* stuff, or something like ASIHTTPRequest?

For the interface between the JSONDecoder and your delegate object, I'm thinking something along the lines of:

-(BOOL)jsonDecoder:(JSONDecoder *)decoder shouldStreamResultsForArrayAtJSONPath:(NSString *)jsonPath;

The idea is that every time the decoder sees a [, it will ask the delegate "Do you want me to stream you the objects in this JSON Array that has a JSON keyPath of $.longListOfResults?" That keyPath is just an example, but there would clearly be some kind of "obvious" JSON keyPath that you could use to select which part of the JSON you wanted to get streaming results for. You would return either a YES or NO depending on whether that was the keyPath you wanted.

Once you say YES, then as each item in that array is parsed, the decoder would give you the decoded results:

-(void)jsonDecoder:(JSONDecoder *)decoder didParseResult:(id)jsonResult atIndex:(NSUInteger)jsonArrayIndex;

Again, I'm making this up as I type it, so... :) Once the decoder has started streaming results to you, it will no longer ask you shouldStreamResultsForArrayAtJSONPath:- it will continue to give you completed results until that JSON Array finishes with a ]. But once that array has been parsed, the decoder goes back to asking you shouldStreamResultsForArrayAtJSONPath: every time it sees another [ in the JSON being streamed.

That's pretty much it. It's obviously tailored around the assumption that the JSON that will be streamed will contain a [ ... ] array of results, and that there is some "base" or "root" array that can be used to cleanly divide up the incoming stream of JSON into useable chunks.
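For example, a delegate for a hotel-results style response might end up looking roughly like this- the HotelSearchController class, its hotels array, its tableView, and the $.longListOfResults path are all placeholders, and none of this is shipping JSONKit API:

#import <UIKit/UIKit.h>
#import "JSONKit.h"

@interface HotelSearchController : UIViewController
@property (nonatomic, strong) NSMutableArray *hotels;     // decoded results accumulated so far
@property (nonatomic, strong) UITableView    *tableView;  // UI that can update as results stream in
@end

@implementation HotelSearchController

// Called every time the decoder sees a '[': opt in only for the array of results.
- (BOOL)jsonDecoder:(JSONDecoder *)decoder shouldStreamResultsForArrayAtJSONPath:(NSString *)jsonPath
{
    return [jsonPath isEqualToString:@"$.longListOfResults"];
}

// Called once per completed element of that array, i.e. as soon as the parser
// reaches the ',' (or the closing ']') after the element.
- (void)jsonDecoder:(JSONDecoder *)decoder didParseResult:(id)jsonResult atIndex:(NSUInteger)jsonArrayIndex
{
    [self.hotels addObject:jsonResult];   // jsonResult is already a fully decoded object
    [self.tableView reloadData];          // the UI can start showing rows before the stream finishes
}

@end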

Thoughts?

scelis commented 13 years ago
  • The results that you are interested in are in an [ ... ] array somewhere in the JSON. That is to say, there's one particular keyPath that ends in an array, and it represents the bulk of the information you're interested in, and the results come one right after another.
  • You're really only interested in the results in that one particular array, and you're only interested in getting a parsed result when that item in the array has been completely parsed. In other words, you're interested in getting the result as soon as the parser hits the , separating that result from the next one.

I would disagree with the assumption that there would only ever be one piece of information in the results that a user would be interested in, and that this one piece of information would always be an array. I could definitely envision a scenario where there are N "interesting" arrays in the JSON response. In addition, I could imagine a response that has a couple arrays that are "interesting", as well as a couple high-level strings, dates, or numbers. For example:

{
    "sessionKey" : "1468270974",
    "numRestaurants" : 250,
    "numHotels" : 200,
    "restaurants" : [ ... ],
    "hotels" : [ ... ]
}

I would argue that each piece of information above is important, and could be used in an application before the result is fully parsed, especially when small values like sessionKey, numRestaurants, and numHotels come earlier in the response than the large arrays.

So to me, it would make sense to generalize the streamed decoding support to work for all JSON types including NSArray, NSDictionary, NSNumber, and NSString, and for a consumer of the API to be able to pass in N JSON keyPaths that they wish to consume.

  1. You instantiate a JSONDecoder object.
  2. You set a streaming delegate for the decoder.
  3. You "do some stuff" to wire up the decoder so it gets the JSON data in chunks as it's received.

That makes sense to me! Though I would add an additional step between (2) and (3) to help accommodate my points above: a step where you tell the decoder which JSON keyPaths you are interested in.

For example, the JSONDecoder object could have the following method:

- (void)setInterestingJSONPaths:(NSArray *)paths;

This would actually make the delegate protocol simpler as it would obviate the need for the jsonDecoder:shouldStreamResultsForArrayAtJSONPath: method you mentioned. One less delegate method is always good in my mind, especially if it's a method that could be called many times during parsing and thus slow down the parsing process in general.

And then, this would change your other delegate method from:

- (void)jsonDecoder:(JSONDecoder *)decoder didParseResult:(id)jsonResult atIndex:(NSUInteger)jsonArrayIndex;

to:

- (void)jsonDecoder:(JSONDecoder *)decoder didParseResult:(id)jsonResult atJSONPath:(NSString *)path;

This method would only be called for "interesting" keyPaths. Also, I don't actually think the index parameter is necessary for arrays since the consumer will likely just be appending this result to an NSMutableArray, anyway. And if they are doing something different they can always keep a counter around themselves.
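Putting that together with the restaurants/hotels example above, I could imagine usage looking something like the following- setInterestingJSONPaths:, setStreamingDelegate:, the delegate method, the exact path syntax, and the _restaurants / _hotels / _sessionKey ivars are all hypothetical at this point:

// Hypothetical setup: register the keyPaths you care about, then set the delegate.
- (void)prepareDecoder
{
    _jsonDecoder = [JSONDecoder decoder];
    [_jsonDecoder setInterestingJSONPaths:[NSArray arrayWithObjects:
        @"sessionKey", @"numRestaurants", @"numHotels",
        @"restaurants", @"hotels", nil]];
    [_jsonDecoder setStreamingDelegate:self];
}

// Called only for the "interesting" keyPaths; an array path would presumably
// deliver one fully decoded element at a time as the stream comes in.
- (void)jsonDecoder:(JSONDecoder *)decoder didParseResult:(id)jsonResult atJSONPath:(NSString *)path
{
    if ([path isEqualToString:@"restaurants"])     { [_restaurants addObject:jsonResult]; }
    else if ([path isEqualToString:@"hotels"])     { [_hotels addObject:jsonResult]; }
    else if ([path isEqualToString:@"sessionKey"]) { _sessionKey = [jsonResult copy]; }
    // numRestaurants / numHotels arrive early and could drive a progress indicator.
}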

So far, so good. What are you using to get the JSON data from the server asynchronously? The stock NSURL* stuff, or something like ASIHTTPRequest?

I am using stock NSURLRequest and NSURLConnection objects. So I would envision calling something like the following JSONDecoder method over and over again in the NSURLConnection callbacks:

- (void)connection:(NSURLConnection *)connection didReceiveData:(NSData *)data
{
    [_jsonDecoder decodeAdditionalData:data];
}
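And presumably the rest of the NSURLConnection wiring would look roughly like this, where finishDecoding is another made-up JSONDecoder method standing in for "flush whatever is still buffered and report completion":

- (void)connectionDidFinishLoading:(NSURLConnection *)connection
{
    [_jsonDecoder finishDecoding];   // hypothetical: deliver any trailing values and the final result
}

- (void)connection:(NSURLConnection *)connection didFailWithError:(NSError *)error
{
    // Decide whether to keep the results that have already streamed in, or throw them away.
}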

Thoughts? :)

johnezang commented 13 years ago

Don't get me wrong, I agree with a lot of what you're saying. In fact, I'd probably say that what you're suggesting is somewhat close to "ideal". Not only that, but everything discussed so far is practically identical to the issues and needs for doing good Call backs for Deserializing- you probably want to be able to specify a JSON keyPath to some part of the JSON that has a date in the form of /Date(###)/, have JSONKit do a callback to you when it sees that JSON keyPath, allow you to inspect the parsed value, and return a different object, say an NSDate.

The reason why I haven't done that for callbacks yet is the same problem I have here: Doing all that cool JSON keyPath matching stuff isn't straightforward. Well, straightforward isn't quite the right word- being able to do it extremely efficiently and extremely fast (where fast == O(1)) isn't straightforward.

There are also issues with what the JSON path syntax should look like. JSONPath looks reasonable, but which parts should be implemented? JSONPath allows you to do things like $..book[?(@.price<10)], which says "All books that have a price < 10". I could see how that'd be useful, but it makes things much more complex.

Even a simple hierarchical path is fairly complicated to do quickly- most of the time, every single {"key": value} key that is parsed will have to be checked to see if it matches any of the key paths of interest. This probably means I need to come up with a complicated multi-level, multi-way hash-tree-like data structure, and all of the "JSON keyPaths of interest" would have to be parsed and entered into this data structure. Every key that appeared in the JSON would then have to be checked against this data structure. In order to do it efficiently, the JSON path built up so far would be kept and used to select only the relevant parts of the data structure that need to be checked to see if this currentPath + key matches. Ideally, all of this will be ~O(1) per lookup operation.
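A minimal sketch of that idea, ignoring array indexes and the fancier JSONPath features, with all of the names invented for illustration (nothing like this exists in JSONKit):

#import <Foundation/Foundation.h>

static NSString * const kInterestingMarker = @"__interesting__";

// Build a trie of dictionaries from the registered keyPaths, e.g. @"results.hotels".
static NSMutableDictionary *PathTrieFromKeyPaths(NSArray *keyPaths)
{
    NSMutableDictionary *root = [NSMutableDictionary dictionary];
    for (NSString *keyPath in keyPaths) {
        NSMutableDictionary *node = root;
        for (NSString *component in [keyPath componentsSeparatedByString:@"."]) {
            NSMutableDictionary *child = [node objectForKey:component];
            if (child == nil) {
                child = [NSMutableDictionary dictionary];
                [node setObject:child forKey:component];
            }
            node = child;
        }
        [node setObject:[NSNumber numberWithBool:YES] forKey:kInterestingMarker];
    }
    return root;
}

// During parsing, the decoder keeps a stack of trie nodes that mirrors the JSON
// nesting: descend into the matching child node when a {"key": ...} key is seen,
// pop when the container closes. Each check is then a single hash lookup, ~O(1) per key.
static BOOL KeyIsInteresting(NSDictionary *currentNode, NSString *key)
{
    NSDictionary *child = [currentNode objectForKey:key];
    return (child != nil && [child objectForKey:kInterestingMarker] != nil);
}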

"It's complicated." It's also hard. Which is why I haven't gotten around to it. :)