resyncgg / json-stream

A library to parse Newline Delimited JSON values from a byte stream
Apache License 2.0

clarify use cases and tests for input data #1

Open tychoish opened 6 months ago

tychoish commented 6 months ago

I've been looking at this and related libraries for processing some JSON stream data, and I was explicitly looking for a tool that would handle JSON that is not newline-delimited. While this library says that it's for processing newline-delimited JSON, there isn't any code that handles newline separation, and I believe from looking at the implementation that it should handle streams of multi-line JSON documents just fine.

It might be nice to add a few test cases and tweak some of the documentation to reflect this. Would you all be interested in this?

d0nutptr commented 6 months ago

Hey! I'm interested for sure but I'm pretty busy for the next week or so. I'll try to get back to this then :)

tychoish commented 6 months ago

I did some poking and testing at this over the weekend. Even though this seems like it should work (given the deserializer mode that we're using), it doesn't seem to work in my (high-level) tests. The really confusing wording in the documentation is:

The data can consist of any JSON value. Values need to be a self-delineating value e.g. arrays, objects, or strings, or be followed by whitespace or a self-delineating value.

That seems to cover both newline-delimited and other concatenated values. 🤷🏻 My tests were pretty high level, so some unit tests might be good. I'll poke at that next as I have time.
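
To make the quoted wording concrete, here is a minimal sketch of the behavior it describes, written in Python with only the standard library. It is not this library's implementation and says nothing about how this library actually behaves; the helper name `iter_json_values` and the sample inputs are made up for illustration. It reads one JSON value after another from a string, treating whitespace (including newlines) between values as optional padding.

```python
import json

JSON_WHITESPACE = " \t\n\r"


def iter_json_values(text):
    """Yield successive JSON values from a string of concatenated JSON.

    Hypothetical helper for illustration only. Values may be separated by
    any amount of JSON whitespace, or by nothing at all when they are
    self-delineating (objects, arrays, strings).
    """
    decoder = json.JSONDecoder()
    pos = 0
    end = len(text)
    while True:
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < end and text[pos] in JSON_WHITESPACE:
            pos += 1
        if pos >= end:
            return
        # raw_decode returns the parsed value and the index just past it.
        value, pos = decoder.raw_decode(text, pos)
        yield value


# Newline-delimited input: one document per line.
newline_delimited = '{"a": 1}\n{"a": 2}\n'

# Multi-line (pretty-printed) documents, one after another.
multi_line = '{\n  "a": 1\n}\n{\n  "a": 2\n}'

# Self-delineating values with no separator at all.
concatenated = '{"a": 1}{"a": 2}[3]"four"'

for sample in (newline_delimited, multi_line, concatenated):
    print(list(iter_json_values(sample)))
```

Under this reading, numbers are the case that needs a separator: `1 2` yields two values, while `12` can only be read as a single value, which matches the "or be followed by whitespace" clause in the quoted text. Test cases along these lines (newline-delimited, multi-line, and unseparated input) would pin down which of them are actually supported.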

tychoish commented 6 months ago

Ok, I posted that comment and #5 does resolve this.