**Open** — 1plaintext opened this issue 3 months ago
Hi @1plaintext,
This is indeed a popular ask; see #2072, #1723, #1486, #1316, #1708. I like the idea of a callback-based filter, but it was impossible to implement in v6, so I never added it to the backlog.
Best regards, Benoit
I would have thought this would be a popular ask, but I couldn't find any discussion or solution. I am trying to absolutely minimize memory usage because the document contains a huge array. The doc looks like this: `{<unpredictable stuff>, "array":[{"name":<long strings>}, ...]}`. Even with a filter like `filter["array"][0]["name"] = true;`, I still run out of memory, because with this filter the final document still contains all of the `"name"`s (again, this is a very long array).
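For context, here is roughly how that filter is applied (a sketch assuming ArduinoJson v6; `input` stands in for your source buffer or stream, and the capacities are placeholders). Note that a filter written against element `[0]` applies to every element of the array, which is why all the `"name"`s survive:

```cpp
#include <ArduinoJson.h>

// Filter: keep only the "name" member of the array elements.
// In ArduinoJson, the [0] entry of a filter applies to EVERY
// element of "array", so every "name" is still retained.
StaticJsonDocument<64> filter;
filter["array"][0]["name"] = true;

DynamicJsonDocument doc(4096);  // still fills up with all the names
DeserializationError err =
    deserializeJson(doc, input, DeserializationOption::Filter(filter));
```

So the filter drops the `<unpredictable stuff>` and the other members, but it cannot limit *how many* array elements are kept.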
I wonder if there are other ways to filter more selectively... perhaps something like this (doesn't seem to work): `filter["array"][3-14]["name"] = true;` so I get only the fourth through fifteenth elements? 16 elements fit perfectly in memory, versus all the elements in my case.
An alternative idea: a callback, something like `filter["array"][]["name"] = callback;`, so I can examine each `"name"` as it is found. Perhaps the callback tells me the index and the value, and I can decide to keep it, throw it away, or even stop the parsing entirely (for speed).
I also thought about deserialization-in-chunks using `findUntil`, but the preceding `<unpredictable stuff>` makes it unreliable: there may be similarly named elements at different nesting levels, etc. (unless I write my own JSON parsing code, which defeats the purpose of using this library). After all, the point of a self-describing JSON doc is that it can be out of order, contain extra things you don't care about, and so on.
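For completeness, the chunked approach I tried looks roughly like this (a sketch based on the technique shown in the ArduinoJson documentation for very large documents; `stream` is the input `Stream`, and the capacity is a placeholder). It parses one array element at a time, which is exactly the caveat above: `find` can be fooled if the `<unpredictable stuff>` happens to contain the same key:

```cpp
#include <ArduinoJson.h>

// Skip ahead to the start of the array -- unreliable if "array":[
// also appears somewhere inside <unpredictable stuff>.
stream.find("\"array\":[");
do {
  StaticJsonDocument<256> elem;  // holds a single array element
  DeserializationError err = deserializeJson(elem, stream);
  if (err) break;
  const char* name = elem["name"];
  // ... keep it, discard it, or stop parsing here ...
} while (stream.findUntil(",", "]"));  // next element, or stop at "]"
```

This keeps only one element in memory at a time, but it trades away the robustness that a real JSON parser gives you for the outer document.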
Thanks for any idea.