cerbero90 / json-parser

šŸ§© Zero-dependencies lazy parser to read JSON of any dimension and from any source in a memory-efficient way.

Error when parsing from iterable (array) #1

Closed shomisha closed 11 months ago

shomisha commented 11 months ago

I get a TypeError when I try to use the JsonParser on an array. The error happens on line 59 of \Cerbero\JsonParser\Tokens\Lexer: the Lexer assumes the $chunk variable contains a string and calls strlen() on it; however, when parsing from a nested array, $chunk contains an array instead.
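In isolation, the failing call looks something like this (a minimal sketch, assuming PHP 8 where strlen() enforces its string parameter type; not code from the library itself):

<?php

// Assumption: PHP 8 behaviour — strlen() rejects non-string arguments outright.
// This mirrors what happens inside the Lexer when $chunk is an array rather than a string.
$chunk = ['id' => 2333];
strlen($chunk); // TypeError: strlen(): Argument #1 ($string) must be of type string, array given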

Basically, I think this is a bug. I'm open to the possibility that I'm doing something wrong, but I couldn't figure out what it could be. Or did I get all of this completely wrong, and should the iterable source contain an array of JSON strings? šŸ˜…

Here is a chunk of code that should produce the bug I'm describing:

<?php

use Cerbero\JsonParser\JsonParser;

$data = [
    "pre" => [
        "locations" => [
            "locations" => [
                [
                    "id" => 2333,
                    "title" => "Hollywood",
                    "created_at" => "2022-10-02 10:17:12",
                    "updated_at" => "2022-10-02 10:17:12",
                    "deleted_at" => null
                ]
            ]
        ],
        "categories" => [
            "categories" => [
                [
                    "id" => 1302,
                    "business_id" => 186,
                    "title" => "Drama",
                    "deleted_at" => null,
                    "created_at" => "2019-02-10 06:28:28",
                    "updated_at" => "2020-02-08 05:16:52"
                ]
            ]
        ]
    ],
    "products" => [
        [
            "id" => 4465,
            "backoffice_title" => "Acting class",
            "created_at" => "2017-02-22 16:20:19",
            "updated_at" => "2023-06-30 10:25:59",
            "deleted_at" => null
        ]
    ],
    "post" => [
        "timeslots-4465" => [
            "timeslots" => [
                [
                    "id" => 250334,
                    "product_id" => 4465,
                    "created_at" => "2019-02-10 06:28:29",
                    "updated_at" => "2021-09-03 15:03:27"
                ]
            ]
        ]
    ]
];

// Throws a TypeError in the Lexer, because $chunk ends up being an array instead of a string
(new JsonParser($data))->pointer('/pre/locations', fn ($data) => var_dump($data))->traverse();

Possible implementation

I'm guessing the iterable source should not be processed by the Lexer at all.


cerbero90 commented 11 months ago

Hi @shomisha and thanks for your report :)

JSON Parser is meant to parse JSON, which is why any iterable source is expected to hold JSON strings to parse.

In your case the JSON has already been decoded into an associative array, so the memory needed for the JSON decoding process has already been consumed.

If you are looking to save memory while reading the JSON, you should avoid calling functions like json_decode() and let JSON Parser lazily parse your JSON instead.
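For example, something along these lines (a minimal sketch reusing the pointer()/traverse() calls from your snippet; the sample JSON string and the chunked generator are illustrative assumptions, not taken from this thread):

<?php

use Cerbero\JsonParser\JsonParser;

// A JSON string source: the parser walks it lazily instead of decoding it all at once.
$json = '{"pre": {"locations": {"locations": [{"id": 2333, "title": "Hollywood"}]}}}';

(new JsonParser($json))
    ->pointer('/pre/locations', fn ($data) => var_dump($data))
    ->traverse();

// An iterable source is expected to yield JSON strings (here, chunks of the same
// document), not already-decoded PHP arrays.
$chunks = (function () use ($json) {
    foreach (str_split($json, 32) as $chunk) {
        yield $chunk;
    }
})();

(new JsonParser($chunks))
    ->pointer('/pre/locations', fn ($data) => var_dump($data))
    ->traverse();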

However, if you can only deal with an already initialized associative array, JSON Parser can't help you, as the whole decoding process has already happened.

Let me know if you still have doubts :)

shomisha commented 11 months ago

Thanks for the response, @cerbero90 :)

After reading the documentation once again, I figured out that the array is supposed to hold JSON strings. On my first read I thought that if I passed an array into JsonParser, it would just proxy it and let me work with the array.

A little context on what's going on: I'm not calling json_decode(). I'm actually working on a rather complicated import/export feature that deals with different data structures, basically exporting Laravel models to JSON and then importing them back again. My approach is to split the import/export logic for different models into separate commands, one dedicated command per model. All the models and their relationships are exported to a single JSON (again, via multiple commands calling each other).

When importing, the first command the user calls parses the JSON, and what I wanted to achieve is that, as it parses, it forwards the parsed data to sub-import commands. The idea is that each command should be able to run both as a root command (i.e. parse the JSON on its own) and as a sub-command (i.e. get called by a root command that is parsing the JSON and forwarding the parsed data to it). For that scenario it would be ideal if JsonParser could both parse JSON (as a root command) and simply read arrays (as a sub-command) and process them based on pointers, because then my logic wouldn't have to differentiate between data passed as an array and data read from a JSON on disk.
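To illustrate that last point, a rough sketch of one way to keep a single code path under a couple of assumptions: the in-memory array is re-encoded with json_encode() before being handed to JsonParser, which of course gives up the memory savings for that case; the sample inputs are made up for the example:

<?php

use Cerbero\JsonParser\JsonParser;

// $source may be a JSON string (root command) or an already-decoded array (sub-command).
// Assumption: the array branch re-encodes with json_encode() so both cases go through
// JsonParser, trading away lazy parsing's memory savings for the in-memory case.
$runImport = function (string|array $source): void {
    $json = is_array($source) ? json_encode($source) : $source;

    (new JsonParser($json))
        ->pointer('/pre/locations', fn ($data) => var_dump($data))
        ->traverse();
};

// Root command: parse the exported JSON directly.
$runImport('{"pre": {"locations": {"locations": [{"id": 2333}]}}}');

// Sub-command: receive data already decoded by the root command.
$runImport(['pre' => ['locations' => ['locations' => [['id' => 2333]]]]]);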

TL;DR: I get that JsonParser is not intended to be used the way I expected; it was probably just wishful thinking while I was reading the docs :) Thanks again for the response and clarification!