halaxa / json-machine

Efficient, easy-to-use, and fast PHP JSON stream parser
Apache License 2.0
1.1k stars 65 forks source link

Problem With `getCurrentJsonPointer()` #105

Open XedinUnknown opened 1 year ago

XedinUnknown commented 1 year ago

Hi!

The Problem

Questions

  1. Could this be a bug?
  2. Is there a better way to achieve this?
  3. If not, would you consider changing the default behaviour of yielded keys to be the full path?

Thank you!

XedinUnknown commented 1 year ago

Seems to be connected to $iteratorLevel, which depends on the configured JSON pointers: no pointers - no path.

I don't know the codebase well, and so I wonder: perhaps, the $currentLevel needs simply to be greater than 0, and not depend on $iteratorLevel at all?

XedinUnknown commented 1 year ago

Digging further, I understand why you probably wouldn't want the iterable to yield full paths as keys: each structure, including the top-most one, should probably have original keys, such that when converted to an array with e.g. iterator_to_array() it retains its key structure. Also, it's trivial to wrap this into a generator that yields getCurrentJsonPointer() instead of keys, if this is what's needed.

So, currently the only problem with this is that apparently, when not specifying pointers, there's never any current path/pointer to retrieve. To my mind, the relationship between pointers and the path should not exist at all: the path should be available regardless of absence or presence of any pointers, because each value in a JSON document has a path.

XedinUnknown commented 1 year ago

By the way, here's what I mean by wrapping Items in a Generator:

        // For each item, yield its full path instead of just the key
        return call_user_func_array(function (Items $data): Generator {
            foreach ($data as $key => $value) {
                $path = $data->getCurrentJsonPointer();
                // Path may be empty if all-matching pointer '' was specified
                // https://github.com/halaxa/json-machine/issues/105
                $newKey = !empty($path)
                    // Remove root prefix to make working with key easier, especially if top-level
                    ? ltrim($path, '/')
                    : $key;
                yield $newKey => $value;
            }
        }, [$data]);
halaxa commented 1 year ago

Tests of getCurrentJsonPointer() seem to be ok. Can you look at them and check your code for possible flaws?

halaxa commented 2 weeks ago

Is this still relevant, @XedinUnknown?

XedinUnknown commented 2 weeks ago

Heya! It's been a while, and I don't remember what the problem was or how/whether I went around it.

Reading this thread, though, gave me a thought: just because tests are passing, doesn't mean there are tests that cover what I'm talking about.

Hope that helps 🙏

halaxa commented 2 weeks ago

So, currently the only problem with this is that apparently, when not specifying pointers, there's never any current path/pointer to retrieve. To my mind, the relationship between pointers and the path should not exist at all: the path should be available regardless of absence or presence of any pointers, because each value in a JSON document has a path.

The "tests work" was a reaction to this. But if one doesn't specify a JSON pointer, the main level is iterated and thus a pointer to the current collection is the same - an empty string. The relationship is natural. getCurrentJsonPointer() is there for cases of multiple pointers for you to know which one is being iterated and also for wildcard pointers (/results/-) so you know when you iterate /results/0 or /results/99. But it will always be closely tied to the provided pointers. But I may be missing something.