chanced / jsonptr

JSON Pointer (RFC 6901) implementation for Rust
Apache License 2.0
44 stars 4 forks source link

Add `components` method to `Pointer` which returns an iterator that includes the root. #17

Open chanced opened 10 months ago

chanced commented 10 months ago

Currently Poiinter::tokens returns an iterator of Token, but in doing so omits the root. This is not ideal.

Token should be an enum consisting of two possible options, Root and Normal.

A new wrapper type will be needed for the contents of Normal.

chanced commented 1 month ago

In retrospect, I don't like the name Normal. I also dont know if the Token should be an enum or if there should be a separate enum specifically for iteration. Having it be Cow based would be neat, regardless, though.

@asmello mentioning you as you said you may look at Token.

I'm going to dedicate some time to it myself but if you happen to work on it and have ideas or make progress, I'll happily concede it over or work with you on it.

asmello commented 1 month ago

I'm not sure I understand the motivation behind emitting a root token. Can you add some examples showing why that is useful? (I don't think I've run into any, or maybe I just forgot)

With regards to the enum variant names, I think Trunk (or Stem) might be an appropriate name for the internal components of the pointer. Or alternatively the type could be renamed to something like Component or Segment and that frees Token as the variant name.

I'm going to dedicate some time to it myself but if you happen to work on it and have ideas or make progress, I'll happily concede it over or work with you on it.

Yeah, I'm going to play around a bit, it's an interesting design puzzle. Feel free to make your own attempt, regardless. Whether or not my experiments turn into a PR, they'll make me able to better to form an opinion over the final design.

chanced commented 1 month ago

Can you add some examples showing why that is useful?

This is from my previous attempt at a json schema compiler. This method is invoked when a non-root schema (one that has a json pointer fragment) is referenced from a schema currently being compiled but the referenced schema's root schema (uri sans fragment) has not. In order to properly compile the target schema, I have to walk up the path and queue ancestors of the target schema.

As the current iterator does not emit a token for the root, I had to use a while loop to work around it.

    fn queue_ancestors(
        &mut self,
        target_uri: &AbsoluteUri,
        q: &mut VecDeque<SchemaToCompile<K>>,
    ) -> Result<(), CompileError<L, K>> {
        let mut path = Pointer::parse(&target_uri.fragment_decoded_lossy().unwrap())?;

        q.push_front(SchemaToCompile {
            key: None,
            uri: target_uri.clone(),
            parent: None,
            continue_on_err: false,
            ref_: None,
        });
        while !path.is_root() {
            path.pop_back();
            let mut uri = target_uri.clone();
            if path.is_empty() {
                uri.set_fragment(None)?;
            } else {
                uri.set_fragment(Some(&path))?;
            }
            if let Some(key) = self.schemas.get_key(&uri) {
                let is_not_compiled = !self.schemas.is_compiled(key);

                ensure!(is_not_compiled, SchemaNotFoundSnafu { uri });
                continue;
            }

            q.push_front(SchemaToCompile {
                key: None,
                uri,
                parent: None,
                continue_on_err: true,
                ref_: None,
            });
        }
        let mut uri = target_uri.clone();
        uri.set_fragment(None).unwrap();

        ensure!(
            !self.schemas.is_compiled_by_uri(&uri),
            SchemaNotFoundSnafu { uri: target_uri }
        );
        Ok(())
    }

At minimum, the Item of the iterator should have been modeled after std::path::Component which has a RootDir variant.

EDIT removed an unnecessary call in the example.

In hindsight, this example isn't great. I could have just used an iter here as I have to do work with the absolute (in this case meaning complete without fragment) uri anyway.

chanced commented 1 month ago

While the example I posted turned out not to be so great, I do think there's still need and perhaps even an expectation that when traversing a path, you have an opportunity to execute logic at the root / base.

It was an annoyance to me as a user to discover my loop body wasn't executing at the base path.

asmello commented 1 month ago

std::path::Component is a great analog here, but I think the motivation for it may not apply to JSON Pointers, precisely. That's because, as defined in RFC 6901, pointers are always absolute. I believe std::path::Component was introduced because filesystem paths sometimes are absolute, sometime not, and there was a need to make a distinction when decomposing them into components.

That doesn't mean we shouldn't provide a way to iterate over the components that treats the root as one. I can definitely see how that would be useful in the example you've shown. I'm just not convinced it should be the default iterator, because it does add some complexity in cases where the root token is not used (which I think may be relatively often?).

It's an interesting design question. There is sometimes a need to special case the first (or last) item in a sequence, and I've started a discussion once before with the Rust community about whether we should do something to help reduce related off-by-one mistakes. There were some interesting ideas proposed, but nothing that seemed like we should obviously add to std.

So my instinct here is to keep Pointer::tokens with the current semantics, and maybe introduce a separate Pointer::components that produces the described enum.

chanced commented 1 month ago

Sold.

This doesn't add complexity to Token and circumvents a major breaking change.

chanced commented 1 month ago

I'll handle that. Although it's not needed for your PR to get pushed out since it wont be breaking.

Thanks.