Open chanced opened 10 months ago
In retrospect, I don't like the name Normal
. I also dont know if the Token
should be an enum
or if there should be a separate enum specifically for iteration. Having it be Cow
based would be neat, regardless, though.
@asmello mentioning you as you said you may look at Token
.
I'm going to dedicate some time to it myself but if you happen to work on it and have ideas or make progress, I'll happily concede it over or work with you on it.
I'm not sure I understand the motivation behind emitting a root token. Can you add some examples showing why that is useful? (I don't think I've run into any, or maybe I just forgot)
With regards to the enum variant names, I think Trunk
(or Stem
) might be an appropriate name for the internal components of the pointer. Or alternatively the type could be renamed to something like Component
or Segment
and that frees Token
as the variant name.
I'm going to dedicate some time to it myself but if you happen to work on it and have ideas or make progress, I'll happily concede it over or work with you on it.
Yeah, I'm going to play around a bit, it's an interesting design puzzle. Feel free to make your own attempt, regardless. Whether or not my experiments turn into a PR, they'll make me able to better to form an opinion over the final design.
Can you add some examples showing why that is useful?
This is from my previous attempt at a json schema compiler. This method is invoked when a non-root schema (one that has a json pointer fragment) is referenced from a schema currently being compiled but the referenced schema's root schema (uri sans fragment) has not. In order to properly compile the target schema, I have to walk up the path and queue ancestors of the target schema.
As the current iterator does not emit a token for the root, I had to use a while loop to work around it.
fn queue_ancestors(
&mut self,
target_uri: &AbsoluteUri,
q: &mut VecDeque<SchemaToCompile<K>>,
) -> Result<(), CompileError<L, K>> {
let mut path = Pointer::parse(&target_uri.fragment_decoded_lossy().unwrap())?;
q.push_front(SchemaToCompile {
key: None,
uri: target_uri.clone(),
parent: None,
continue_on_err: false,
ref_: None,
});
while !path.is_root() {
path.pop_back();
let mut uri = target_uri.clone();
if path.is_empty() {
uri.set_fragment(None)?;
} else {
uri.set_fragment(Some(&path))?;
}
if let Some(key) = self.schemas.get_key(&uri) {
let is_not_compiled = !self.schemas.is_compiled(key);
ensure!(is_not_compiled, SchemaNotFoundSnafu { uri });
continue;
}
q.push_front(SchemaToCompile {
key: None,
uri,
parent: None,
continue_on_err: true,
ref_: None,
});
}
let mut uri = target_uri.clone();
uri.set_fragment(None).unwrap();
ensure!(
!self.schemas.is_compiled_by_uri(&uri),
SchemaNotFoundSnafu { uri: target_uri }
);
Ok(())
}
At minimum, the Item
of the iterator should have been modeled after std::path::Component which has a RootDir
variant.
EDIT removed an unnecessary call in the example.
In hindsight, this example isn't great. I could have just used an iter here as I have to do work with the absolute (in this case meaning complete without fragment) uri anyway.
While the example I posted turned out not to be so great, I do think there's still need and perhaps even an expectation that when traversing a path, you have an opportunity to execute logic at the root / base.
It was an annoyance to me as a user to discover my loop body wasn't executing at the base path.
std::path::Component
is a great analog here, but I think the motivation for it may not apply to JSON Pointers, precisely. That's because, as defined in RFC 6901, pointers are always absolute. I believe std::path::Component
was introduced because filesystem paths sometimes are absolute, sometime not, and there was a need to make a distinction when decomposing them into components.
That doesn't mean we shouldn't provide a way to iterate over the components that treats the root as one. I can definitely see how that would be useful in the example you've shown. I'm just not convinced it should be the default iterator, because it does add some complexity in cases where the root token is not used (which I think may be relatively often?).
It's an interesting design question. There is sometimes a need to special case the first (or last) item in a sequence, and I've started a discussion once before with the Rust community about whether we should do something to help reduce related off-by-one mistakes. There were some interesting ideas proposed, but nothing that seemed like we should obviously add to std
.
So my instinct here is to keep Pointer::tokens
with the current semantics, and maybe introduce a separate Pointer::components
that produces the described enum.
Sold.
This doesn't add complexity to Token
and circumvents a major breaking change.
I'll handle that. Although it's not needed for your PR to get pushed out since it wont be breaking.
Thanks.
Currently
Poiinter::tokens
returns an iterator ofToken
, but in doing so omits the root. This is not ideal.Token
should be an enum consisting of two possible options,Root
andNormal
.A new wrapper type will be needed for the contents of
Normal
.