Open rocurley opened 2 years ago
@rocurley Why can you not produce a &str
(or similar). What the Index
trait expands to does need to be a reference, but I did not find it massively to be a problem so far. Do you have an example where it does not work for you currently?
So my actual use case is diffing jsons. But you don't need the full complexity of json here to have this issue. Say you want to diff Vec<Vec<u8>>
s. Now in principle you could just pass that in directly and be fine. But the diff will then consider the inner Vec<u8>
s to be the unit of diffing, so the diffs will be too coarse. [[1,2]] -> [[1,2,3]]
shouldn't say the entire input has changed, for example.
So one way to do this is to flatten the input into a vector representation that allows you to reconstruct the original input:
enum DiffElem {
VecStart,
Elem(u8),
}
fn to_diffable(xss : &Vec<Vec<u8>>) -> Vec<DiffElem> {
let mut out : Vec<DiffElem> = Vec::new();
for xs in xss {
out.push(DiffElem::VecStart);
out.extend(xs.iter().map(|x| DiffElem::Elem(*x)));
}
out
}
Now this will work, but requires making a copy of the entire input. But it doesn't seem strictly necessary. You could "index" into the diffable representation without ever constructing it:
// This implementation is ineffecient. To get a more effecient one, we'd
// annotate each subvector with the index of its VecStart, and use
// binary search to find the subvector we care about.
fn index_diffable(mut i : usize, xss : &Vec<Vec<u8>>) -> Option<DiffElem> {
for xs in xss {
if i == 0 {
return Some(DiffElem::VecStart);
}
if i < xs.len() {
return Some(DiffElem::Elem(xs[i]));
}
i -= xs.len();
}
None
}
But you can't do this with the Index
trait.
I agree that this case is not great. Generally I'm not a huge fan of the Index
trait as base. I have already considered alternatives before but never with too much success.
I've been trying to implement a wrapper type to get ropey::RopeSlice
to work with similar
using the iterators it provides, but the problem is that they only return owned values with the exception of the Chunks
iterator (which isn't what I need).
I have already considered alternatives before but never with too much success.
If you don't mind me asking, what did you try and why weren't they successful?
The original issue I had with Index
wasn't even that it returns references but that it can be hard to get the bounds checks reliably eliminated (#19). So originally I was trying to do that.
To address this particular thing with borrows I tried to have a custom trait but making both borrows and owned values work is surprisingly complex and I did not go very far with it because I did not at all like what it did to the API. I'm not sure if there are clever ideas I haven't been thinking of yet that would make both cases work.
We're running into this in Helix (using Ropey) again.
Might I suggest adding a variant of the capture_diff()
and capture_diff_deadline()
functions that take a |usize| -> Item
closure? That would give a lot of flexibility to client code to work with whatever weird containers/situations they might have.
Edit: or rather a |usize| -> Option<Item>
closure, since the collections are presumably not infinitely long, ha ha. :-)
@cessen the issue is that this Index
based API is all over the place. This goes really deep in the library.
I will play with it. Maybe I just break the internal algorithms API.
Ah, got it. If I get some time and the mood strikes, I might take a crack at that as well. If we're lucky, even if the API goes deep, it still might not be too hard to separate from the algorithm code itself. If we're lucky... :-)
The other issue with the callback API is that it can't borrow. So it's not even that the callback based API could be the base of the API.
Perhaps a macro that could create two versions of the internal functions; one that borrows and one that returns an owned value?
So one could move somewhat to improving this by going all in on the IdentifyDistinct
type. So something like this could be done:
let old = Rope::from_str("...");
let new = Rope::from_str("...");
let h = IdentifyDistinct::<u32>::new_from_iter(old.lines(), new.lines());
let diff = capture_diff(
Algorithm::Myers,
h.old_lookup(),
h.old_range(),
h.new_lookup(),
h.new_range(),
);
The new_from_iter
is easy to add, but the returned ops are still annoying to work with. For instance iter_changes
again requires an Index
bound.
So now that there are GATs that issue might be worth toying around with again.
I'm working on json diffing. I have some representation of two jsons in memory, like a
serde_json::Value
. Obviously these cannot be directly diffed, since they're not array-like. So to diff them I need to flatten them into an array of json-lines. I'd rather not actually construct the array, however: I don't want the memory usage. Instead, I'd like to generate the lines on the fly.I can almost do this by implementing the
Index
trait. It doesn't quite work, however, since I need to return a reference. Ideally, I'd like to be able to pass in anFnMut(usize) -> T
or something along those lines.I'm not sure if anybody but me will find this useful, TBH, but I figured there's no harm in asking.