user706 opened 1 year ago
I simply mean: split the copying into at most two fast, simple linear fragments:
If idx_begin <= idx_end
[ 0, 1, 2, idx_begin, 4, 5, 6, 7, 8, idx_end, ... ]
To copy from cbegin to cend:
- copy from idx_begin to idx_end
( -> has 1 fragment)
If idx_end < idx_begin
[ 0, 1, 2, idx_end, idx_begin, 98, 99, 100] idx_lin_end
To copy from cbegin to cend:
- copy from idx_begin to idx_lin_end (excl)
- copy from 0 to idx_end
( -> has 2 fragments)
Thanks for bringing this to my attention. Will look into it, likely in the weekend.
By the way... I'm a bit intrigued by the iterators.
At first I thought there was a bug here: https://github.com/martinmoene/ring-span-lite/blob/b8135cd85499a2d5482e048eb8fb6b354350d427/include/nonstd/ring_span.hpp#L982
Basically: what if you increment beyond the unsigned max and get wrap-around?
But that can never happen, right? (Because the iterator protocol says an iterator is always somewhere in the range, or one past the end.)
And also interesting: a begin() iterator that you store to a variable
auto it = rs.begin();
"moves along" with push_backs to rs.
And here's some "fast" code for you to experiment with:
template<typename OutputIt>
OutputIt copy(const const_iterator &beg, const const_iterator &end, OutputIt d_first) {
    assert(this == beg.m_rs);
    assert(this == end.m_rs);
    if (beg == end) { // necessary: otherwise b == e below, which is indistinguishable from a completely full ring
        return d_first;
    }
    auto b = m_data + normalize_(m_front_idx + beg.m_idx);
    auto const e = m_data + normalize_(m_front_idx + end.m_idx);
    if (e <= b) { // the range wraps around the physical end of the buffer
        auto const edge = m_data + m_capacity;
        d_first = std::copy(b, edge, d_first);
        b = m_data;
    }
    return std::copy(b, e, d_first);
}
only 2 calls to normalize_
And this, when copying from ring_span into ring_span:
public:
    template<typename TOther>
    typename ring_span<TOther>::iterator copy(const const_iterator &beg, const const_iterator &end,
                                              ring_span<TOther> &other, typename ring_span<TOther>::iterator d_first) {
        assert(this == beg.m_rs);
        assert(this == end.m_rs);
        assert(&other == d_first.m_rs);
        assert(end - beg <= other.size());
        if (beg == end) { // necessary: otherwise b == e below, which is indistinguishable from a completely full ring
            return d_first;
        }
        auto b = m_data + normalize_(m_front_idx + beg.m_idx);
        auto const e = m_data + normalize_(m_front_idx + end.m_idx);
        auto b_o = other.m_data + normalize_(other.m_front_idx + d_first.m_idx);
        if (e <= b) { // the source range wraps around the physical end of the buffer
            auto const edge = m_data + m_capacity;
            copy_to_lim(b, edge, b_o, other);
            b = m_data;
        }
        copy_to_lim(b, e, b_o, other);
        return std::next(d_first, end - beg);
    }

private:
    // Copy [b, lim) into the other ring starting at b_o, wrapping b_o at the
    // other ring's physical edge if needed.
    template<typename TOther>
    void copy_to_lim(pointer b, const pointer lim, pointer &b_o, ring_span<TOther> &other) {
        auto const numTillLim = lim - b;
        auto const numTillEdge_o = other.m_data + other.m_capacity - b_o;
        if (numTillLim > numTillEdge_o) { // destination wraps before this source fragment ends
            std::copy(b, b + numTillEdge_o, b_o);
            b += numTillEdge_o;
            b_o = other.m_data;
        }
        b_o = std::copy(b, lim, b_o);
    }
I just hope there's no bug in there. haha
This sounds to me like a case of "piecewise contiguous iterators," which also comes up with std::deque.
There's no Standard-blessed way to mark an iterator as "piecewise contiguous" (and no API for what to do next, even if it is). But I know @philnik777 has recently done some optimizations in libc++ for deque iterators; it might be good to take a look at those and see whether it's possible to adapt the shape of them into something ring_span could also either (1) duplicate, or (2) hook into (in a non-standard way, to make the optimization available to libc++'s std::copy).
FWIW, simply adding a ring_span::copy member function feels unsatisfactory to me. Is the problem that std::copy itself is slow? Or is the problem that one of ring_span's "special skills" is that you can blat out all its data into a contiguous buffer really quickly, and yet we don't expose that skill in the public API? ring_span::copy solves the latter but not the former.
Consider whether the right API might be (extremely bikesheddable names):
if (!rs.is_contiguous()) {
rs.rotate_until_contiguous();
}
std::copy(&*rs.begin(), &*rs.end(), out_it);
and whether "rotate" is even possible if you have a custom popper whose behavior you don't know. I haven't thought about this at all. Or alternatively, probably better:
std::copy_n(rs.firsthalfdata(), rs.firsthalfsize(), out_it);
std::copy_n(rs.secondhalfdata(), rs.secondhalfsize(), out_it);
FWIW I can't recommend playing around in implementation details, but this indeed looks like it could be made a segmented iterator. If you really want to, you can look at __iterator/segmented_iterator.h and specialize __segmented_iterator_traits for your iterator.
Thanks for those pointers. Looks like I'll roll my own custom ring, to be honest.
Reasons:
a) I won't invest in segmented_iterator as I'm not on libc++, but on libstdc++.
b) Also, I think that the current ring here does not fulfill all my criteria.
I want indices or iterators into the ring, that remain fixed... that don't "magically" shift with push_back.
Related: Personally, I don't agree with the design decision made here of having iterators do magic on dereferences (the obtained value shifts along).
For me, *it should always point at the same element in the underlying array.
That's safer, gives the opportunity to avoid locking, etc.
Look what happens in this code: https://github.com/martinmoene/ring-span-lite/blob/b8135cd85499a2d5482e048eb8fb6b354350d427/include/nonstd/ring_span.hpp#L966
You call at_, which calls m_data[ normalize_(m_front_idx + idx) ] on the dereference.
I would change that, to have normalize_ in the call-path of iterator increments etc., and not in the dereference.
Why do I say all this? Well... here's my use case:
Imagine a relatively large ring (many items). I want a writer, and a delayed reader who can, at his leisure, read blocks (copy them), also hopefully without needing all that much locking.
Currently there's no proper index or iterator mechanism to do this, as push_backs would cause dereferences to shift along ("behind my back" -- I'm joking... this is all design-decision and use-case stuff).
Yes, there is a potential problem with my approach, namely that too-slow reads will cause the writer to overwrite ranges that the slow reader has not yet read. That's particular to my use case and I need to work with the code to come up with a good solution.
So far with... my thoughts...
Thanks for providing insight into this background :)
Is this not "slow(ish)" when using the standard copying mechanisms?
It'll call this unnecessary code on each iterator dereference: https://github.com/martinmoene/ring-span-lite/blob/b8135cd85499a2d5482e048eb8fb6b354350d427/include/nonstd/ring_span.hpp#L968 which calls the slowish https://github.com/martinmoene/ring-span-lite/blob/b8135cd85499a2d5482e048eb8fb6b354350d427/include/nonstd/ring_span.hpp#L836 on each dereference!
Should ring and ring_span not have some facilities to speed up copying?