tcbrindle / flux

A C++20 library for sequence-orientated programming
https://tristanbrindle.com/flux/
Boost Software License 1.0
531 stars 31 forks source link

default constructed `cursor_type` of `range_sequence` has nullptr for `boost::container::stable_vector<T>` #214

Open SidneyCogdill opened 3 weeks ago

SidneyCogdill commented 3 weeks ago

flux::distance(seq, from, to) requires two cursors. I was adapting flux in a function that returns the index of a element and I lazily wrote flux::distance(seq, {}, cursor) in the hope that the sequence access function would treat the default constructed cursor as pointing to the first element of the sequence:

const auto seq = flux::from_range(m_parentItem->m_childItems);
const auto cursor = decay_copy(seq)...; // some codes that finds a certain element

if (!f::is_last(seq, cursor)) {
    return boost::numeric_cast<int>(flux::distance(seq, {}, cursor));
}

Turns out this immediately triggers ASan: AddressSanitizer: access-violation on unknown address 0x000000000000. I had to write this instead:

flux::distance(seq, decay_copy(seq).first(), cursor)

The behavior is inconsistent with contiguous sequence:

auto v = std::vector{1, 2, 3};
auto cursor = flux::ref(v).find(2);
auto d = flux::distance(v, {}, cursor); // d == 1

The cursor type of contiguous sequences is index_t (it was set to std::ptrdiff_t in my case), and default constructed index_t is 0, which means pointing to the first element.


I guess the problem is that a nullptr iterator has no information about which range it belongs to. So either the cursor needs to become an optional type (default constructed cursor consistently means "nothing"), or there needs to be a special path (if the cursor type is an iterator wrapper) to seek to the first position when the iterator is nullptr (default constructed cursor consistently means the first element).

Or the two could be combined: a default constructed cursor is nullopt, and sequence access functions would fallback to the first element if it's nullopt (default constructed cursor consistently means "nothing", the "from" cursor becomes an optional parameter).

Introducing an optional type could have performance impact. So that's also a thing to consider.

SidneyCogdill commented 3 weeks ago

Note that the null pointer dereference here is not an issue of flux, but instead that the iterators of the container type I use being unsafe. It would have been trivial to add null check for the iterator type and have minimal performance impact. But still, this is an issue that the user could encounter when using flux::from_range with other containers.

tcbrindle commented 3 weeks ago

Flux requires that cursors of multipass sequences are default constructible in order to support "delayed initialisation" -- i.e. default constructing a cursor and only later setting it to a valid value. However, it is not required that a default constructed cursor represents a valid position for any particular sequence.

For things like std::vector, it just so happens that a default-constructed cursor has the same value as a cursor initialised by a call to first(). But this is just a special case; in general, we cannot assume that a default-constructed cursor has any particular value.

For example, Flux provides the slice(seq, from, to) operation which returns a flux::subsequence<Seq>. A subsequence uses the same cursor type as its parent Seq. However, calling first() on a subsequence returns the saved from cursor, which is very likely to be different from the zero value you'd get from default constructing its cursor type -- and in fact a default constructed cursor is unlikely to be a valid position for a subsequence.

Originally I did consider whether it would be better to not require default constructibility for multipass cursors, and instead use optional<Cursor> for delayed initialisation. It turns out that this just makes things more awkward for people who want delayed initialisation, and potentially adds a little overhead, but doesn't really make things safer in any meaningful way -- invalid cursor positions can still occur in other ways, and sequences still have to deal with them. In particular, examples like flux::distance(seq, {}, cursor) could still occur even if we dropped the default_constructible requirement.

tcbrindle commented 3 weeks ago

(Also, FYI, Flux has flux::copy(x) as an alternative to auto{x}, and flux::num::checked_cast is an alternative to boost::numeric_cast for integers that might save you a dependency)