microsoft / STL

MSVC's implementation of the C++ Standard Library.
Other
10.04k stars 1.47k forks source link

`<xutility>`: Evil member `element_type` can break non-ranges algorithms since C++20 #4436

Open frederick-vs-ja opened 6 months ago

frederick-vs-ja commented 6 months ago

Describe the bug

Currently, MSVC STL defines _Iter_value_t in term of iter_value_t since C++20 and uses it in non-ranges algorithms.

https://github.com/microsoft/STL/blob/8b081e26ba016970ce3338cb483ff10f2ade30a5/stl/inc/xutility#L1167-L1168 https://github.com/microsoft/STL/blob/8b081e26ba016970ce3338cb483ff10f2ade30a5/stl/inc/xutility#L1177-L1178

Per [readable.traits], if a type I has mismatched member types value_type and element_type, then iter_value_t<I> is ill-formed by default (when there's no program-defined indirectly_readable_traits or iterator_traits specializations), and thus I doesn't satisfy any C++20 `meow_iteratorconcept. However,Ican still meet _Cpp17MeowIterator_ requirements because the legacy requirements don't considerelement_type`, so such strategy sometimes rejects valid programs.

For example, this example should be valid since C++17. But MSVC STL starts to reject it since C++20 while other implementations don't (Godbolt link).

#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <type_traits>

template <class T>
struct evil_contiguous_iterator {
    T* ptr;

    template <class CT = const T, std::enable_if_t<!std::is_const_v<T>, int> = 0>
    constexpr operator evil_contiguous_iterator<CT>() const noexcept {
        return {ptr};
    }

    using value_type        = std::remove_cv_t<T>;
    using reference         = T&;
    using pointer           = T*;
    using difference_type   = std::ptrdiff_t;
    using iterator_category = std::random_access_iterator_tag;

    using element_type = void; // This is EVIL but shouldn't affect non-ranges algorithms.

    friend constexpr bool operator==(evil_contiguous_iterator i, evil_contiguous_iterator j) noexcept {
        return i.ptr == j.ptr;
    }
    friend constexpr bool operator!=(evil_contiguous_iterator i, evil_contiguous_iterator j) noexcept {
        return i.ptr != j.ptr;
    }
    friend constexpr bool operator<(evil_contiguous_iterator i, evil_contiguous_iterator j) noexcept {
        return i.ptr < j.ptr;
    }
    friend constexpr bool operator>(evil_contiguous_iterator i, evil_contiguous_iterator j) noexcept {
        return i.ptr > j.ptr;
    }
    friend constexpr bool operator<=(evil_contiguous_iterator i, evil_contiguous_iterator j) noexcept {
        return i.ptr <= j.ptr;
    }
    friend constexpr bool operator>=(evil_contiguous_iterator i, evil_contiguous_iterator j) noexcept {
        return i.ptr >= j.ptr;
    }

    constexpr evil_contiguous_iterator& operator++() noexcept {
        ++ptr;
        return *this;
    }
    constexpr evil_contiguous_iterator operator++(int) noexcept {
        auto that = *this;
        ++*this;
        return that;
    }
    constexpr evil_contiguous_iterator& operator--() noexcept {
        --ptr;
        return *this;
    }
    constexpr evil_contiguous_iterator operator--(int) noexcept {
        auto that = *this;
        --*this;
        return that;
    }

    constexpr evil_contiguous_iterator& operator+=(difference_type n) noexcept {
        ptr += n;
        return *this;
    }
    constexpr evil_contiguous_iterator& operator-=(difference_type n) noexcept {
        ptr -= n;
        return *this;
    }

    friend constexpr evil_contiguous_iterator operator+(evil_contiguous_iterator i, difference_type n) noexcept {
        return {i.ptr + n};
    }
    friend constexpr evil_contiguous_iterator operator+(difference_type n, evil_contiguous_iterator i) noexcept {
        return {i.ptr + n};
    }
    friend constexpr evil_contiguous_iterator operator-(evil_contiguous_iterator i, difference_type n) noexcept {
        return {i.ptr - n};
    }
    friend constexpr difference_type operator-(evil_contiguous_iterator i, evil_contiguous_iterator j) noexcept {
        return i.ptr - j.ptr;
    }

    constexpr T& operator*() const noexcept {
        return *ptr;
    }
    constexpr T& operator[](difference_type n) const noexcept {
        return ptr[n];
    }
    constexpr T* operator->() const noexcept {
        return ptr;
    }
};

int main() {
    int a[]{4, 2, 1, 7, 2, 9};
    std::sort(evil_contiguous_iterator<int>{std::begin(a)}, evil_contiguous_iterator<int>{std::end(a)});
}

Expected behavior

This example compiles in C++20/23 modes as in C++17.

STL version

Probably every version from 1e8b8d4eef4b2dddeb7533c5231c876383bd0ea6 to 8b081e26ba016970ce3338cb483ff10f2ade30a5.

Additional context

WG21-P2408R5 complicated the situation.

frederick-vs-ja commented 6 months ago
Incorrect attempt

IMO a "fix" could be implementing `_Iter_value_t` like the following since C++20. ```C++ template struct _Iter_value_fallback : indirectly_readable_traits<_Iter> {}; template <_Has_member_value_type _Iter> struct _Iter_value_fallback<_Iter> : _Cond_value_type {}; template <_Has_member_element_type _Iter> requires (!_Has_member_value_type<_Iter>) struct _Iter_value_fallback<_Iter> : _Cond_value_type {}; template using _Iter_value_t = conditional_t<_Is_from_primary>>, _Iter_value_fallback>, iterator_traits>>::value_type; ```

Perhaps we can add an internal typedef name to this specialization for workaround: https://github.com/microsoft/STL/blob/8b081e26ba016970ce3338cb483ff10f2ade30a5/stl/inc/__msvc_iter_core.hpp#L153-L155

StephanTLavavej commented 6 months ago

We talked about this at the weekly maintainer meeting, although @CaseyCarter was out sick so our analysis powers were limited. This was introduced by #329. Switching ugly _Iter_value_t to be C++20 iter_value_t with different semantics appears to be a bug for the reasons you've explained, but #329 said that this was to "enable unwrapping of C++20 move-only single-pass iterators". It appears that many uses of ugly _Iter_value_t can't possibly be handling such C++20 move-only single-pass iterators (because the uses are in stable_sort() etc. where stronger iterators are required).

We believe that a full audit of all ugly _Iter_value_t usage will be necessary, to determine which ones:

In general, we should avoid having aliases that switch their semantics based on Standard mode, except when explicitly commented/known that their downlevel behavior is a "best effort" approximation with no harmful downsides (e.g. <xutility>'s _Iterator_is_contiguous is ok).

frederick-vs-ja commented 5 months ago

We believe that a full audit of all ugly _Iter_value_t usage will be necessary, to determine which ones:

  • Should always expand to iterator_traits::value_type
  • Should have this behavior of conditionally switching between iterator_traits::value_type and C++20 iter_value_t
  • Should always be C++20 iter_value_t

Per WG21-P2248R7, it seems that the value type should be consistently iterator_traits::value_type for non-ranges algorithms. Perhaps conditionally switching is only applicable to move_iterator and reverse_iterator's value_type.

frederick-vs-ja commented 4 months ago

Filed LWG-4080 for this.