jbeder / yaml-cpp

A YAML parser and emitter in C++
MIT License
5.16k stars 1.86k forks source link

iterator_base::value_type should not be const #1326

Open horenmar opened 1 week ago

horenmar commented 1 week ago

I had an issue open against my own project, that boils down to yaml-cpp providing iterators, whose value_type is const. This should not happen, values are immutable by definition*, and it causes issues in consuming code that doesn't expect this.

From a quick glance, I think it is fixable just by adding std::remove_const_t here.

* this gets a bit muddy with reference-like types, but there it is generally expected that it is the reference that cannot be modified, not the referenced thing.

KitsuneRal commented 1 week ago

Values are immutable when they have value semantics. The "values" returned by iterator_base::operator*() don't carry value semantics (fortunately or unfortunately), because the underlying Node objects are not semantically values, they are "smart references", or handles, of sorts. To see that, build yaml-cpp with your remove_const_t suggestion and then play with changing two copies of the same "value" dereferenced from the iterator.

horenmar commented 1 week ago

The issue with shallow vs deep constness on reference-like types is a common C++ behaviour, see e.g. pointers, both language and stdlib, or span. Returning const T from a const_iterator doesn't help, because it will turn back into plain T at slightest provocation:

const int F1();

auto foo = F1();

foo is plain int, the const went away, so trying to rely on this to avoid mutation is very brittle. If you want const_iterator's value to provide deep immutability, you have to define a custom const-reference type to be const_iterator::value_type.

Realistically that is already what is going to happen in user code. The code that triggered this is something like this

template <typename InputIterator,
          typename InputSentinel,
          typename ResultType = typename std::iterator_traits<InputIterator>::value_type>
std::vector<ResultType> from_range(InputIterator from, InputSentinel to) {
    return std::vector<ResultType>(from, to);
}

template <typename Container>
auto from_range(Container const& cnt) {
    using std::begin;
    using std::end;
    return from_range( begin( cnt ), end( cnt ) );
}

which fails to compile with yaml-cpp as the container-like, because you cannot have std::vector<const T>. The simple fix is to add a remove_const when computing the ResultType, which will make the returned vector full of mutable values again.

jbadwaik commented 1 week ago

@KitsuneRal You can use element_type to communicate the const-ness of types and use value_type for cv-removed values. That's how std::span does it.

KitsuneRal commented 1 week ago

I cannot because I'm not developing yaml-cpp, I'm only using it and that's how I happen to know the pitfalls. If you look at the library code you will realise that it's much more complicated than "just" separating element types from value types. The Node class is shallow-copied, you can't help it. And yes, that const is quite fickle, it's all too easy to lose it - which is why I only consume yaml-cpp via my own wrapper that effectively prevents me from making any changes to the underlying YAML::Node objects.

Instead of returning a vector in from_range() above, I would suggest working in terms of ranges until the point where you don't need yaml-cpp anymore, and then convert them to the specific C++ types. If you know to have a list of specific structures in your YAML, you can convert each of them right in from_range() algorithm. If the container is heterogenous, use std::for_each(), range for loops with switches inside them, or any other way to implement a visitor pattern. A vector of YAML::Nodes is a bad thing in principle, you will only shoot more feet, yours and others', down the line. You should either be in the yaml-cpp realm (=interact with YAML::Node), or in the non-yaml-cpp realm, not in both within the same type. The library is not well-equipped for the mixed mode at the moment.