Relaxing the Common<T, U> requirements for the relational concepts

ericniebler / stl2

LaTeX and Markdown source for the Ranges TS/STL2 and associated proposals

Relaxing the Common<T, U> requirements for the relational concepts #34

Closed CaseyCarter closed 5 years ago

CaseyCarter commented 9 years ago

Some LEWG reviewers mentioned this, and it emerged again in the discussion of issue #17, so I thought I should write up something more detailed. I went a bit overboard; this may need to be an LEWG paper instead of an "issue":

Abstract

During LEWG review of N4382, several people mentioned that they are uncomfortable with the fact that some of the cross-type concepts - specifically EqualityComparable<T, U> and TotallyOrdered<T, U> - require the user to exhibit a common type for the two subject types whose semantics mirror those of the cross-type relationships. This objection is not unreasonable as the requirements are overconstrained, which I will demonstrate. Overconstraining is not necessarily a misfeature; redundancy in a system allows for error-checking. For example, given two types T and U that model such a cross-type concept we could define wrapper types T' and U' with error-checking operations that assert the equivalence of results from both the underlying operation on T and U and the operation on the common type. However, not every user will want or need this error-checking capability, so many see the fact that the concepts are overconstrained as a violation of the "Don't pay for what you don't use" principle of C++. I present here a relaxation of those concepts that does not require a common type.

Overconstraint

I claim that the two concepts in question are overconstrained because the requirement that a common type exist is equivalent to the requirement that the two subject types have cross-type relational operators that respect the semantics of the type-specific relational operators. I will use EqualityComparable to make this argument concrete and hope that it is clear enough to the reader that the argument can be generalized to cover TotallyOrdered as well. To prove equivalence we must demonstrate that (a) given a common type we can construct cross-type relational operators with consistent semantics, and (b) given cross-type relational operators with consistent semantics we can construct a common type.

Let T and U be two models of EqualityComparable, and C be a common type of T & U that models EqualityComparable. We can define cross-type relational operators for T and U as:

auto operator == (const T& t, const U& u) {
  return C{t} == C{u};
}

auto operator == (const U& u, const T& t) {
  return C{u} == C{t};
}

auto operator != (const T& t, const U& u) {
  return C{t} != C{u};
}

auto operator != (const U& u, const T& t) {
  return C{u} != C{t};
}

Clearly these cross-type relational operators are by definition consistent with the semantics of the relational operators on C.
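To make direction (a) concrete, here is a minimal sketch in C++17. The types Millimeters, Meters, and Length are hypothetical and chosen only for illustration; Length plays the role of C, and the cross-type operators are defined exactly as above, by converting both operands to C.

```cpp
#include <cassert>

// Hypothetical subject types T and U (not from N4382).
struct Millimeters { long value; };
struct Meters { long value; };

// Same-type equality for each subject type.
constexpr bool operator==(Millimeters a, Millimeters b) { return a.value == b.value; }
constexpr bool operator!=(Millimeters a, Millimeters b) { return !(a == b); }
constexpr bool operator==(Meters a, Meters b) { return a.value == b.value; }
constexpr bool operator!=(Meters a, Meters b) { return !(a == b); }

// Hypothetical common type C: both subject types convert to it without loss.
struct Length {
  long mm;
  constexpr explicit Length(Millimeters x) : mm(x.value) {}
  constexpr explicit Length(Meters x) : mm(x.value * 1000) {}
};
constexpr bool operator==(Length a, Length b) { return a.mm == b.mm; }
constexpr bool operator!=(Length a, Length b) { return !(a == b); }

// Cross-type operators, each defined by comparing in the common type.
constexpr bool operator==(Millimeters t, Meters u) { return Length{t} == Length{u}; }
constexpr bool operator==(Meters u, Millimeters t) { return Length{u} == Length{t}; }
constexpr bool operator!=(Millimeters t, Meters u) { return Length{t} != Length{u}; }
constexpr bool operator!=(Meters u, Millimeters t) { return Length{u} != Length{t}; }

static_assert(Millimeters{2000} == Meters{2});
static_assert(Meters{2} == Millimeters{2000});
static_assert(Millimeters{1999} != Meters{2});
```

The constructors of Length are explicit so that the same-type operators on Length never compete with the cross-type overloads during overload resolution.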

For the converse argument, let T and U again be two models of EqualityComparable with semantically consistent cross-type relational operators. (By "semantically consistent", I mean that the cross-type operators respect the equivalences established by the type-specific operators. I.e., it must be the case that (a) the cross-type operators are symmetric in T and U, and (b) t == u for values t of type T and u of type U if and only if t2 == u for all values t2 == t and t == u2 for all values u2 == u.) We can then define a type C that is a discriminated variant of T and U, with relational operators (ignoring the implied requirement for move and/or copy construction of T and U necessary to convert T and U to C):

bool operator == (const C& a, const C& b) {
  if (<a contains a T>) {
    if (<b contains a T>)
      return <a as a T> == <b as a T>;
    else
      return <a as a T> == <b as a U>;
  } else {
    if (<b contains a T>)
      return <a as a U> == <b as a T>;
    else
      return <a as a U> == <b as a U>;
  }
}

bool operator != (const C& a, const C& b) {
  return !(a == b);
}

Clearly these relational operators for C respect the semantics of the operators on T and U, and the cross-type operators as well. (Those familiar with N4382 will recognize this pattern as being embodied by the common_iterator class template, which provides exactly such a constructive common type for an Iterator and a corresponding Sentinel type. Those familiar with the Library Fundamentals TS may notice that std::optional<T> is a special case of this pattern that constructs a common type for T and std::nullopt with a cross-type operator== that always returns false.)
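The pseudocode for direction (b) can be sketched as runnable C++17 using std::variant, which postdates this discussion. The names variant_common, Dozens, and Units are hypothetical; std::visit over both operands dispatches to whichever of the four same-type or cross-type operators applies.

```cpp
#include <cassert>
#include <utility>
#include <variant>

// Hypothetical subject types with consistent same-type and cross-type ==.
struct Dozens { int count; };
struct Units { int count; };

inline bool operator==(Dozens a, Dozens b) { return a.count == b.count; }
inline bool operator==(Units a, Units b) { return a.count == b.count; }
inline bool operator==(Dozens a, Units b) { return a.count * 12 == b.count; }
inline bool operator==(Units a, Dozens b) { return b == a; }

// A discriminated variant of T and U, standing in for the type C above.
template <class T, class U>
class variant_common {
public:
  variant_common(T t) : value_(std::in_place_index<0>, std::move(t)) {}
  variant_common(U u) : value_(std::in_place_index<1>, std::move(u)) {}

  // Visit both operands; the lambda is instantiated for all four
  // (T,T), (T,U), (U,T), (U,U) combinations.
  friend bool operator==(const variant_common& a, const variant_common& b) {
    return std::visit(
        [](const auto& x, const auto& y) -> bool { return x == y; },
        a.value_, b.value_);
  }
  friend bool operator!=(const variant_common& a, const variant_common& b) {
    return !(a == b);
  }

private:
  std::variant<T, U> value_;
};
```

Note that the sketch only compiles when all four comparisons exist, which mirrors the assumption in the argument that T and U already have consistent cross-type operators.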

Proposed Design

I propose the relaxed concepts WeaklyEqualityComparable<T, U> and WeaklyTotallyOrdered<T, U> that do not require Common<T, U>, but instead directly impose the necessary semantic requirements on the cross-type relational operators:

template <class T>
concept bool WeaklyEqualityComparable() {
  return EqualityComparable<T>();
}

template <class T, class U>
concept bool WeaklyEqualityComparable() {
  return WeaklyEqualityComparable<T>() &&
    (Same<T, U> ||
      (WeaklyEqualityComparable<U>() &&
        requires(T t, U u) {
          { t == u } -> Boolean;
          { u == t } -> Boolean;
          { t != u } -> Boolean;
          { u != t } -> Boolean;
        }));
}

Given objects t1 and t2 of type T and u1 and u2 of type U, types T and U model WeaklyEqualityComparable if and only if:

(bool(t1 == u1) == bool(u1 == t1)) != false [== is symmetric with respect to T and U]

(bool(u1 != t1) == bool(t1 != u1)) != false [!= is symmetric with respect to T and U]

(bool(t1 != u1) == !bool(t1 == u1)) != false [!= is the complement of ==]

((bool(t1 == u1) && bool(t1 == t2)) == bool(t2 == u1)) != false [== of T and U respects T's ==]

((bool(t1 == u1) && bool(u1 == u2)) == bool(t1 == u2)) != false [== of T and U respects U's ==]
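As a rough illustration, the syntactic half of this concept can be approximated in C++17 with a detection trait, and the semantic half can be spot-checked on sample values. The names is_weakly_equality_comparable_with and check_weak_equality are hypothetical. The sketch reads the two "respects" conditions as implications (if t1 == u1 and t1 == t2, then t2 == u1); that reading is an assumption about the intent, since taken as literal equalities they would reject t1 = 1, t2 = 2, u1 = 2.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Syntactic check: all four cross-type expressions exist and convert to bool.
// (Conversion to bool only approximates the Boolean concept.)
template <class T, class U, class = void>
struct is_weakly_equality_comparable_with : std::false_type {};

template <class T, class U>
struct is_weakly_equality_comparable_with<T, U, std::void_t<
    decltype(static_cast<bool>(std::declval<const T&>() == std::declval<const U&>())),
    decltype(static_cast<bool>(std::declval<const U&>() == std::declval<const T&>())),
    decltype(static_cast<bool>(std::declval<const T&>() != std::declval<const U&>())),
    decltype(static_cast<bool>(std::declval<const U&>() != std::declval<const T&>()))>>
    : std::true_type {};

// Semantic spot check on one set of sample values; it cannot prove the
// axioms, only find counterexamples.
template <class T, class U>
bool check_weak_equality(const T& t1, const T& t2, const U& u1, const U& u2) {
  const bool symmetric_eq = bool(t1 == u1) == bool(u1 == t1);
  const bool symmetric_ne = bool(u1 != t1) == bool(t1 != u1);
  const bool complement = bool(t1 != u1) == !bool(t1 == u1);
  // Assumption: "respects" is read as an implication, not an equality.
  const bool respects_t = !(bool(t1 == u1) && bool(t1 == t2)) || bool(t2 == u1);
  const bool respects_u = !(bool(t1 == u1) && bool(u1 == u2)) || bool(t1 == u2);
  return symmetric_eq && symmetric_ne && complement && respects_t && respects_u;
}

struct Opaque {};  // hypothetical type with no comparison operators
static_assert(is_weakly_equality_comparable_with<int, long>::value);
static_assert(!is_weakly_equality_comparable_with<int, Opaque>::value);
```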

template <class T>
concept bool WeaklyTotallyOrdered() {
  return TotallyOrdered<T>();
}

template <class T, class U>
concept bool WeaklyTotallyOrdered() {
  return WeaklyTotallyOrdered<T>() &&
    (Same<T, U> ||
      (WeaklyTotallyOrdered<U>() &&
        WeaklyEqualityComparable<T, U>() &&
        requires(T t, U u) {
          { t < u } -> Boolean;
          { t <= u } -> Boolean;
          { t > u } -> Boolean;
          { t >= u } -> Boolean;
          { u < t } -> Boolean;
          { u <= t } -> Boolean;
          { u > t } -> Boolean;
          { u >= t } -> Boolean;
        }));
}

Given objects t1 and t2 of type T and u1 and u2 of type U, types T and U model WeaklyTotallyOrdered if and only if:

(bool(t1 < u1) == bool(u1 > t1)) != false [Symmetry in T and U]

(bool(t1 > u1) == bool(u1 < t1)) != false

(bool(t1 >= u1) == bool(u1 <= t1)) != false

(bool(t1 <= u1) == bool(u1 >= t1)) != false

(bool(t1 >= u1) == !bool(t1 < u1)) != false [>= is the complement of <]

(bool(t1 <= u1) == (bool(t1 < u1) || bool(t1 == u1))) != false [<= is the union of < and ==]

((bool(t1 < u1) && bool(u1 < u2)) == bool(t1 < u2)) != false [< respects U's <]

((bool(t1 < u1) && bool(t2 < t1)) == bool(t2 < u1)) != false [< respects T's <]

Exactly one of bool(t1 < u1), bool(t1 == u1), or bool(t1 > u1) is true. [Totality]

I separately propose that the library provide a default_common_type class template:

template <class T, class U>
 requires WeaklyEqualityComparable<T, U>
class default_common_type {
public:
 default_common_type(T);
 default_common_type(U);

 friend bool operator == (const default_common_type& a,
 const default_common_type& b);
 friend bool operator != (const default_common_type& a,
 const default_common_type& b);
};

default_common_type<T, U> shall model:

WeaklyEqualityComparable

WeaklyTotallyOrdered if WeaklyTotallyOrdered<T, U>

Iterator if Sentinel<U, T>, with

Same<IteratorCategory<T>, IteratorCategory<default_common_type<T, U>>>

Same<ValueType<T>, ValueType<default_common_type<T, U>>>

Same<DifferenceType<T>, DifferenceType<default_common_type<T, U>>>

[Implementors would presumably use default_common_type and restrict the iterator category to implement the backwards compatible common_iterator.] I DO NOT PROPOSE that we define:

template <class T, class U>
  requires WeaklyEqualityComparable<T, U>
struct common_type<T, U> { using type = default_common_type<T, U>; };

Since the symmetry requirement of Common would be impossible to satisfy; there is no canonical ordering for types. (This problem does not present for Sentinel<S, I> due to the asymmetry of Sentinel.) This issue does not prevent library users from using default_common_type to specialize common_type if desired after choosing an arbitrary ordering for the parameter types, e.g.,

struct A {};
struct B {};

// ...definitions of consistent relational operators for A and B...

namespace std {
template <>
struct common_type<A, B> { using type = default_common_type<A, B>; };

template <>
struct common_type<B, A> { using type = default_common_type<A, B>; };
}

or prevent clients that desire a common type for types T and U from instantiating default_common_type<T, U> if common_type<T, U> has no member type. [The library should possibly provide an alias template that evaluates to T if Same<T, U>, common_type_t<T, U> if it exists, and otherwise default_common_type<T, U>.]
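The alias template suggested in the bracketed note could look roughly like the following C++17 sketch. The names common_or_default, common_or_default_t, and has_common_type are hypothetical, and default_common_type is only forward-declared here as a stand-in for the class template proposed above.

```cpp
#include <cassert>
#include <type_traits>

// Stand-in declaration for the proposed class template (no definition needed
// to name the type).
template <class T, class U>
class default_common_type;

// Detects whether std::common_type<T, U> has a member type.
template <class T, class U, class = void>
struct has_common_type : std::false_type {};

template <class T, class U>
struct has_common_type<T, U, std::void_t<typename std::common_type<T, U>::type>>
    : std::true_type {};

// Primary template: neither Same<T, U> nor an existing common type.
template <class T, class U,
          bool Same = std::is_same<T, U>::value,
          bool HasCommon = has_common_type<T, U>::value>
struct common_or_default {
  using type = default_common_type<T, U>;
};

// Same<T, U>: the result is T itself.
template <class T, class U, bool HasCommon>
struct common_or_default<T, U, true, HasCommon> {
  using type = T;
};

// Distinct types with an existing common type: use it.
template <class T, class U>
struct common_or_default<T, U, false, true> {
  using type = std::common_type_t<T, U>;
};

template <class T, class U>
using common_or_default_t = typename common_or_default<T, U>::type;

struct Left {};   // hypothetical unrelated types
struct Right {};
static_assert(std::is_same<common_or_default_t<int, int>, int>::value);
static_assert(std::is_same<common_or_default_t<int, long>, long>::value);
static_assert(std::is_same<common_or_default_t<Left, Right>,
                           default_common_type<Left, Right>>::value);
```

Dispatching on two boolean parameters keeps the three cases disjoint, so no partial-ordering tie-breaks between specializations are needed.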

Design options

(1) Ignore this document entirely: leave the Common requirements in place and force users to exhibit common types to satisfy the concepts.

(2) Incorporate the relaxed concepts described herein. Relax the requirements on everything in the standard library (to a cursory inspection nothing in N4382 actually needs a common type).

(3) Same as (2), but prefix the names of the stronger concepts with Strong and remove the prefix Weak from the concepts defined herein. The weaker concepts will be used much more, and should therefore have the shorter names.

(4) Same as (3), but remove the stronger concepts altogether from the library. If nothing in the library requires the stronger concepts, it should not define them. Users that want the stronger concept can easily define:

template<class T, class U>
concept bool StrongEqualityComparable() {
  return EqualityComparable<T, U> && Common<T, U> &&
    EqualityComparable<CommonType<T, U>>();
}

tomaszkam commented 9 years ago

Using the StrongEqualityComparable concept instead of WeaklyEqualityComparable greatly reduces the expressiveness of the language in situations where a conceptual common type for the entities represented by the types exists but is hard to implement.

Let's consider a programmer who wants to create a variable-length integer type big_integer and make it comparable with the built-in floating-point types. In that case both types share a conceptual common type, the real numbers, but providing an efficient implementation of this concept is not trivial. This leaves the programmer with the following options:

Do not define the heterogeneous comparisons. This solution reduces the usefulness of the class.
Define long double or big_integer to be the common type. This solution introduces a conversion that loses precision, which may not be acceptable. In C++11 such narrowing conversions between built-in types were eliminated for brace-initializers.
Define the common type as a variant that implements the comparison operators (as described in the paper). This fixes the problem, but affects every other place where the common type is used and is expected to be semantically in line with the original classes.
Define a class big_decimal that is constructible from both types without loss of precision. Implementing such a type is not trivial compared to implementing the heterogeneous comparison. In addition, it consumes the programmer's time on effort that is not necessary to solve their problem.

None of the above options seems acceptable, considering that such a decision would not be necessary if the WeaklyEqualityComparable concept were used in the library.

This comment rephrases a post from the Concepts forum.
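A smaller analogue of the lossy-common-type option can be shown with built-in integers, offered here only as an illustration and not as the big_integer case itself. For std::int64_t and std::uint64_t, the language-level common type is the unsigned type, so comparing through it gives a wrong answer for negative values, while a direct heterogeneous comparison does not. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Comparison through the common type: both operands are converted to
// std::uint64_t, so negative values wrap around.
constexpr bool equal_via_common_type(std::int64_t a, std::uint64_t b) {
  using C = std::common_type_t<std::int64_t, std::uint64_t>;
  return static_cast<C>(a) == static_cast<C>(b);
}

// Heterogeneous comparison: no value is ever converted lossily.
constexpr bool equal_heterogeneous(std::int64_t a, std::uint64_t b) {
  return a >= 0 && static_cast<std::uint64_t>(a) == b;
}

static_assert(std::is_same<std::common_type_t<std::int64_t, std::uint64_t>,
                           std::uint64_t>::value);
static_assert(equal_via_common_type(-1, UINT64_MAX));   // mathematically wrong
static_assert(!equal_heterogeneous(-1, UINT64_MAX));    // mathematically right
```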

sean-parent commented 9 years ago

The argument totally misses the point. When Alex refers to “type” - he doesn’t mean the language construct. He means the mapping from a representation to an entity (see page 2 of EoP).

A discriminated union represents a union of entities and it makes no more (or less) sense to define an equality across types in the union than it does on the individual types.

The requirement of an existence of a common type is the existential existence - a way to say that there exists a type into which both these types can be mapped and then compared to determine if they represent the same entity.

The challenge is how to state that requirement through the concept system. It is pointless to show that you can define a language type that satisfies the construct but is otherwise nonsense.

ericniebler commented 9 years ago

A discriminated union represents a union of entities and it makes no more (or less) sense to define an equality across types in the union than it does on the individual types.

I believe that's Casey's point. Since you can construct such a variant type that satisfies the concept, it doesn't make sense to force people to do so. But...

It is pointless to show that you can define a language type that satisfies the construct but is otherwise nonsense.

Right. Casey gets his variant argument from my paper, where the variant technique is used to implement common_iterator. The difference is that common_iterator is an iterator, and a useful one at that. The default_common_type suggested above is not a representation of a value in any semantically meaningful domain. You can't do anything with it, and its existence doesn't prove that comparing T and U is sound.

tomaszkam commented 9 years ago

The requirement of an existence of a common type is the existential existence - a way to say that there exists a type into which both these types can be mapped and then compared to determine if they represent the same entity.

What about the situations that I have described, where such a type can easily be pointed out but is hard to implement (the Palo Alto TR mentions Boost.Graph as an example)? Do we want to force users to resort to hacks (define a variant, allow a narrowing conversion) if they are unable to provide a representation of the conceptual common type?

tomaszkam commented 9 years ago

Also, I strongly believe that the CommonType<T, U> requirement should be removed from Relation<T, U> and, as a consequence, from StrictWeakOrder<T, U>. With the current requirements, code that uses equal_range with a custom comparator to extract employee objects with a given surname from a vector of employees that is already sorted by name will not be valid (there is no common type for string and employee).

I know that for that case I may use a projection that maps from an employee to its name. But what about a situation where constructing the key is expensive (for example, it returns a dynamic number of elements and requires allocation)? When a heterogeneous comparator is used, no temporary collection is needed, because I can issue lower_bound inside of it with another heterogeneous comparator.

A real-life example would be searching a vector of itineraries, sorted by the departures of their flights, via a vector of days. The projection function would require allocating a vector of flight days for each checked itinerary.
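The equal_range use case can be sketched as follows. The employee struct, the by_surname comparator, and with_surname are hypothetical names for illustration; the comparator is heterogeneous in that it only ever compares an employee with a string, in either order, and never needs a common type for the two.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical record type.
struct employee {
  std::string surname;
  std::string first_name;
};

// Heterogeneous comparator: orders an employee against a bare surname.
struct by_surname {
  bool operator()(const employee& e, const std::string& s) const {
    return e.surname < s;
  }
  bool operator()(const std::string& s, const employee& e) const {
    return s < e.surname;
  }
};

// Returns all employees with the given surname; `sorted` must already be
// sorted by surname.
inline std::vector<employee> with_surname(const std::vector<employee>& sorted,
                                          const std::string& surname) {
  auto range = std::equal_range(sorted.begin(), sorted.end(), surname,
                                by_surname{});
  return std::vector<employee>(range.first, range.second);
}
```

No employee is ever constructed from the search key and no key is copied out of an employee, which is the cost the comment is concerned about.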

asutton commented 9 years ago

I'm okay with all of this. If there exists a way to construct a common type, then it's sufficient to say that those overloads appeal to the construction --- or something like that. We don't really need a witness for the existential in the program.

The only issue I worry about is people providing some overloads for interoperability, but not all. But those will be their bug reports and not ours.

On May 27, 2015 2:24 PM, "Eric Niebler" notifications@github.com wrote:

A discriminated union represents a union of entities and it makes no more (or less) sense to define an equality across types in the union than it does on the individual types.

I believe that's Casey's point. Since you can construct such a variant type that satisfies the concept, it doesn't make sense to force people to do so. But...

It is pointless to show that you can define a language type that satisfies the construct but is otherwise nonsense.

Right. Casey gets his variant argument from my paper, where the variant technique is used to implement common_iterator. The difference is that common_iterator is an iterator, and a useful one at that. The default_common_type suggested above is not a representation of a value in any semantically meaningful domain. You can't do anything with it, and its existence doesn't prove that comparing T and U is sound.

CaseyCarter commented 9 years ago

A discriminated union represents a union of entities and it makes no more (or less) sense to define an equality across types in the union than it does on the individual types.

I believe that's Casey's point. Since you can construct such a variant type that satisfies the concept, it doesn't make sense to force people to do so. But...

Yes, exactly. The requirement "std::common_type_t<T, U> must model Foo" says nothing more or less than the requirement "cross-type operations must implement the requirements of Foo" about whether or not it is sound to try to model Foo with T and U in the first place. Both ensure that Foo<T, U> is consistent with Foo<T> and Foo<U>, which is the best that can be expected of a syntactic programming language construct. I suspect that soundness is not in general a computable property. There's a constructive argument to be made that if someone goes to the trouble to implement either requirement on T and U then it must be sensible to do so in their mental model of what a T and a U represent.

The default_common_type suggested above is not a representation of a value in any semantically meaningful domain. You can't do anything with it, and its existence doesn't prove that comparing T and U is sound.

Viewed purely as a programming language construct, use of CommonType<T, U> is type erasure. I can use the result of (foo ? t : u) without concern for whether the result came from t or from u. common_iterator erases the difference between an iterator and sentinel type to allow us to pass their values to algorithms that expect identically-typed iterators. default_common_type is a generalization that erases the difference between two types but maintains EqualityComparable/TotallyOrdered when possible. It is sadly useless for anything other than satisfying the Common requirement of the stronger concepts since there's no way to recover or visit the stored value - I withdraw the suggestion that the library should provide it. There's no use for it if nothing requires the stronger concepts and making it useful would rapidly turn into a replication of std::variant.

In an ideal world, CommonType<T, U> would be an automatically generated efficient implementation of a type that models the intersection of the concepts modeled by T and U. I strongly doubt such a thing could ever be implementable, but I envision that default_common_type would gradually approach it over time with the addition of "just one more useful concept" until it becomes a hideous mass of overloads and specializations.

Also, I strongly believe that the CommonType<T, U> requirement should be removed from Relation<T, U> and, as a consequence, from StrictWeakOrder<T, U>.

I'm not prepared to address Relation or Swappable at this time. I'm inclined to believe that Relation<R, T, U> should not require Common<T, U>, Relation<R, T> or Relation<R, U>, e.g., has_name(Person, string) seems a valid relation to me despite all of has_name(Person, Person), has_name(string, string), and Common<Person, string> being unsound.

Swappable is a bit of a mess right now considering its different semantics for lvalue and rvalue expressions. I'm not sure it's sound with or without Common. I've been trying to formulate wording that says two expressions are swappable if they denote stored values - not necessarily objects - of the same type, but it hasn't gone well. I suspect the committee would not agree with me that swapping vector<bool>::reference& should be valid and have different semantics (exchange references) than swapping vector<bool>::reference&& (exchange referents).

sean-parent commented 9 years ago

Not having the time to look at all the places these concepts are used - there are two distinct notions at play here:

Common type should be used for the definition of equality and natural total order (operator==() and operator<() ) as the common type is a mechanism to define the semantics of these operations for cross type comparisons.

The requirements for comparison functions passed as arguments may also allow for a projection function. If we made the projection function explicit we could eliminate this complexity and just use common type.

That is, a comparison, op, passed to lower_bound() needs to be implemented such that there exists a common type C and a projection function p such that C{p(*i)} op' C{v} is a strict weak ordering for all i in the range [f, l) and is consistent with the ordering used to sort [f, l) - that is, the range [f, l) is sorted as if by C{p(*i)} op' C{p(*j)}.

As Andrew points out, we don't need a witness to the common type (or to the projection function).

Implementing a "convenient" operator<() that is defined in terms of a projection should be a violation of the required semantics (a pre-condition since there is no way to detect syntactically).
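An explicit projection parameter, as described above, might be sketched in C++17 like this. The helper lower_bound_projected and the employee struct are hypothetical; the helper applies the projection to each element before comparing, so the comparison itself only ever sees the key type.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical record type.
struct employee {
  std::string surname;
  int id;
};

// lower_bound with an explicit projection: `comp` compares projected keys
// against the searched value. The range must be sorted by comp over proj.
template <class It, class V, class Proj, class Comp = std::less<>>
It lower_bound_projected(It first, It last, const V& value, Proj proj,
                         Comp comp = {}) {
  return std::lower_bound(
      first, last, value,
      [&](const auto& elem, const V& v) { return comp(proj(elem), v); });
}
```

With a projection that returns a reference to the key, no temporary is created; the expensive-key situation raised earlier in the thread arises when the projection has to build its result.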

Sean

On Wed, May 27, 2015 at 11:43 AM, tomaszkam notifications@github.com wrote:

Also, I strongly believe that the CommonType<T, U> requirement should be removed from Relation<T, U> and, as a consequence, from StrictWeakOrder<T, U>. With the current requirements, code that uses equal_range with a custom comparator to extract employee objects with a given surname from a vector of employees that is already sorted by name will not be valid.

I know that for that case I may use a projection that maps from an employee to its name. But what about a situation where constructing the key is expensive (for example, it returns a dynamic number of elements and requires allocation)? When a heterogeneous comparator is used, no temporary collection is needed, because I can issue lower_bound with another heterogeneous comparator.

tomaszkam commented 9 years ago

As Andrew points out, we don't need a witness to the common type (or to the projection function).

I don't argue with the existence of this requirement at the level of required semantics. For me, the whole issue is that the common type is currently a syntactic requirement (it must be provided in the program).

In my comments I was trying to present situations where conceptually all requirements are fulfilled (we are able to point out a common type and the appropriate projection), but imposing them syntactically places an unnecessary burden on the programmer.

tvaneerd commented 9 years ago

A couple of comments:

we already rely on the external "entities" when defining (single type) equality:

"(a == b) != false if and only if a is equal to b"

That "equal" doesn't mean == obviously. It means the equality of the entities that a and b represent.

We could use the exact same definition for cross-type equality, relying on the same "a == b iff the entities they represent are equal". Just because a and b are different types doesn't mean they don't represent the same entities.

In general, I'd prefer to NOT rely on things external to the language, but it seems we need to for at least single-type equality anyhow.

Also, in the "Proposed Design" section, you have:

((bool(t1 == u1) && bool(t1 == t2)) == bool(t2 == u1)) != false [== of T and U respects T's ==]

Depending on our definition of equal, that is already implied - i.e., t1 and t2 being equal implies that (for Regular functions) f(t1) == f(t2). So t1 == t2 and t1 == u1 gives us t2 == u1, assuming "== u1" is Regular. Do we already require == to be a Regular function? (should we?)

CaseyCarter commented 9 years ago

I don't think that "is equal to" is formally defined. Since a and b are objects, we should probably use the "have the same value" language from N2479. Note: I'm trying to avoid introducing "a == b iff the values of a and b represent the same entity" since "entity" is already a term of art in C++:

An entity is a value, object, reference, function, enumerator, type, class member, bit-field, template, template specialization, namespace, parameter pack, or this.

Doing so would require us to define abstract entity as a thing unrelated to entity which I believe would only degrade the clarity of the specification.

a == b and a != b are expressions that do not necessarily involve calling a function, so rather than requiring operator==(T, T) and operator==(T, U) to be regular functions we must require that those expressions are equality preserving. Although EqualityComparable<T> does not explicitly say "==/!= must be equality preserving," the semantic requirements effectively force them to be so. The semantic requirements for EqualityComparable<T, U> and WeaklyEqualityComparable<T, U> also effectively require ==/!= to be equality preserving.

I think the presentation could be simplified by introducing "equality preserving" as a term of art instead of dancing around specifying it explicitly in various concept definitions and hinting at it in notes. Ideally before the definition of "regular function" in [concepts.lib.general]/4, from N3351:

An expression is equality preserving if, given equal inputs, the expression results in equal outputs. Unless explicitly stated otherwise, any expression appearing in a concept definition in this document is required to be equality preserving.

[ Note: Not all input values are valid for a given expression, e.g., for integers a and b, the expression a/b is not well-defined when b is 0. This does not preclude the expression a/b being equality preserving. —end note ]

[ Note: An expression with type void is necessarily equality preserving. —end note ]

And rephrase /4 as:

A regular function is a function that is equality preserving, i.e., a function that returns equal output when passed equal input. A regular function that returns a value may copy or move the returned object, or may return a reference. [ Note: Regular functions may have side effects that do not participate in determining the output. —end note ]
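The distinction drawn in these definitions can be illustrated with three small functions. All names are hypothetical; the sketch assumes C++17 for the inline variable.

```cpp
#include <cassert>

// Equality preserving: equal inputs always give equal outputs.
inline int square(int x) { return x * x; }

// Still regular under the wording above: it has a side effect, but the side
// effect does not participate in determining the output.
inline int call_count = 0;
inline int counted_square(int x) {
  ++call_count;
  return x * x;
}

// Not equality preserving: the same (empty) input yields a different output
// on every call, because hidden state determines the result.
inline int next_ticket() {
  static int n = 0;
  return ++n;
}
```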

Having the explicit definition of equality preserving allows us to refer to it directly in the concepts.

Common: Given that CommonType<T, U>(t) and CommonType<T, U>(u) would now be implicitly required to preserve equality, we need only explicitly state that they preserve inequality as well.
Boolean (and others) is less broken, since it was already assuming the expressions are equality-preserving.
EqualityComparable<T> becomes:
- bool(a == b) is true if and only if a and b have the same value.
- bool(a != b) == !bool(a == b)
Reflexivity, transitivity, and symmetry of == are implied by the "have the same value" requirement, so we need not state them explicitly. The note in para 2 is obsoleted by the note added above in the definition of equality preserving.
WeaklyEqualityComparable<T, U> can be simplified:
Given objects t of type T and u of type U, types T and U model WeaklyEqualityComparable if and only if:
- bool(t == u) == bool(u == t) [== is symmetric with respect to T and U]
- bool(u != t) == bool(t != u) [!= is symmetric with respect to T and U]
- bool(t != u) == !bool(t == u) [!= is the complement of ==]
Since t == u and t != u must be equality preserving, both necessarily respect the == and != of T and U.
EqualityComparable<T, U>:
- bool(a == b) == bool(C(a) == C(b))
- bool(a != b) == !bool(a == b)
- bool(b == a) == bool(a == b)
- bool(b != a) == bool(a != b)
TotallyOrdered<T>: The note in para 2 is redundant. Requirements 1.1, 1.2, and 1.5 are implied by 1.4 in conjunction with the requirement that expressions are equality preserving. We're left with:
- Exactly one of bool(a < b), bool(a == b) or bool(b < a) is true.
- bool(a = b) == bool(b < a)
TotallyOrdered<T, U>:
- Exactly one of bool(a < b), bool(a == b) or bool(b < a) is true.
- bool(a < b) == bool(C(a) < C(b))
- bool(a > b) == bool(b < a)
- bool(b > a) == bool(a < b)
Function<F, Args...>: No change - it already has an explicit notation that the function call expression need not be equality preserving.
Relation<T, U>: Again shorten the requirements, e.g., bool(r(a, b)) == bool(r(C(a), C(b)))

Sorry, I've written yet another book. I'd be happy to submit pull requests for any of the above changes - I don't expect anyone to do the work for me ;)

asutton commented 9 years ago

Hi Tony,

we already rely on the external "entities" when defining (single type) equality:

"(a == b) != false if and only if a is equal to b"

That "equal" doesn't mean == obviously. It means the equality of the entities that a and b represent.

The intent of the wording is to work backwards. A definition of == that evaluates to true puts those objects into the "equal" relation. The wording for copy construction is written the same way. With "equal" meaning substitutable in regular functions.

Obviously, one can design an == that fails to meet that criteria. A type that does so should be considered ill-conceived (like ill-formed, but with aspersions cast upon the implementer).

Depending on our definition of equal, that is already implied - i.e., t1 and t2 being equal implies that (for Regular functions) f(t1) == f(t2). So t1 == t2 and t1 == u1 gives us t2 == u1, assuming "== u1" is Regular. Do we already require == to be a Regular function? (should we?)

We don't require == to be a regular function. I suspect we get into a logical cycle doing so. We should be defining "equals" as an equivalence relation, which (I believe?) would satisfy those requirements.

Andrew

ericniebler commented 9 years ago

@CaseyCarter writes:

I think the presentation could be simplified by introducing "equality preserving" as a term of art instead of dancing around

You could be right. I want to be careful not to define equality preserving in terms of equal values, and equal values in terms of equality preserving. I think your formulation above steers clear of that. Instead, it pushes the problem down. Now instead of failing to define "equal to", we lack a definition for what "equal values" means. It's turtles all the way down.

I haven't read N2479, so maybe it has a formulation that works. But I want to be careful not to adopt any new terminology that would cause ripples to spread through the entire library specification, even if it would be an improvement. My plate is full enough. :-)

BTW, requiring expressions in concepts to be equality preserving unless otherwise specified is a nice addition, and I think it's the right default (someone will correct me if I'm wrong, @asutton?). Using it to get reflexivity, transitivity, and symmetry in EqualityComparable for free is pretty clever (but IMO deserves a note).

asutton commented 9 years ago

I don't think that "is equal to" is formally defined. Since a and b are objects, we should probably use the "have the same value" language from N2479. Note: I'm trying to avoid introducing "a == b iff the values of a and b represent the same entity" since "entity" is already a term of art in C++:

How much of this document do you want to dedicate to describing what it means to "have a value"? Because if you adopt that language, you have to explicitly define the value of every single type in the language and the standard library.

Where will you document the value of things like double and void? And how will you word that?

I think this wording is best avoided. The ultimate way to observe whether two objects have the same value is to invoke their == operator (unless the class is fundamentally broken).

An expression is equality preserving if, given equal inputs, the expression results in equal outputs. Unless explicitly stated otherwise, any expression appearing in a concept definition in this document is required to be equality preserving.

Agree.

[ Note: Not all input values are valid for a given expression, e.g., the expression for ints a and b, a/b is not well-defined when b is 0. This does not preclude the expression a/b being equality preserving. —end note ]

Seems reasonable.

[ Note: An expression with a void value is necessarily equality preserving. —end note ]

Probably unnecessary.

A regular function is a function that is equality preserving, i.e., a function that returns equal output when passed equal input. A regular function that returns a value may copy or move the returned object, or may return a reference. [ Note: Regular functions may have side effects that do not participate in determining the output. —end note ]

"may have side effects" should probably be sufficient.
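A minimal sketch of the distinction in the wording above, assuming plain ints where "equal" is the ordinary built-in ==. The names square, call_count, logged_square, and next_ticket are hypothetical and only serve the illustration.

```cpp
#include <cassert>

// Equality preserving: equal inputs always give equal outputs.
inline int square(int x) { return x * x; }

// Hidden call counter used only to demonstrate a side effect.
inline int& call_count() {
    static int n = 0;
    return n;
}

// Still equality preserving: the side effect (bumping the counter)
// does not participate in determining the output.
inline int logged_square(int x) {
    ++call_count();
    return x * x;
}

// Not equality preserving: the result depends on hidden state, so two
// calls with equal inputs produce different outputs.
inline int next_ticket(int base) {
    static int counter = 0;
    return base + counter++;
}

inline void example_usage() {
    assert(square(3) == square(3));
    assert(logged_square(4) == logged_square(4));
    assert(next_ticket(10) != next_ticket(10));
}
```

Under this reading, logged_square could appear in a concept definition without any special wording, while next_ticket would need an explicit "not required to be equality preserving" annotation.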

Andrew

CaseyCarter commented 9 years ago

Now instead of failing to define "equal to", we lack a definition for what "equal values" means.

Values are "one discrete element of an implementation-defined set of values" ([basic.types]/4). I think we can fall back on the mathematical notion of identity - the standard refers to "equal values" in many places, IIRC.

asutton commented 9 years ago

Now instead of failing to define "equal to", we lack a definition for what "equal values" means.

Values are "one discrete element of an implementation-defined set of values" ([basic.types]/4). I think we can fall back on the mathematical notion of identity - the standard refers to "equal values" in many places, I'm sure.

Implementation-defined refers to the compiler and the standard library. Adopting that notion would vacate the idea of equality for types that are not implementation-defined.

Andrew

asutton commented 9 years ago

You could be right. I want to be careful not to define equality preserving in terms of equal values, and equal values in terms of equality preserving. I think your formulation above steers clear of that. Instead, it pushes the problem down. Now instead of failing to define "equal to", we lack a definition for what "equal values" means. It's turtles all the way down.

Equality is like that. My recommendation: don't try to define what equality is. We have observers of equality: copy constructors and == operators.

An expression that is equality preserving is one that, given equal inputs, produces equal outputs.

Yes, you can break these definitions by defining bad copy constructors and == operators. So don't do that.

I haven't read N2479, so maybe it has a formulation that works. But I want to be careful not to adopt any new terminology that would cause ripples to spread through the entire library specification, even if it would be an improvement. My plate is full enough. :-)

Adopting wording from this paper would do that.

BTW, requiring expressions in concepts to be equality preserving unless otherwise specified is a nice addition, and I think it's the right default (someone will correct me if I'm wrong, @asutton?). Using it to get reflexivity, transitivity, and symmetry in EqualityComparable for free is pretty clever (but IMO deserves a note).

We do exactly that in n3351. Run with it :)

Andrew

CaseyCarter commented 9 years ago

Implementation-defined refers to the compiler and the standard library. Adopting that notion would vacate the idea of equality for types that are not implementation-defined.

Fair enough. I appeal to N2479 for its excellent definition of "value," which was presumably rejected by the committee. I suppose we could continue with the handwaving definition of "equal values" that the Standard uses now.

We do exactly that in n3351. Run with it :)

Yep - indirectly stolen from N3351.

ericniebler commented 9 years ago

I appeal to N2479 for its excellent definition of "value," which was presumably rejected by the committee.

As you know, it's never safe to assume the committee actively rejected a paper just because it wasn't adopted, or if it was rejected that the committee was correct in doing so. ;-)

CaseyCarter commented 9 years ago

it's never safe to assume the committee actively rejected a paper just because it wasn't adopted

I'm sure it wouldn't be the only paper that fell by the wayside in 2008 during the rush to finish C++09 ;)

tvaneerd commented 9 years ago

So I went through the standard and read all occurrences of "value". Because if that isn't the definition of 'fun', I don't want to know what is.

(I actually went through it a month ago, so I may have forgotten some of the details of my conclusions, sorry.)

I agree with John's definition of value, however I don't think it can work as the definition throughout the standard. It might be workable in Library, but not in Core, I think.

The standard only has these base values: true, false, numbers (as in math), some character things, and pointers. ie bool, ints, floats, double, char, char *, etc. (and oddities like nullptr - which is a unique value).

So some of these do reference their "platonic" outside-the-system values. In particular the numbers. The characters are not as "far" outside the system, they are mostly implementation defined (with some constraints - ie unsigned char is also mathematical, guaranteed to follow modulo arithmetic, etc). The booleans are actually inside the system, as they are completely defined by how they are used in if/while/for etc expressions (and conversion to number). And pointers are completely within the system.

So numbers are where we "lean" on platonic concepts outside the system. We lean on mathematics.

It is probably not hard to define equality on these base types, actually. For numbers, we refer to math. For boolean, true equals true, false equals false, and true does not equal false. For characters, we look at them as numbers, not as the characters they represent. For pointers, I think it may already be defined in the standard, as there is lots of wording on whether something points to a valid object, when base and derived can have the same pointer, etc, etc, etc.

Next, the value of a struct or class, although never expressly stated in the standard, is clearly the product of the values of all the members of the struct or class. All of them, not just the ones that transfer via the copy constructor etc. There are many places in the core wording that assume that "value" means all the values within the object (but not necessarily the padding bits, as these do not make up the value). ie atomic must treat all members as part of the value (else you would have surprises). The whole memory model assumes value means all the sub-values of an object.

Obviously the "class-defined" (via copy-ctor, assignment, ==, etc) value of a class aligns well with the core concept of value in the standard, and that's why the language allows you to override the copy-ctor, and will call it in places where the "value" of the object is meant to be preserved (ie pass by value and return by value). And we allow copies to be elided for the same reason. But there are places in the standard where the class-defined value and the core-defined value are not the same.

I don't know how we would resolve that in the standard - we would need another word for "value". (Or spell out class-defined and core-defined when necessary, but that is a bit verbose).

I now almost forget how that relates to the discussion, but that's what I found when reading 3000+ occurrences of "value" in the standard.

CaseyCarter commented 9 years ago

We had a bit more discussion of "what equals means" in the 20150713 telecon. I think the final conclusion we arrived at is that we don't really care: the designer of a value type defines what equality means for that type as an equivalence relation. The only semantically important feature is that equality-preserving expressions are required to preserve it, and for models of EqualityComparable, == is exactly that equivalence relation.

asutton commented 9 years ago

We had a bit more discussion of "what equals means" in the 20150713 telecon. I think the final conclusion we arrived at is that we don't really care: the designer of a value type defines what equality means for that type as an equivalence relation. The only semantically important feature is that equality-preserving expressions are required to preserve it, and for models of EqualityComparable, == is exactly that equivalence relation.

That's the gist more or less, but I don't think we want an active discussion during the telecon. Just review.

ericniebler commented 9 years ago

@asutton took a stab at defining "equal to" but it had problems. Maybe @CaseyCarter has a point, and this is a better direction:

The only semantically important feature is that equality-preserving expressions are required to preserve it, and for models of EqualityComparable, == is exactly that equivalence relation.

In this formulation, we don't have to appeal to math or equality of built-in types or even define "value". It's just an equivalence relation.

I think we get caught in infinite regress if we try to say equality is unaffected by equality-preserving expressions, though.

tvaneerd commented 9 years ago

So the assumptions we make are that

== is an equivalence relation (and we are able to define "equivalence relation", including that the result of == must be convertible to bool, so that we can then use bool's definition of == to define (a == b) == (b == a), etc.; ie the middle == there is a well-defined bool == bool)
all expressions preserve this relation, and we can define what we mean by that using the types' == (I would call "equality-preserving expressions" "Regular expressions"... if that didn't already have a meaning)

I think that works.

And now, note that we could do the same for T == U, without CommonType, if we so choose.

CaseyCarter commented 9 years ago

I think we get caught in infinite regress if we try to say equality is unaffected by equality-preserving expressions, though.

Sorry, I should have said that for models of EqualityComparable the relation induced by == must be exactly the equality relation. There's no recursion in the definitions then since both "equality-preserving" and == depend on the definition of the equality relation and there are no dependencies in the opposite direction.

tvaneerd commented 9 years ago

Sorry, I should have said that for models of EqualityComparable the relation induced by == must be exactly the equality relation. There's no recursion in the definitions then since both "equality-preserving" and == depend on the definition of the equality relation and there are no dependencies in the opposite direction.

And what is the equality relation?

CaseyCarter commented 9 years ago

And what is the equality relation?

The designer of a type determines what "a equals b" means for values of that type; that relation ("equality relation") is necessarily an equivalence relation. For example, I define struct default_sentinel { }; and decide that all values of my type are equal.

If the type is intended to model a concept, then the expressions that concept requires to be equality-preserving must be implemented to respect the type's equality relation. If I decide that default_sentinel should model EqualityComparable, then I have to implement == and != so that they respect my equality relation:

bool operator==(default_sentinel, default_sentinel) { return true; }
bool operator!=(default_sentinel, default_sentinel) { return false; }

tvaneerd commented 9 years ago

So "for models of EqualityComparable the relation induced by == must be exactly the equality relation" means "for models of EqualityComparable the relation induced by == must be exactly the relation defined by ==" ?

tvaneerd commented 9 years ago

Or "for models of EqualityComparable, the relation induced by == must be an Equivalence Relation" (and != must be !(==) )

CaseyCarter commented 9 years ago

The definition of equality for a type is a design feature orthogonal to the definition or existence of the == operator. This must be the case since we cannot require all types to implement == yet we must still have a notion of what it means for two objects to be equal. Copyable for example doesn't require ==, but it requires that the result of copy construction is equal to the object being copied. The exact definition of equality is not important, except that "a equals b" must be an equivalence relation: it must be reflexive (a equals a is true), symmetric (a equals b if and only if b equals a), and transitive (if a equals b and b equals c, then a equals c).

"for models of EqualityComparable the relation induced by == must be exactly the equality relation" means that bool(a == b) is true if and only if a equals b. That fact together with the requirements that a == b be equality preserving and that "equals" is an equivalence relation suffice to prove that bool(a == b) is an equivalence relation as well.
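To make that concrete, here is a rough sketch under stated assumptions: Buffer is a hypothetical type whose designer decides that only size is part of the value, and behaves_like_equivalence is a hypothetical helper that spot-checks the three axioms on sample values. It can only test samples; it proves nothing in general.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical value type: the designer decides that two buffers are
// "equal" when their sizes match; the capacity hint is not part of
// the value.
struct Buffer {
    std::size_t size;
    std::size_t capacity_hint;
};

// == is implemented so that bool(a == b) is true exactly when
// "a equals b" under the designer's equality relation.
inline bool operator==(Buffer const& a, Buffer const& b) { return a.size == b.size; }
inline bool operator!=(Buffer const& a, Buffer const& b) { return !(a == b); }

// Spot-checks reflexivity, symmetry, and (for the order a, b, c)
// transitivity on three sample values.
template <class T>
bool behaves_like_equivalence(T const& a, T const& b, T const& c) {
    bool reflexive = (a == a) && (b == b) && (c == c);
    bool symmetric = ((a == b) == (b == a)) &&
                     ((b == c) == (c == b)) &&
                     ((a == c) == (c == a));
    bool transitive = !((a == b) && (b == c)) || (a == c);
    return reflexive && symmetric && transitive;
}

inline void example_usage() {
    Buffer a{3, 8}, b{3, 16}, c{3, 32};
    assert(a == b);                              // capacity_hint is ignored
    assert(behaves_like_equivalence(a, b, c));
    Buffer copy = a;                             // copy construction
    assert(copy == a);                           // yields an equal object
}
```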

tvaneerd commented 9 years ago

So we are not defining what equals (or value) means then?

ericniebler commented 9 years ago

We are defining what "equals" means, insofar as how the term is used in the library. And no, we should not get into the business of defining "value", IMO. That's a rat hole I prefer to avoid.

EDIT: our definition of "equals" only needs to be strong enough to give the semantic constraints meaning. I think we're almost there.

tvaneerd commented 9 years ago

bool(a == b) is true if and only if a equals b. That fact together with the requirements that a == b be equality preserving and that "equals" is an equivalence relation suffice to prove that bool(a == b) is an equivalence relation as well.

You don't need "together with a == b be equality preserving". If bool(a == b) IFF a equals b, then == has all the properties of "equals" so you already have that == is an equivalence relation. As well as that == is equality preserving.

Also "equals" (and ==) is equality preserving via transitivity (and symmetry to get substitution in both positions).

tvaneerd commented 9 years ago

We are defining what "equals" means, insofar as how the term is used in the library. And no, we should not get into the business of defining "value", IMO. That's a rat hole I prefer to avoid.

Agreed on value. So where do you think we are on equals then? Do we need "equals" for types that don't have == ?

ericniebler commented 9 years ago

Do we need "equals" for types that don't have == ?

Yes.

CaseyCarter commented 9 years ago

Do we need "equals" for types that don't have == ?

To be precise: equality-preserving expressions that have an operand or result of non-void type T need "equals". There are concepts that require equality-preserving expressions for such a T which do not require == to be defined for T. A type that does not participate in equality-preserving expressions need not have a notion of equality.

tvaneerd commented 9 years ago

So, the options, as I see them

1. handwave - don't define equality (similar to not defining value). Assume "you know what I mean"
2. platonic - reach outside the system and say equality is defined there (is this the same as 1?)
3. core - say equality is member-wise equality, down to base types, where it is well defined
4. == - define equality as "whatever == does" with the additional constraints that == must be an equivalence relation. Screw types that don't have equality.
4.5. Use identity for types that don't define ==. (ie two Mutexes are == only if they are the same object)
4.6. Use "they are all equal" for any types (without ==) that have no members [*]
5. Some combination of the above (ie use == for types that have it, use core otherwise)

[*] I've needed std::less<T> to have ==. All instances of std::less<Foo> should be equal.
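A small sketch of what the "they are all equal" option could look like for a memberless function object. less_than is a hypothetical stand-in, since std::less itself does not define ==.

```cpp
#include <cassert>

// Hypothetical stand-in for std::less<T>; it has no data members, so
// every instance behaves identically.
template <class T>
struct less_than {
    bool operator()(T const& a, T const& b) const { return a < b; }
};

// With no state to distinguish instances, treating all values as equal
// is a natural choice of equality relation.
template <class T>
bool operator==(less_than<T> const&, less_than<T> const&) { return true; }
template <class T>
bool operator!=(less_than<T> const&, less_than<T> const&) { return false; }

inline void example_usage() {
    less_than<int> p, q;
    assert(p == q);
    assert(p(1, 2) == q(1, 2));  // equal objects are interchangeable
}
```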

Am I missing anything?

asutton commented 9 years ago

handwave - don't define equality (similar to not defining value). Assume "you know what I mean"

We can handwave a little more strongly and say that the semantics of == is defined by each type, but it must be an equivalence relation.

platonic - reach outside the system and say equality is defined there (is this the same as 1?)

It's pretty close. We effectively say this in N3351, and defined an omniscient eq() predicate that "knows" how to evaluate for each type.

core - say equality is member-wise equality, down to base types, where it is well defined

A non-starter because it omits user-defined definitions.

== - define equality as "whatever == does" with the additional constraints that == must be an equivalence relation. Screw types that don't have equality.
4.5. Use identity for types that don't define ==. (ie two Mutexes are == only if they are the same object)
4.6. Use "they are all equal" for any types (without ==) that have no members [*]

Reasonable. I think that's the intent of what we have now.

[*] I've needed std::less<T> to have ==. All instances of std::less<Foo> should be equal.

Am I missing anything?

Yes...

There was, at one point, a version of the wording that tried to give a constructive definition of "equals" based on the observation of certain operations and declarations. It wasn't fully fleshed out, and needs some work, but I think that it was the right approach.

The basic idea is that there is an equivalence relation called "equals" and it is defined by the observation of the results of certain operations and declarations.

If T is equality comparable and a == b is true, then a is equal to b.
If T is copy constructible then the declaration "T a = b" declares a to be equal to b
If T is copy assignable, then "a = b" assigns a to be equal to b.

And then there is some waffling about modification after the fact. And if a UDT defines == or copy in a way that does not preserve equality in an expression where it is required, the program is ill-formed, no diagnostic required (a hard library error).
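For a type that has all three operations, the observations can be cross-checked against each other. This is only a sketch: the two helper names are hypothetical, and the checks assume T also provides == so that the result can be observed.

```cpp
#include <cassert>
#include <string>

// After "T a = b", a is expected to be equal to b.
template <class T>
bool copy_construction_yields_equal(T const& b) {
    T a = b;
    return a == b;
}

// After "a = b", a is expected to be equal to b, whatever a held before.
template <class T>
bool copy_assignment_yields_equal(T a, T const& b) {
    a = b;
    return a == b;
}

inline void example_usage() {
    assert(copy_construction_yields_equal(std::string("abc")));
    assert(copy_assignment_yields_equal(std::string("x"), std::string("y")));
}
```

A type whose copy or == fails these checks is the kind of "bad" definition that would break the semantic requirements wherever equality preservation is required.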

This clearly can't be the complete set of definitions that define ==. Every type defines several:

vector<int> v1 { 0, 1, 2 }; vector<int> v2 { 0, 1, 2 };

v1 and v2 are obviously equal.

But this needs to be done for every data type in the library.

To me, this feels like the right approach. I tried talking to John about it, but he wasn't hearing it :)

Andrew

tvaneerd commented 9 years ago

If T is copy constructible then the declaration "T a = b" declares a to be equal to b

If T is copy assignable, then "a = b" assigns a to be equal to b.

I think you definitely want something somewhere that says "we assume T a = b means we can substitute a for b, as does a = b". Otherwise our algorithms can't make temporaries, which can really hamper your day.

Not sure if you can define equals from construct/assign, but you definitely want to be able to assume those aspects of Regular. (Along with copies being disjoint)

I think, overall, for T == T the goal is to get substitutability. So we say if a == b, then a can substitute b. Same for T a = b, and a = b.

That makes the value of T the important part. The value can be copied around all you want, it is still the same value.

HOWEVER, for T == U, the goal is NOT substitution. Because a single function can't substitute a T for a U (well, template functions can, but we then consider each instantiation a different function). Nor is the goal 'value' - we do NOT expect that you can copy a value from T to U to W to X then back to T and have the same value. (At least I don't think we have algorithms that need that.)

So for T == U the definition of "equals" isn't very meaningful, and it really is just the syntax - either the comparison compiles or doesn't.

tomaszkam commented 8 years ago

Somewhat related question: I would like to ask if my understanding is correct. For the following definitions:

struct Employee { 
  std::string const& id() const;
  //Other members
};

struct IDComparator
{
  bool operator()(Employee const& e, std::string const& id) const
  { return e.id() < id; }

  bool operator()(std::string const& id, Employee const& e) const
  { return id < e.id(); }
};

std::vector<Employee> ve; //Sorted by id()

The STL1 code in the following form will no longer compile: std::binary_search(ve.begin(), ve.end(), std::string("some-id"), IDComparator()); because all of the following would now need to be satisfied:

IndirectCallableStrictWeakOrder<IDComparator, std::string const*, std::vector<Employee>::iterator>
StrictWeakOrder<IDComparator, std::string, Employee>
Relation<IDComparator, std::string, Employee>
Common<std::string, Employee> which is not satisfied

Will it need to be rewritten as follows? std::binary_search(ve.begin(), ve.end(), std::string("some-id"), std::less<>{}, std::mem_fn(&Employee::id)); And will every instance of a heterogeneous comparator need to be replaced with a projection and a homogeneous comparator?

CaseyCarter commented 8 years ago

std::binary_search will continue to work as it does in C++14 since no one has proposed a change to std::binary_search. std::experimental::ranges::binary_search would reject those arguments since Relation<IDComparator, std::string, Employee> is not satisfied because all of:

Common<std::string, Employee>
Relation<IDComparator, std::string>
Relation<IDComparator, Employee> are unsatisfied.

As you suggest,

ranges::binary_search(ve, std::string("some-id"), std::less<>{}, &Employee::id);

would be a valid rewrite. If the semantics of Employee::id are such that no two employees can ever have the same ID, then it would be sound for string to be the common type of string and Employee, which is easily expressed by adding a conversion from Employee to string:

struct Employee { 
  std::string const& id() const;
  operator std::string const& () const {
    return id();
  }
  //Other members
};

allowing us to write instead:

ranges::binary_search(ve, std::string("some-id"), std::less<std::string>{});

(No, std::less<> won't work without specifying string because string's less than operator:

template<class charT, class traits, class Allocator>
  bool operator< (const basic_string<charT,traits,Allocator>& lhs,
                  const basic_string<charT,traits,Allocator>& rhs) noexcept;

can't deduce through the user-defined conversion.)
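As a rough illustration of the conversion approach, the following sketch uses plain C++14 std::binary_search rather than the ranges overload, so it shows only the comparator side of the argument. The id_ data member and the contains_id helper are assumptions added for the example; the original only declares id().

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Simplified Employee: the id is stored directly (an assumption).
struct Employee {
    std::string id_;
    std::string const& id() const { return id_; }
    // Conversion that lets std::string act as the common type.
    operator std::string const&() const { return id(); }
};

// Looks up an id in a vector sorted by id(). std::less<std::string>
// fixes the parameter type, so each Employee argument is converted
// through the user-defined conversion before being compared.
inline bool contains_id(std::vector<Employee> const& ve, std::string const& id) {
    return std::binary_search(ve.begin(), ve.end(), id,
                              std::less<std::string>{});
}

inline void example_usage() {
    std::vector<Employee> ve{Employee{"a1"}, Employee{"b2"}, Employee{"c3"}};
    assert(contains_id(ve, "b2"));
    assert(!contains_id(ve, "zz"));
}
```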

tomaszkam commented 8 years ago

So for the following use case (from actual production code base):

struct Date; //date representation, may be boost::gregorian::date 

struct Leg
{
  Date departure_date() const;
  //Other fields. It does not make sense to have conversion to the date
};

struct Itinerary
{
    std::vector<Leg> const& legs() const;
    //Other fields. It does not make sense to have conversion to the vector of legs
};

struct LegDepartureComparator
{
    bool operator()(Date d, Leg const& l) const
    { return d < l.departure_date(); }

    bool operator()(Leg const& l, Date d) const
    { return  l.departure_date() < d; }
};

struct ItineraryDepartureComparator
{
    bool operator()(std::vector<Date> const& ds, Itinerary const& i) const
    { return std::lexicographical_compare(ds.begin(), ds.end(),
                                          i.legs().begin(), i.legs().end(),
                                          LegDepartureComparator()); }

    bool operator()(Itinerary const& i, std::vector<Date> const& ds) const
    { return std::lexicographical_compare(i.legs().begin(), i.legs().end(),
                                          ds.begin(), ds.end(),
                                          LegDepartureComparator()); }
};

std::vector<Itinerary> itineraries; //sorted lexicographically on departures on each leg
std::vector<Date> dates; //specific departure combination to find

Now, if I would like to migrate the following code to ranges: std::equal_range(itineraries.begin(), itineraries.end(), dates, ItineraryDepartureComparator());

I would need to either:

prepare projection from Itinerary to std::vector<Date>
add above as conversion to Itinerary

Adding such a projection would create a temporary vector for each comparison (we are not proposing any views of modified containers at this point), which would be unacceptable.

I want to point out that migration to STL2 would either require a major rewrite of such cases or be unacceptable in some situations (like the one presented above). I would like to see why we consider such code to be so flawed (mathematically unsound, exploiting bugs in the old specification) that it needs to be rewritten during migration to STL2 (I assume that we want people to migrate). I personally failed to find any problem in the above example, but I am used to seeing and writing code similar to it.

asutton commented 8 years ago

Just an observation: a better solution to the original problem would be to sort() on the projected employee id instead of trying to build a relation to cross-compare employees and strings.

The reason that this seems problematic is that the problem is being solved inappropriately. Of course making employees and strings comparable is hard. They're fundamentally different abstractions. If you want to force that view, then you should be made to work harder to do it.

I would be very unhappy if the wrong solution leads to concepts weakened to the point of simple syntactic fragments.

Andrew

akrzemi1 commented 8 years ago

Being a clean solution and expressing your intentions clearly is one important goal. Another is not forcing the users to pay a run-time penalty, or offering a solution that is slower than what users can easily do manually.

The problem with Employee is quite trivial, but the other problem (with Itinerary, Leg, Date) is practical, and I have faced it myself. It can be summarized as: what if computing a projection is expensive at run-time, but offering a mixed comparison (which in fact implicitly implements a projection) is cheap? And what if providing a 'common type' is impossible or impractical?
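A condensed sketch of that trade-off, under loud assumptions: Date is reduced to an int, the data members are invented, and departure_dates and count_matches are hypothetical helpers. The mixed comparator walks both sequences in place, while the projection has to build a new vector every time it is invoked.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

using Date = int;  // assumption: stand-in for a real date type

struct Leg {
    Date departure;
    Date departure_date() const { return departure; }
};

struct Itinerary {
    std::vector<Leg> legs_;
    std::vector<Leg> const& legs() const { return legs_; }
};

struct LegDepartureComparator {
    bool operator()(Date d, Leg const& l) const { return d < l.departure_date(); }
    bool operator()(Leg const& l, Date d) const { return l.departure_date() < d; }
};

// Mixed comparison: no temporaries, both ranges are read in place.
struct ItineraryDepartureComparator {
    bool operator()(std::vector<Date> const& ds, Itinerary const& i) const {
        return std::lexicographical_compare(ds.begin(), ds.end(),
                                            i.legs().begin(), i.legs().end(),
                                            LegDepartureComparator());
    }
    bool operator()(Itinerary const& i, std::vector<Date> const& ds) const {
        return std::lexicographical_compare(i.legs().begin(), i.legs().end(),
                                            ds.begin(), ds.end(),
                                            LegDepartureComparator());
    }
};

// Projection alternative: allocates and fills a fresh vector per call.
inline std::vector<Date> departure_dates(Itinerary const& i) {
    std::vector<Date> out;
    out.reserve(i.legs().size());
    for (Leg const& l : i.legs()) out.push_back(l.departure_date());
    return out;
}

// Counts itineraries whose departures match `dates`, using the mixed
// comparator on a vector sorted lexicographically by departures.
inline std::ptrdiff_t count_matches(std::vector<Itinerary> const& its,
                                    std::vector<Date> const& dates) {
    auto r = std::equal_range(its.begin(), its.end(), dates,
                              ItineraryDepartureComparator());
    return r.second - r.first;
}

inline void example_usage() {
    std::vector<Itinerary> its{Itinerary{{Leg{1}, Leg{5}}},
                               Itinerary{{Leg{2}, Leg{3}}}};
    assert(count_matches(its, {2, 3}) == 1);
}
```

With a projection passed to a binary search, departure_dates would run once per comparison, i.e. on the order of log n allocations per lookup, which is the cost the mixed comparator avoids.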

tomaszkam commented 8 years ago

Just an observation: a better solution to the original problem would be to sort() on the projected employee id instead of trying to build a relation to cross-compare employees and strings.

But the problem that I presented was checking whether an employee with a given id is present in the collection (the input is an id, not an employee). I do not have an employee object; the actual task is to find it.

asutton commented 8 years ago

Being a clean solution and expressing your intentions clearly is one important goal. Another is not forcing the users to pay a run-time penalty, or offering a solution that is slower than what users can easily do manually.

There is a trivial solution for the Employee problem. It shouldn't be motivating this discussion. The latter problem is better.

FWIW, N3351 (now 4 years old) discusses exactly this issue in appendix D and proposes an alternative specification.

My opinion on this particular matter has changed over the past couple of years. I would be in favor of dropping the common type requirements from all of the cross-type concepts. Any overloads or comparators that make a reasonable syntactic claim of interoperability between types should be sufficient.

We can formulate the semantics separately, but they may not be testable -- as you say, when a common type cannot be constructed.

Andrew

asutton commented 8 years ago

I thought that this satisfied the requirements?

ranges::binary_search(ve, std::string("some-id"), std::less<>{}, &Employee::id);

Andrew Sutton


tomaszkam commented 8 years ago

I thought that this satisfied the requirements?

And the question is: if we have STL1 code like the above (that was the only way to achieve it), do we really need to force users to rewrite it to use a projection when migrating to STL2? Why does the projection need to be applied on the iterator, instead of being integrated inside the comparator? Especially if I already have an implementation of the latter. Reimplementing the projection is a larger task than invoking .out() on the result.

I thought that STL2 was intended to be compatible with existing code modulo bugs, and that migration should be fairly trivial.

tomaszkam commented 8 years ago

And the question is: if we have STL1 code like the above (that was the only way to achieve it), do we really need to force users to rewrite it to use a projection when migrating to STL2? Why does the projection need to be applied on the iterator, instead of being integrated inside the comparator? Especially if I already have an implementation of the latter. Reimplementing the projection is a larger task than invoking .out() on the result.

I thought that STL2 was intended to be compatible with existing code modulo bugs, and that migration should be fairly trivial.

In other words, I would like to see a convincing explanation of why the code examples I have presented are broken (bugged, unsound) and need to be rewritten, one that I and every other programmer could present to our colleagues to explain why we need to rewrite them during migration of our code to STL2.