Comparison with the C++ Core Guidelines

GunpowderGuy commented 1 year ago

I think the following is not true: "To illustrate, consider the type declaration below, which follows the C++ Core Guidelines. The default rules for identifying owners do not apply in this example, requiring the a user-defined annotation to declare Matrix3 as an owner type of double". Using C arrays does not follow the C++ guidelines, if not explicitly, then at least because it doesn't allow upholding other core rules. The idiomatic choice, std::array, is a container, and as such Matrix3 would automatically be an owner.

kyouko-taiga commented 1 year ago

I don't think there's a ban on C-style arrays. I found multiple examples of such arrays directly in the guidelines.

The point of the statement is to show that automatic deduction of owner types cannot always apply, thus requiring user annotations. That could be because we're using C-style arrays or because of another reason. Regardless, the core guidelines do not offer a clear and unambiguous way to identify whole/part relationships; the detection logic is ad-hoc.

GunpowderGuy commented 1 year ago

Could you provide examples where part/whole relationships can't be easily identified (read: not at all, or only with annotations) in modern code bound by the C++ Core Guidelines? I think the problem might be that they weren't put into practice (at least until recently), so some clarifications, such as banning old arrays, weren't added.

kyouko-taiga commented 1 year ago

Well, the core issue (pun intended) is that without a clear and unambiguous specification of what is and isn't allowed, it is very difficult to draw a complete picture and we're left with a guessing game. The paper is written w.r.t. what the CG actually specify, not what they might become or the way they may be used in practice. A "formal" specification is binding, common wisdom isn't.

Perhaps we'll be able to redraw the comparison between the Val model and the CG once they get updated. To the best of my knowledge, the latest version doesn't specify the necessary restrictions to identify owners automatically in the subset of the language they accept. Among other things, C-style arrays are not banned, global variables are not banned (although discouraged), mutable is not banned, etc.

The CG give us this definition:

The following standard or other types are treated as-if annotated as Owners, if not otherwise annotated and if not SharedOwners:

Every type that satisfies the standard Container requirements and has a user-provided destructor. (Example: vector.) DerefType is ::value_type.

Every type that provides unary * and has a user-provided destructor. (Example: unique_ptr.) DerefType is the ref-unqualified return type of operator*.

Every type that has a data member or public base class of an Owner type.
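
As a rough illustration of how these rules play out, here is a minimal C++ sketch. The `Matrix3Raw`, `Matrix3Array`, and `Matrix3Vector` structs and the `could_have_user_provided_destructor` helper are hypothetical names introduced only for this example. Standard C++ has no trait for "user-provided destructor", so the sketch relies on a weaker fact: a trivially destructible type cannot have one. It is a necessary condition, not an implementation of the rules.

```cpp
#include <array>
#include <type_traits>
#include <vector>

// Hypothetical matrix types, named for this sketch only.
struct Matrix3Raw    { double elements[9]; };
struct Matrix3Array  { std::array<double, 9> elements; };
struct Matrix3Vector { std::vector<double> elements; };

// Assumption: a trivially destructible type cannot have a user-provided
// destructor, so this is only a necessary condition for the first two
// quoted rules, not a complete check.
template <class T>
constexpr bool could_have_user_provided_destructor =
    !std::is_trivially_destructible_v<T>;

// std::vector passes the necessary condition, so a struct holding one
// can qualify as an Owner through the data-member rule.
static_assert(could_have_user_provided_destructor<std::vector<double>>);

// std::array<double, 9> is trivially destructible, so the Container rule
// as quoted cannot apply to it.
static_assert(!could_have_user_provided_destructor<std::array<double, 9>>);

static_assert(!could_have_user_provided_destructor<Matrix3Raw>);
static_assert(!could_have_user_provided_destructor<Matrix3Array>);
static_assert(could_have_user_provided_destructor<Matrix3Vector>);
```

Under the rules as quoted, only the `std::vector`-based variant could be deduced as an Owner without an annotation; the C-style array and `std::array` variants both fail the destructor condition.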

I believe we can automatically identify whole/part relationships with these rules if and only if we ban every feature that may cause reference semantics. That includes at least pointers, references, global variables, and static variables. To satisfy the immutability guarantees required to prove freedom from invalidation, we also need to ban mutating fields. It's possible I'm missing some or many things. I am not a C++ expert by any measure.
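
A small C++ sketch of why reference semantics gets in the way of deducing whole/part relationships. The `Whole` and `Alias` types and the two helper functions are hypothetical, made up for this illustration.

```cpp
#include <vector>

// Hypothetical types, named for this sketch only.
struct Whole {
    std::vector<int> parts;   // stored by value: each copy owns its own parts
};

struct Alias {
    std::vector<int>* parts;  // raw pointer: copies refer to the same vector
};

// Copying a type that stores its parts by value yields independent parts.
inline bool copies_are_independent() {
    Whole a{{1, 2, 3}};
    Whole b = a;
    b.parts[0] = 42;          // mutates b's parts only
    return a.parts[0] == 1;
}

// Copying a type that stores a pointer yields two views of the same parts.
inline bool copies_share_state() {
    std::vector<int> storage{1, 2, 3};
    Alias a{&storage};
    Alias b = a;
    (*b.parts)[0] = 42;       // visible through a as well
    return (*a.parts)[0] == 42;
}
```

Both functions return `true`. In the second case neither `a` nor `b` is the "whole" that the vector is part of, which is the kind of situation the quoted rules cannot classify without further restrictions or annotations.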

GunpowderGuy commented 1 year ago

In the context of the Val Object Model discussion, I'd like to mention that pure cppfront code addresses some of the issues raised and restrains certain features. For example, it tracks invalidated unique_ptrs to prevent their use, effectively implementing destructive moves. Additionally, raw pointers remain present but don't own the objects they point to, which helps avoid potential problems.

It's also worth noting that the C++ Core Guidelines include a series of related papers that lay the foundation for C++ syntax 2. Interestingly, one of these papers introduces parameter passing options similar to Val, such as "in", "inout", and others. When combined, these guidelines and papers appear to adhere to the principles of the Val Object Model, offering a different approach to achieving independence and efficiency in modern programming languages.
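
For readers unfamiliar with these passing conventions, here is one plausible hand-written approximation in today's C++. This is an assumption-laden sketch, not what cppfront or Val actually emits; the function names are hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Roughly "in": the callee may only read the argument.
inline std::size_t length_of(const std::string& text) { return text.size(); }

// Roughly "inout": the callee mutates the caller's object in place.
inline void shout(std::string& text) { text += "!"; }

// Roughly "move" (or Val's "sink"): the callee takes over the value.
inline std::string consume(std::string&& text) { return std::move(text); }

// Hypothetical driver that exercises the three conventions in sequence.
inline std::string demo() {
    std::string s = "hi";
    std::size_t n = length_of(s);  // reads s
    shout(s);                      // s becomes "hi!"
    return consume(std::move(s)) + std::to_string(n);
}
```

Unlike in Val, nothing in this C++ approximation stops the caller from touching `s` after `std::move(s)`; the intent is carried by convention only.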

My motivation in participating in this discussion is two-fold:

Cppfront, the C++ dialect that compiles to standard C++, could potentially benefit from the Val mental model as a teaching tool for programmers, or it could be modified to align with the Val model for the sake of achieving complete safety, which, as mentioned in the paper, was not the primary goal of C++ lifetimes. Besides, mutable value semantics has the aforementioned ease of understanding.

Gaining a better understanding of how safe C++ code is structured would enable Val to compile to modern idiomatic C++ or facilitate interoperability with it. Val currently compiles to C++, but the creators believe that the long-term practicality of this approach may be limited due to the significant differences between the two languages.

https://github.com/val-lang/val-lang.github.io/discussions/57 The cppfront repo has a lot more papers than the original C++ Core Guidelines.

kyouko-taiga commented 1 year ago

"but the creators believe that the long-term practicality of this approach may be limited due to the significant differences between the two languages."

I think one important difference relates to destructive move. Note that I'm not convinced tracking invalidated unique pointers is sufficient to implement destructive move semantics. For example:

type A: Linear {
  public init() { print("init") }
  public fun take_value(from other: sink A) {
    inout { print("move-assign") } // <- move-assignment
    set { print("move-init") } // <- move-initialization
  }
}

fun use(_ a: inout A) {}

public fun main() {
  var x = A() // init
  var y = x   // move-init
  // <- note: `x` is completely gone; using it is a static error
  &y = A()    // init; move-assign
  use(&y)
  // <- note: destruction of `y` may occur anywhere from here
  &x = A()    // init; move-init
  use(&x)
  // <- note: destruction of `x` may occur anywhere from here
}

Crucially, the initialization of y does not leave an "empty shell" in x. Its value is gone for good. So while &y = A() causes a move-assignment, &x = A() causes a move initialization, as though x never had any value.

AFAICT, replicating these semantics in C++ will require inserting explicit custom destruction calls in the middle of functions and having types define a special value to represent moved-from values.
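
To give a feel for what that could look like, here is a minimal C++ sketch that mirrors the Val program above. It is only one possible emulation under stated assumptions, not what the Val compiler generates: `std::optional`'s disengaged state stands in for the "special value", `reset()` stands in for the explicit destruction call, and the `events()` log is a hypothetical helper used to observe the order of operations.

```cpp
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical event log so the sequence of operations can be inspected.
inline std::vector<std::string>& events() {
    static std::vector<std::string> log;
    return log;
}

struct A {
    A() { events().push_back("init"); }
    A(A&&) { events().push_back("move-init"); }
    A& operator=(A&&) { events().push_back("move-assign"); return *this; }
    ~A() { events().push_back("destroy"); }
};

// Runs the emulated program and returns the recorded events.
inline std::vector<std::string> trace() {
    events().clear();
    {
        std::optional<A> x;
        x.emplace();                        // init
        std::optional<A> y(std::move(*x));  // move-init
        x.reset();      // explicit destruction: x must not stay an "empty shell"
        *y = A();       // init; move-assign; destroy (the temporary)
        x.emplace(A()); // init; move-init; destroy (the temporary)
    }                   // destroy y, then destroy x
    return events();
}
```

Note the extra `destroy` events for the moved-from `x` and for the temporaries, none of which exist in the Val version. Nothing here makes using `x` after the move a static error either; the optional only turns it into a state that can be checked at run time.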

Other known hurdles are described in this document.

GunpowderGuy commented 1 year ago

I see what you mean. There seems to be a discussion around cppfront about implementing destructive moves themselves until they can convince C++ to add them properly, which may happen sooner rather than later: https://quuxplusone.github.io/blog/2023/02/17/issaquah-status/#trivial-relocatability

Thanks for the document. I totally missed it, and thinking about solutions to the problems explained there is very interesting.

hylo-lang / wg21

Comparison with the C++ Core Guidelines #2