cppalliance / safe-cpp

Boost Software License 1.0

Initialisation behaves differently in an unsafe function #51

Open Dalzhim opened 6 days ago

Dalzhim commented 6 days ago

The following program is currently ill-formed. I believe it shouldn't be. To make incremental adoption easy, adding #feature on safety at the top of a preexisting valid C++ program shouldn't make it ill-formed. That way, I can enable safety in a 20k-line cpp file and gradually start creating safe functions and types without changing everything at once.

#feature on safety

#include <string>
#include <vector>

int main()
{
    std::vector<std::string> vec;
    vec.push_back(std::string{"A string"}); // Error, `vec` is considered uninitialized
}

Here are some excerpts from the current draft that contradict my intuition that adding #feature on safety to an existing valid C++ program should leave it compiling the same way:

§ 2.3 Explicit mutation […]

struct Obj {
  const int& func() const;   // #1
        int& func();         // #2
};

void func(Obj obj) {
  // In ISO C++, calls overload #2.
  // In Safe C++, calls overload #1.
  obj.func();
}

To resolve this contradiction, I would expect the Safe C++ behavior above to apply only once safe is added to the func function.

seanbaxter commented 4 days ago

The [safety] feature enables the new object model, with ownership and object relocation. Your old code breaks under [safety] because it uses the legacy object model, which is a lot different. If you want to harden existing code with the [safety] feature, you'll have to adapt it to the new object model. The new object model is significantly different because, due to relocation, an object's lexical scope does not indicate when it's initialized. If you rel out an object, it's uninitialized until you assign back into it.

Long answer short: use explicit initialization.
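
As a rough illustration of that advice, here is a minimal sketch of the program from the report with the declaration given an explicit initializer. It assumes empty-brace initialization is what counts as explicit here; `fill_vector` is a hypothetical helper standing in for the body of `main`, and the `#feature` line is left as a comment so that a standard compiler also accepts the sketch.

```cpp
// #feature on safety   // Safe C++ only; commented out for standard compilers

#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper standing in for the body of main() in the report.
std::size_t fill_vector()
{
    // Explicit initializer instead of the legacy `std::vector<std::string> vec;`.
    std::vector<std::string> vec{};
    vec.push_back(std::string{"A string"});

    // The vector now holds the element that was just pushed.
    assert(!vec.empty());
    return vec.size();
}
```

The only change from the original program is the `{}` on the declaration, which keeps the migration cost per declaration small.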

Dalzhim commented 4 days ago

Thank you for the explanation. One thing that's still not clear to me is why the legacy declaration T t; wouldn't be considered a valid initialization as it used to be. I can understand why int i; is not a valid initialization (it isn't in the legacy model either), but for a class type, it should be valid in order to minimize the burden of migrating existing code towards safety.

This new object model also has me wondering what happens when a class is defined across multiple files. For example, what if the class definition is in a file with safety on, and the member function definitions are in a file where safety isn't on? What about the opposite case, and hybrid cases? I'll toy around with these ideas using Compiler Explorer to figure that out.

seanbaxter commented 4 days ago

My view on initialization is that it should be explicit. int i has a default initializer (it just happens to be trivial in C++) and something like std2::box<> does not have a default initializer. A declaration like std2::box<T> box; would be ill-formed if I tried to initialize on declaration. I could perform overload resolution on std2::box<T> and do default initialization if it has a viable default constructor, and otherwise keep it uninitialized and wait for an assignment into it. But now that is something that has to be performed during instantiation, because it's T-dependent, and for some values of T it could have a default constructor and for other values not have one.

There would be a lot of intelligence going into determining if something is default initialized or not, and the user can't know at definition (again, thanks to specialization). Simply requiring explicit initialization is the only rule with a clear meaning.
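
The T-dependence can be seen in standard C++ already. The sketch below uses hypothetical types (`NoDefault`, `Wrapper`) that are not from the draft: whether `Wrapper<T> w;` could default-initialize is only known once `T` is fixed, which is the situation described above for something like std2::box<T>.

```cpp
#include <type_traits>

// Hypothetical type with no default constructor.
struct NoDefault {
    explicit NoDefault(int v) : value(v) {}
    int value;
};

// Hypothetical generic wrapper: its implicit default constructor
// is usable only when T itself has a default constructor.
template <class T>
struct Wrapper {
    T item;
};

// Whether `Wrapper<T> w;` can default-initialize depends on the instantiation.
template <class T>
constexpr bool wrapper_default_initializes()
{
    return std::is_default_constructible_v<Wrapper<T>>;
}

static_assert(wrapper_default_initializes<int>());        // default constructor viable
static_assert(!wrapper_default_initializes<NoDefault>()); // no viable default constructor
```

A reader looking only at the template definition cannot tell which of the two cases applies, so a rule that silently picks between "initialized" and "uninitialized" would give the same declaration two meanings.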

As far as #feature on safety goes, if a class is declared with [safety], all its member functions are too, and out-of-line implementations of the member functions should also use [safety]. But that's not saying the compiler is doing this--this project has to specify these behaviors and include unit tests to make sure they're implemented. Don't look for the compiler to be authoritative on these things. The implementation is focused on demonstrating the memory safety features. It's not complete, especially on issues of flagging misuse.