dart-lang / sdk

The Dart SDK, including the VM, JS and Wasm compilers, analysis, core libraries, and more.
https://dart.dev
BSD 3-Clause "New" or "Revised" License
10.2k stars 1.57k forks source link

[vm] Use std::unique_ptr for exclusive-ownership resource management #37244

Open mdempsky opened 5 years ago

mdempsky commented 5 years ago

Summary

This is a proposal issue that the Dart VM should adopt using std::unique_ptr for exclusive-ownership resource management. In particular, that VM resources that are managed with new/delete or malloc/free should be migrated to using std::unique_ptr, where possible.

Note: This proposal does not apply to Dart Objects/RawObjects allocated in the GC heap, or to C++ objects allocated in Zones.

Rationale

Using std::unique_ptr to represent exclusive-ownership resource management is a well established C++ pattern that helps to avoid memory leaks and security vulnerabilities like use-after-free.

The Google C++ style guide recommends their use and explains its rationale under "Ownership and Smart Pointers".

The Chromium C++ style guide further articulates explicit recommendations on their use under "Object Ownership and Calling Conventions".

The same is recommended in books like Meyers's "Effective Modern C++", Item 18 "Use std::unique_ptr for exclusive-ownership resource management.":

When you reach for a smart pointer, std::unique_ptr should generally be the one closest at hand. It's reasonable to assume that, by default, std::unique_ptrs are the same size as raw pointers, and for most operations (including dereferencing), they executive exactly the same instructions. This means you can use them even in situations where memory and cycles are tight. If a raw pointer is small enough and fast enough for you, a std::unique_ptr almost certainly is, too.

Forgetting to release memory when it's unneeded results in memory leaks. Dart supports using Clang's Address Sanitizer, which help to detect leaks, but engineering effort still needs to be spent tracking these down. E.g., #37154 was a recent leak detected by ASAN, but that still took several days to fix.

At the other end of the problem, releasing memory prematurely when it's still needed results in "use after free" errors, which is a very common source of security vulnerabilities. E.g., see https://cwe.mitre.org/data/definitions/416.html.

Lastly, software engineering estimates that the number of bugs in a program is proportional to the amount of code (e.g., https://en.wikipedia.org/wiki/Halstead_complexity_measures). Refactoring code so that developers don't have to write delete foo; or free(bar); reduces the number of lines of code, and should thus help reduce bugs. E.g., a53c12d07af05e87e9a24c9fe9ef880dd65e4e4f removed about 80 lines of code.

Concerns

std::unique_ptr is a standard library feature added in C++11, and the Dart VM has traditionally minimized dependencies on the C++ standard library. Therefore it's reasonable to be concerned about platform compatibility, size/memory overhead, and style change cost. Below I respond to each of these concerns.

Compatibility

For platform compatibility, Dart's toolchain is based on Chromium's, and Chromium has supported C++11 and std::unique_ptr in particular since at least 2015 (I can't find an exact date off hand). The Dart SDK has already been using std::unique_ptr for over a year (std::unique_ptr in runtime/bin/gensnapshot.cc: 0cc70c4a7c448078dd2aea8e45c56bc50fbbb31c), the VM itself has been using C++11 language features for almost as long (lambdas in runtime/vm/compiler/frontend/kernel_binary_flowgraph.cc: 184e2a8b4c9bd675a32c172014e789832a676f13), and I landed a CL to use std::unique_ptr within the VM about a month ago that hasn't needed to be reverted (a53c12d07af05e87e9a24c9fe9ef880dd65e4e4f).

Flutter also recently switched to requiring C++14 on all platforms (https://github.com/flutter/buildroot/commit/50e07b839afd81dab05e7d77168dbd1a553fc13b), though Google's internal Dart customers still require C++11 compatibility (#35173). Consequently, for now, we can't use C++14 features like std::make_unique. (Though we could consider using absl::make_unique https://abseil.io/tips/126 or writing our own wrapper.)

Overhead

As mentioned in the Scott Meyers quote above, std::unique_ptr<T> compiles down very similarly to T*. It doesn't involve any additional memory overhead, and it compiles to the same or similar instructions.

It's worth noting though that it does change calling conventions slightly. Whereas raw pointers can be passed via registers, the C++ ABIs require passing std::unique_ptr on the stack, which results in slightly larger program text.

At https://dart-review.googlesource.com/c/sdk/+/101222/2#message-9c02772aaf98061ddcd08b0dba31ece6b7235a00, I measured the impact for a +327,-403 LOC diff changing Message* to std::unique_ptr<Message> to add about ~2.5kB of text on X64, and ~1kB on ARM. This seems within the noise of most typical development churn. For example,
0707abb is another CL that refactored some code to use std::unique_ptr, and it shrunk text by about 500 bytes on both X64 and ARM.

Style change

This is a style change for Dart, but it's adopting coding conventions that are already commonly familiar to C++ engineers. This should make the code base more accessible to engineers familiar with Chromium, Flutter, and other Google C++ projects.

Migration plan

The plan I had in mind when I submitted a53c12d07af05e87e9a24c9fe9ef880dd65e4e4f and wrote/mailed CL 101740 was to look for types that are managed using new/delete (and eventually malloc/free) and replace all uses of T* with std::unique_ptr<T>.

By changing one T at a time, it allows consistently updating all of the APIs throughout the VM and making sure idiomatic ownership conventions are followed. This also makes it easier for multiple engineers to work on converting separate types in parallel.

On the other hand, it's likely the changes each have to touch a larger number of files at a time, which might increase the number of patch conflicts. I anticipate we can be flexible about the particular migration strategy and adapt as appropriate for each CL.

mdempsky commented 5 years ago

The use of std::unique_ptr was also discussed on https://dart-review.googlesource.com/c/sdk/+/101740. Copying over some of the feedback from there for easy reference:

@zanderso:

Unique pointers don't have any runtime overhead, and are pretty standard across flutter engine and Fuchsia for cases like this, so it's arguable that there should be an effort to use them where applicable in the Dart VM as well.

As Siva says though, they are not currently in agreement with Dart VM style, so spot-fixes rather than a wider effort might create confusion.

It's probably a good idea to solicit more thoughts on this from more Dart VM team folks. I've added a couple more to this review to discuss.

@mraleph:

I would vote for permitting std::unique_ptr in Dart VM code base for the following reasons:

  • Google style guide recommends unique_ptr;
  • our major embedders all use unique_ptr;
  • they are significant improvement over manual memory management with explicit delete.
  • we already have instances of unique_ptr in our code base.

@mkustermann:

I also vote for allowing std::unique_ptr - it makes code less error prone.

For cases where size/performance impact is not acceptable we can continue to use normal pointers.

If we use it pervasively throughout our code base, we could consider aliasing unique_ptr in the dart:: namespace to make it less verbose (no strong opinion)

@alexmarkov:

+1 to std::unique_ptr

zanderso commented 5 years ago

SGTM. Thanks for writing this up!