AVSLab / basilisk

Astrodynamics simulation framework
https://hanspeterschaub.info/basilisk
ISC License

Transition to smart pointers #282

Open juan-g-bonilla opened 1 year ago

juan-g-bonilla commented 1 year ago

Describe your use case

Currently, the codebase relies heavily on raw pointers and occasionally on C-style arrays. There are many reasons why using these is a bad idea (summarized from "Effective Modern C++" by Scott Meyers):

  1. A raw pointer does not indicate whether it points to a single object or an array
  2. Its declaration does not reveal whether you should destroy the object it points to when you are done using the pointer (i.e. whether the pointer "owns" the object it points to).
  3. It is not obvious whether one should use delete, delete[], or even some other custom destruction mechanism.
  4. It is difficult to ensure that destruction happens exactly once along every code path, including the paths taken when exceptions are thrown.
  5. There is no way to tell whether a pointer dangles, which prevents writing sanity checks that avoid dereferencing a dangling pointer (undefined behaviour).

Smart pointers and std::array stand to solve these problems:

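  1. There is never ambiguity between arrays and single objects. The single-object forms std::shared_ptr<T> and std::unique_ptr<T> do not provide operator[]; arrays are expressed explicitly through std::array or the array forms std::unique_ptr<T[]> and (since C++17) std::shared_ptr<T[]>.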
  2. std::shared_ptr and std::unique_ptr handle the destruction of the object.
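  3. Again, smart pointers handle destruction, and you can even provide a custom deleter when delete is not the right mechanism.
  4. Same as above: the smart pointer destroys the object exactly once, on every code path, including when an exception is thrown.
  5. A non-null std::shared_ptr or std::unique_ptr implies that the underlying object exists. std::weak_ptr supports dangling checks: expired() reports whether the object is gone, lock() returns an empty std::shared_ptr in that case, and constructing a std::shared_ptr from an expired std::weak_ptr throws std::bad_weak_ptr.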
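To make points 3 to 5 concrete, here is a minimal C++ sketch. The Body struct and the function names are hypothetical and exist only for this illustration; they are not Basilisk classes.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for a simulation object; not a Basilisk class.
struct Body {
    double mu = 0.0;
};

// Point 3/4: a custom deleter runs exactly once when the owner goes out of scope.
int customDeleterRunsOnce() {
    int calls = 0;
    {
        auto deleter = [&calls](Body* b) {
            ++calls;   // count how many times destruction is requested
            delete b;
        };
        std::unique_ptr<Body, decltype(deleter)> owner(new Body, deleter);
    }  // owner leaves scope here and invokes the deleter
    return calls;
}

// Point 5: a weak_ptr can report that the object it observed is gone.
bool observerDetectsDestruction() {
    std::weak_ptr<Body> observer;
    {
        auto owner = std::make_shared<Body>();
        observer = owner;
        if (observer.expired()) return false;  // object is still alive here
    }  // last shared_ptr destroyed, so the Body is deleted
    return observer.expired() && observer.lock() == nullptr;
}

// Point 5: promoting an expired weak_ptr to a shared_ptr throws.
bool expiredWeakPtrThrows() {
    std::weak_ptr<Body> observer;
    {
        auto owner = std::make_shared<Body>();
        observer = owner;
    }
    try {
        std::shared_ptr<Body> revived(observer);  // throws std::bad_weak_ptr
    } catch (const std::bad_weak_ptr&) {
        return true;
    }
    return false;
}
```

With a raw pointer, none of these checks could be written: the pointer value alone carries no information about whether the object behind it still exists.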

Moreover, the fact that we use SWIG makes raw pointers even more brittle. Because of how memory is handled by SWIG, an object is owned either by the Python layer or by the C++ layer (by default, Python owns the object). As soon as a Python object goes out of scope in Python, the C++ object is destroyed, and any pointer to it left in C++ is dangling. Consequently, all Python objects must somehow be kept alive for the duration of the simulation.

Currently, modules and tasks are kept alive by adding them to lists in SimBaseClass and TaskBaseClass. On the other hand, when planets are created in gravBodyFactory through methods like createEarth, these methods explicitly transfer ownership of the memory to the C++ layer through earth.this.disown(). This allows the Python GravBodyData to go out of scope in Python, but it may be causing memory leaks (I haven't found any place where these pointers are deleted).

Integrators, and possibly many other objects, have no built-in mechanism to keep them alive. This means that the following code will fail:

scObject.setIntegrator( svIntegrators.svIntegratorRK4(scObject) )

This fails because, as soon as the function returns, the integrator object is deleted. This can be an enormous source of confusion for our users, especially those not familiar with memory management. Moreover, these errors can be hard to debug, as dereferencing a dangling pointer is undefined behaviour.

Smart pointers are not a perfect solution to the complex memory management of SWIG, but they do provide some advantages. By using shared_ptr, memory ownership is shared by the C++ and Python layers. This means that an object handled by a shared_ptr can go out of scope in Python without the C++ object disappearing, thus allowing the following code:

scObject.setIntegrator( svIntegrators.svIntegratorRK4(scObject) )
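On the C++ side, the shared-ownership idea can be sketched as follows. This is only an illustration under stated assumptions: Integrator and Spacecraft are hypothetical stand-ins with invented members, not the actual Basilisk classes, and the inner scope merely imitates a Python temporary being released.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-ins; the real Basilisk classes have different interfaces.
struct Integrator {
    int order = 4;
};

class Spacecraft {
public:
    // Accepting a shared_ptr makes the spacecraft a co-owner of the integrator.
    void setIntegrator(std::shared_ptr<Integrator> integrator) {
        integrator_ = std::move(integrator);
    }

    // Returns 0 when no integrator has been set.
    int integratorOrder() const { return integrator_ ? integrator_->order : 0; }

private:
    std::shared_ptr<Integrator> integrator_;
};

// Imitates the caller creating a temporary integrator, passing it in, and dropping it.
int orderAfterCallerReleases() {
    Spacecraft sc;
    {
        auto temp = std::make_shared<Integrator>();  // the caller's handle
        sc.setIntegrator(temp);                       // two owners now
    }  // the caller's handle goes out of scope; sc still owns the integrator
    return sc.integratorOrder();
}
```

Had setIntegrator stored a raw Integrator* instead, the read after the inner scope would dereference a dangling pointer.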

Note that using shared_ptr requires some SWIG features that imply widespread changes to our SWIG files, and it will very likely break the code of users who implement their own C++ modules.

See branch 282-smart-pointer-dynamic-object for an example of how one could move from raw pointers to shared_ptr. Note that this is just a prototyping branch; there might be better ways to use smart pointers with SWIG than what I came up with in this implementation.

Describe alternative solutions you've considered

We could stay with raw pointers and develop a strict protocol for how to handle these objects. Special attention should be given to those pointers that interface with SWIG, to prevent both dangling pointers in C++ and memory leaks.

Additional context

The branch showcased above makes use of the %import SWIG feature. Currently, by always including every header file in our SWIG interface files, we are forcing SWIG to generate Python wrappers for those headers again in every single module it generates. By using %import, we can link modules together, which removes code duplication. This lowers the memory requirements of Basilisk and prevents some unintuitive behaviour:

>>> from Basilisk.simulation import dragDynamicEffector, extForceTorque 
>>> isinstance(dragDynamicEffector.DragDynamicEffector(), dragDynamicEffector.SysModel)
True
>>> isinstance(dragDynamicEffector.DragDynamicEffector(), extForceTorque.SysModel)
False

Most importantly, the use of %import allows us to make all SWIG changes to a class in a single file (such as the use of shared_ptr), and then have these changes apply to all other SWIG interfaces.

Systems that use raw pointers (to be expanded)

juan-g-bonilla commented 1 year ago

Tagging @joaogvcarneiro @patkenneally @schaubh

juan-g-bonilla commented 1 year ago

Issue #295 is another example of memory management issues related to raw pointers. In that case, a ReadFunctor held a pointer to the payload of a Message declared in Python. The Message went out of scope too early, which caused the Message (and its payload) to be destroyed. Making the payload a shared_ptr would allow readers to keep referring to it even after the Message has been destroyed. However, C modules present a problem.
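A rough sketch of the shared payload idea, for C++ modules only, might look like the following. The Message and ReadFunctor templates here are simplified, hypothetical analogues written for this comment; makeReader, write, and NavPayload are invented names, and none of this reflects the real Basilisk messaging interface or addresses the C-module problem.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical reader: co-owns the payload instead of holding a raw pointer to it.
template <typename Payload>
class ReadFunctor {
public:
    explicit ReadFunctor(std::shared_ptr<const Payload> payload)
        : payload_(std::move(payload)) {}

    // Reading stays valid for as long as this reader exists.
    const Payload& operator()() const { return *payload_; }

private:
    std::shared_ptr<const Payload> payload_;
};

// Hypothetical message: owns the payload through a shared_ptr.
template <typename Payload>
class Message {
public:
    Message() : payload_(std::make_shared<Payload>()) {}

    void write(const Payload& value) { *payload_ = value; }

    // Hands out a reader that shares ownership of the payload.
    ReadFunctor<Payload> makeReader() const { return ReadFunctor<Payload>(payload_); }

private:
    std::shared_ptr<Payload> payload_;
};

// Invented payload type for the example.
struct NavPayload {
    double time = 0.0;
};

// The message is a local and is destroyed on return; only the reader escapes.
ReadFunctor<NavPayload> makeOrphanedReader() {
    Message<NavPayload> msg;
    NavPayload value;
    value.time = 42.0;
    msg.write(value);
    return msg.makeReader();
}

double readAfterMessageDestroyed() {
    ReadFunctor<NavPayload> reader = makeOrphanedReader();
    return reader().time;  // payload is still alive through the reader
}
```

The trade-off is that a reader can now observe a payload whose writer no longer exists, so whether stale data is acceptable would still need to be decided per use case.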