jabacat / jml

JABACAT-created machine learning library from scratch.

Add a proper way to recycle vector slots in the C API #41

Open adamhutchings opened 1 year ago

adamhutchings commented 1 year ago

Right now, when a vector is disposed of through the C API, its slot in the vectors array is never reused and just keeps wasting memory. Instead, when a new vector is made, it should take the first free slot in the array if there is one, and the array should only be extended when no slot is free.
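A minimal sketch of that policy, not the actual jml code: `Vec`, `slots`, `vec_create`, and `vec_destroy` are hypothetical names made up for illustration, and a null pointer is assumed to mark a free slot.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for jml::Vector; the real class is not shown here.
struct Vec {
    std::vector<double> data;
    explicit Vec(std::size_t n) : data(n, 0.0) {}
};

// Hypothetical slot table behind the C API. A null entry is a free slot.
static std::vector<std::unique_ptr<Vec>> slots;

// Creates a vector and returns its slot index, which serves as the handle.
std::size_t vec_create(std::size_t n) {
    // Reuse the first free slot if there is one.
    for (std::size_t i = 0; i < slots.size(); ++i) {
        if (!slots[i]) {
            slots[i] = std::make_unique<Vec>(n);
            return i;
        }
    }
    // No free slot: extend the array.
    slots.push_back(std::make_unique<Vec>(n));
    return slots.size() - 1;
}

// Frees the vector and leaves its slot available for reuse.
void vec_destroy(std::size_t handle) {
    assert(handle < slots.size() && "handle out of range");
    slots[handle].reset();
}
```

The linear scan costs O(n) per creation; keeping a separate stack of free indices would make it O(1), at the price of a little extra bookkeeping.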

Sophon96 commented 1 year ago

Something we should consider is whether this is a good idea at all: recycling slots means that erroneous client programs can end up with harder-to-diagnose bugs. For example, suppose a program using the C API:

  1. Creates a jml::Vector with some parameters
  2. Eventually deletes the previous jml::Vector, but erroneously holds onto the index (essentially a use-after-free). The slot is now empty, and the next jml::Vector will occupy it
  3. Creates a new jml::Vector with different parameters. The old slot is now occupied by this new jml::Vector
  4. Tries to modify the old, destroyed jml::Vector (UAF)

This will end up modifying the new jml::Vector, which may work without error for some methods but will fail inexplicably, or with seemingly unrelated errors, for other operations.
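The four steps above can be replayed against a toy slot table. This is only a sketch under assumptions: `Vec`, `SlotTable`, and `replay_stale_handle` are hypothetical names, not part of jml, and handles are assumed to be plain indices into the array.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for jml::Vector.
struct Vec {
    std::vector<double> data;
    explicit Vec(std::size_t n) : data(n, 0.0) {}
};

// Hypothetical slot table that recycles the first free slot.
struct SlotTable {
    std::vector<std::unique_ptr<Vec>> slots;

    std::size_t create(std::size_t n) {
        for (std::size_t i = 0; i < slots.size(); ++i) {
            if (!slots[i]) {
                slots[i] = std::make_unique<Vec>(n);
                return i;
            }
        }
        slots.push_back(std::make_unique<Vec>(n));
        return slots.size() - 1;
    }

    void destroy(std::size_t h) {
        if (h < slots.size()) slots[h].reset();
    }

    // Returns nullptr for an out-of-range or empty slot.
    Vec* get(std::size_t h) {
        return h < slots.size() ? slots[h].get() : nullptr;
    }
};

// Replays the scenario and returns the length seen through the stale handle.
std::size_t replay_stale_handle() {
    SlotTable table;
    std::size_t old_handle = table.create(3);  // 1. create a vector
    table.destroy(old_handle);                 // 2. delete it, but keep the index
    std::size_t new_handle = table.create(8);  // 3. new vector takes the same slot
    assert(new_handle == old_handle);

    Vec* v = table.get(old_handle);            // 4. stale handle resolves fine
    v->data[0] = 42.0;                         // silently writes into the new vector
    return v->data.size();                     // 8, not the 3 the caller expects
}
```

In this sketch, if step 3 did not reuse the slot, `get(old_handle)` would return `nullptr` and the misuse could be reported at the API boundary instead of corrupting an unrelated vector.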

adamhutchings commented 1 year ago

This is true, but it is also essentially the C tradeoff between safety and efficiency. In my mind, recycling slots offers the possibility of UAF bugs, as opposed to the certainty of excess memory usage by unused vectors if slots are never recycled. Thoughts?