CAIDA / bgpstream

BGP measurement analysis for the masses
GNU General Public License v2.0
109 stars 43 forks source link

Consolidate dict add call and remove possible memory leak #33

Closed rootcanal closed 7 years ago

rootcanal commented 7 years ago

Enjoying toying around with this new library!

Just a small change here in this pr to fix a possible leak. Hiding a return inside of a do-while macro is a recipe for danger, as evidenced by the potential memory leak hidden in ADD_TO_DICT. Regardless of whether we fail to SetItem, we should be decrementing the key and values to avoid leaking them.

alistairking commented 7 years ago

Thanks!

I'm not sure that a leak is possible in this situation since IIRC (and I could be totally wrong), when NULL is returned from a python binding it is treated as a fatal error and everything is aborted (i.e. it it not simply a failed call or exception). Have you actually seen this leak memory?

rootcanal commented 7 years ago

I believe that you are mistaken regarding NULL returns within a python binding. From the documentation:

An important convention throughout the Python interpreter is the following: when a function fails, it should set an exception condition and return an error value (usually a NULL pointer).

Basically, by returning NULL from ADD_TO_DICT, we are relying on the PyDict_SetItem() call to properly set the error string and what not, which it does. But our NULL return results in a raise, not an abort. So any objects not cleaned up will be leaked.

This is pretty straightforward to test. Since my workspace already has my change I've added to it, the following is applied against that, but if desired I could do the same test with ADD_TO_DICT and see it. Essentially I just force an unconditional return NULL regardless of the PyDict_SetItem() return value, and neglect to perform the decrement. As you can see, the return NULL results in python raising a SystemError (since no better error string has been set) which can be caught like any other.

[rootcanal@brisket pybgpstream]$ cat src/ifcs.py from _pybgpstream import BGPStream

stream = BGPStream() print(stream.get_data_interfaces()) [rootcanal@brisket pybgpstream]$ valgrind --leak-check=full python src/ifcs.py ... (0 bytes definitely lost)

[rootcanal@brisket pybgpstream]$ git diff diff --git a/pybgpstream/src/_pybgpstream_bgpstream.c b/pybgpstream/src/_pybgpstream_bgpstream.c index 09a30e9..bb3e643 100644 --- a/pybgpstream/src/_pybgpstream_bgpstream.c +++ b/pybgpstream/src/_pybgpstream_bgpstream.c @@ -191,6 +191,7 @@ BGPStream_get_data_interfaces(BGPStreamObject *self) !add_to_dict(dict, "description", PYSTR_FROMSTR(info->description))) { return NULL; }

[rootcanal@brisket pybgpstream]$ python src/ifcs.py Traceback (most recent call last): File "src/ifcs.py", line 4, in print(stream.get_data_interfaces()) SystemError: error return without exception set [rootcanal@brisket pybgpstream]$ valgrind --leak-check=full python src/ifcs.py ... ==12960== definitely lost: 280 bytes in 1 blocks

alistairking commented 7 years ago

Gotcha. Thanks for clarifying. And thanks for catching the issue in the first place.

I'm fine with your changes, with the exception of the use of stdbool.h. For the sake of consistency with the rest of the codebase, can you instead return an int (the usual 0 on success, something else on failure)?

alistairking commented 7 years ago

sigh. i just looked and I see that stdbool has crept in in a handful of places. well, if you wouldn't remind removing it from here, that would be much appreciated!

rootcanal commented 7 years ago

done and done 👍

alistairking commented 7 years ago

Super. Thanks a lot!