ptesarik / libkdumpfile

Kernel coredump file access
Other
22 stars 22 forks source link

Inconsistent ownership semantics of kdump_set_attr #81

Open brenns10 opened 1 month ago

brenns10 commented 1 month ago

Hi Petr,

Omar & I were looking at kdump_set_attr(ctx, "linux.vmcoreinfo.raw", &attr) and trying to determine the correct error handling behavior. See this thread on my PR. The original context there is that I included an extra free() which was definitely wrong on my part. But that led to the following question:

When kdump_set_attr() fails, does the value need to be decref'd? There seem to be two cases:

  1. A failure occurs inside of set_attr() which says that any failure also discards and decref's the new value.
  2. But, if the failure occurs outside of set_attr(), for example here in kdump_set_attr() or here in check_set_attr(), then the attribute would be left untouched.

So either we leak the attribute in an error or we double-free / underflow the reference count. Given those choices, I tend to prefer the leak, but ideally we wouldn't need to choose. Does my analysis look right here? If so, I'm happy to send a PR or if you prefer to fix it up yourself, that's fine as well.

ptesarik commented 1 month ago

Hi Brandon,

FWIW the object model in libkdumpfile is half-baked, unfortunately. Refcounting was added when it became necessary for the Python bindings, and it may not work properly. The goal is to make the API easy to use, and this is why kdump_set_attr() takes over ownership, so the caller does not have to care about the reference. Now, keeping ease of use in mind as a goal, if an API call fails, the caller most often propagates the error, and they will be happy if they don't have to care about the reference. It is much less common to try an API call, and if it fails, do something else with the same object. Such uses should take an extra reference.

In short, leaking a reference on failure is a bug. Thank you for a couple of places where that happens. I'll have a look at this tomorrow (European time).

ptesarik commented 1 month ago

Status update: I haven't forgotten about this issue, but it turns out that references are leaked in many more places. I'm trying to come up with a fix that would not clutter the sources with excessive decref's.