marco-langer commented 2 years ago

Description

Fixes #561

Invoking image::allocate_a with an empty dimension (i.e. zero bytes to allocate) causes a memory leak on implementations which return a non-nullptr for zero allocated bytes. In this case image::deallocate will not invoke the deallocation method on the allocator.

Examples, for which image::allocate_a tries to allocate zero bytes:

boost::gil::rgb8_pixel_t pixel(42);
boost::gil::rgb8_image_t image(0, 0, pixel);

boost::gil::rgb8_image_t image1;
boost::gil::rgb8_image_t image2(image1);

In the above mentioned issue the leak happens during the call of read_image overloaded for any_image: An image is copy-constructed from a default-constructed image. The other overloads of read_image do not leak memory.

The issue is not related to the jpeg io extension, it occurs as well with the png extension in combination with any_image.

References

702

Tasklist

[x] Add test case(s)
[x] Ensure all CI builds pass
[x] Review and approve

codecov[bot] commented 2 years ago

Codecov Report

Merging #649 (323a621) into develop (adddbec) will decrease coverage by 0.02%. The diff coverage is 50.00%.

@@             Coverage Diff             @@
##           develop     #649      +/-   ##
===========================================
- Coverage    80.69%   80.67%   -0.03%     
===========================================
  Files          116      116              
  Lines         5072     5076       +4     
===========================================
+ Hits          4093     4095       +2     
- Misses         979      981       +2

striezel commented 2 years ago

These might be silly questions, but I will ask anyway:

How can it be verified that the leak is fixed?
And once that is clear: Shouldn't there be a regression test to make sure the leak does not get reintroduced in the future?

marco-langer commented 2 years ago

To observe the memory leak in the current develop branch, run either this minimal example, or the example in the linked issue, with enabled address sanitizer:

#include <boost/gil.hpp>

namespace gil = boost::gil;

int main()
{
    gil::rgb32_image_t image(0, 0);
    return 0;
}

In Linux, just add -fsanitize=address to the compile flags. In Windows, MSVC supports Asan since Visual Studio 2019.

Running the same examples against my branch from this PR do not show the leak anymore.

The second question about regression testing is a valid concern. I am not very familiar with the boost unit testing framework, but my naive suggestion is to add a dedicated Asan-enabled job to our CI matrix and run the complete test suite with Asan enabled. It would be a good idea to check how / if other boost libraries also include Asan in their CI, does anyone know?

sdebionne commented 2 years ago

Thanks for reporting this corner case. The memory leak can we reproduced with

#include <memory>

int main() {
  std::allocator<char> alloc;
  alloc.allocate(0);
}

Zero size is a valid size and new/ std::allocator are required to return a non nullptr (that needs to be deleted). A better fix IMO is:

diff --git a/include/boost/gil/image.hpp b/include/boost/gil/image.hpp
index 23763d412..d4ffb7725 100644
--- a/include/boost/gil/image.hpp
+++ b/include/boost/gil/image.hpp
@@ -392,7 +392,7 @@ private:

     void deallocate()
     {
-        if (_memory && _allocated_bytes > 0)
+        if (_memory)
             _alloc.deallocate(_memory, _allocated_bytes);
     }

marco-langer commented 2 years ago

You are right, there are two possible solutions to avoid this bug: either not allocating at all zero bytes (as done in my PR), or allocating zero bytes and freeing this data pointer in the destructor.

In this PR I decided to bail out early in the constructor in order to not allocate zero bytes. The non-nullptr returned by the allocator shall not be dereferenced according to the standard. Quoting the behaviour of malloc, which might - or might not (implementation defined) - be used internally by std::allocator:

The C standard (C17 7.22.3/1) says:

If the size of the space requested is zero, the behavior is implementation defined: either a null pointer is returned, or the behavior is as if the size were some nonzero value, except that the returned pointer shall not be used to access an object.

What is the point about carrying such a "toxic" pointer from the point of construction until the point of destruction and risking the chance of invoking undefined behaviour while the image instance exists? Wouldn't it be better to keep the data pointer as nullptr during object lifetime?

But of cause I can change my PR accordingly if you still think it is a good idea.

Edit: Here on stack overflow is the behaviour of operator new explained. Deferencing the pointer returned from new is undefined behaviour in C++ as well as it is in C.

mloskot commented 2 years ago

@marco-langer

What is the point about carrying such a "toxic" pointer from the point of construction until the point of destruction and risking the chance of invoking undefined behaviour while the image instance exists?

A very good question. I've taken this from Scott Meyers' "Effective C++"

"C++ requires that operator new return a legitimate pointer even when zero bytes are requested. (Requiring this odd-sounding behavior simplifies things elsewhere in the language.)"

Ensuring a non-null pointer invariant makes things simpler w.r.t. expected state of object. But, it can bite with the undefined behaviour indeed.

Could you update this PR with the latest develop (either merge or rebase & force-push is fine), so we can see if the CI jobs succeed eventually?

marco-langer commented 2 years ago

I usually learn a lot by reading Scott Meyers, but I think this specific quote is not worded correctly: The pointer returned from allocating zero bytes is not legitimate as it shall not be dereferenced according to the standard (please correct me if I am wrong).

Thereby the invariant is broken anyway, because the invariant of a non-null pointer is that one can safely dereference it. The consequence of both solutions, either carrying a null pointer or this non-null-but-not-dereferencable pointer during image lifetime, is that the data pointer shall not be derefenced.

I would argue that the risk of accidentally dereferencing a null pointer is less than the risk of derefencing a non-null-but-not-dereferencable pointer. The former can easily be checked via ìf (ptr), whereas the latter requires the additional context information that the pointer was obtained from a call to new T[0].

mloskot commented 2 years ago

@marco-langer

I would argue that the risk of accidentally dereferencing a null pointer is less than the risk of derefencing a non-null-but-not-dereferencable pointer.

I agree.

If @sdebionne does not object with any insights that we have been missing, then I think we should take your PR as is.

sdebionne commented 2 years ago

The codecov failure is interesting: adding a new code path (early return statement) but no test case results in pipeline failure. The other option does not have this drawback.

mloskot commented 2 years ago

The codecov failure is interesting: adding a new code path (early return statement) but no test case results in pipeline failure.

Yes, I've been just wondering what's happening there...

boostorg / gil

fix: Memory leak in image class for empty dimensions #649

Description

References

702

Tasklist

Codecov Report