wlav / cppyy

Other
400 stars 41 forks source link

Creating a list from a vector of base pointers may be invalid #66

Closed mkolin closed 2 years ago

mkolin commented 2 years ago

The issue comes up when a function returns a vector with base pointers and that is converted to a list. The conversion from a temporary vector will seem like it will construct a list of invalid pointers.

Note keeping a reference to the vector will prevent any issues. When this happens, destructing the underlying vector will invalidate the list.

Example:

import cppyy

cppyy.cppdef('''
struct Base { std::string name; explicit Base(const std::string& name) : name(name) { }};
struct Derived : Base { explicit Derived(const std::string& name) : Base(name) { } };
struct Owner {
  Derived d1 { "d1" };
  Derived d2 { "d2" };
  std::vector<const Base*> GetVector() { return { &d1, &d2 }; }
};
''')

from cppyy.gbl import Base, Derived, Owner

o = Owner()

print('When we get a vector, the base pointers look good:')
print(f'  - Vector {o.GetVector()} ({o.GetVector()[0]}, {o.GetVector()[1]})')
print(f'  - Element 1 {o.GetVector()[0].name}')

v = o.GetVector()
print('Keeping a vector reference will fix the issue:')
print(f'  - Vector {list(v)}')
print(f'  - Element 1 {list(v)[0].name}')

print('But creating a list from the return value produces incorrect pointers and crash if you use anything in the vector')
print(f'  - Vector {list(o.GetVector())}')
# this line crashes
#print(f'  - Element 1 {list(o.GetVector())[0].name}')

l = list(v)
v.__destruct__()
print('Also, destructing the underlying vector a list is constructed from crashes')
print(f'  - Vector {l}')
# this line crashes
print(f'  - Element 1 {l[0].name}')
When we get a vector, the base pointers look good:
  - Vector <cppyy.gbl.std.vector<const Base*> object at 0x563589c322d0> (<cppyy.gbl.Base object at 0x563589c1e8e0>, <cppyy.gbl.Base object at 0x563589c1e900>)
  - Element 1 d1
Keeping a vector reference will fix the issue:
  - Vector [<cppyy.gbl.Base* object at 0x563589c1e8e0>, <cppyy.gbl.Base* object at 0x563589c1e900>]
  - Element 1 d1
But creating a list from the return value produces incorrect pointers and crash if you use anything in the vector
  - Vector [<cppyy.gbl.Base* object at 0x56358a032f80>, <cppyy.gbl.Base* object at 0x563587e5d010>]
Also, destructing the underlying vector a list is constructed from crashes
  - Vector [<cppyy.gbl.Base* object at 0x563588428960>, <cppyy.gbl.Base* object at 0x563587e5d010>]
Traceback (most recent call last):
  File "test.py", line 36, in <module>
    print(f'  - Element 1 {l[0].name}')
MemoryError

This functionality worked in 1.6.1.

wlav commented 2 years ago

Technically speaking, the new behavior is correct: the return type from indexing the vector is const Base*&, so a reference to the pointer, not a pointer by-value. When the temporary goes out of scope, so do the references to its pointers. Let me dig a bit.

wlav commented 2 years ago

Fixed in repo by distinguishing the behaviors between const and non-const. It fixes this report and doesn't break any tests, but there may still be corner cases lurking.

mkolin commented 2 years ago

Awesome! Thanks for the quick turn around. A great library with great support.

wlav commented 2 years ago

Released with 2.4.0 and its dependencies.