Closed treapster closed 4 months ago
Yes, I know this, but this is because of debug build trying to print something that is not accessible. In release this is not a problem.
Is this when trying to switch the view from the top bar buttons?
Is this when trying to switch the view from the top bar buttons?
Yes
Yes, I know this, but this is because of debug build trying to print something that is not accessible. In release this is not a problem.
I built with -DCMAKE_BUILD_TYPE=Release
and i get the same problem. Although even if i did not, it would be even worse.
Actually I cannot reproduce. O thought maybe building with clang, but no. Ubuntu 22.04.3 LTS Clang: 14.0.0
Actually I cannot reproduce.
You built with -DCMAKE_CXX_FLAGS="-D_GLIBCXX_DEBUG" -DCMAKE_BUILD_TYPE=Release
and did not get the crash?
O thought maybe building with clang, but no. Ubuntu 22.04.3 LTS Clang: 14.0.0
Out of curiosity i tried building with gcc before and i get the same issue there
On what controls are you pressing and this is happening?
On what controls are you pressing and this is happening?
I just open geonkick and press "controls" in the top bar and it crashes right away, before it even renders contols themselves. Probably you have to select a kit first, in my case the selected kit is Unfa-Tutorial.
Important clarification: it only happens in ardour and not in standalone mode, and it does not even reproduce in another session in ardour! Seems like it happens only with specific "spoiled" session...
@treapster I even tried with Windows/Ardour... it is ok.
@treapster Debian/Adour is ok too.. debug build
I wonder if you'll be able to repro it with the session. The existing geonkick instance in this session is broken, but the new ones seem to work... session.zip
Okay, i figured how to reliably reproduce: build in release mode with -D_GLIBCXX_ASSERTIONS
or without any flags, create an empty session, add geonkick instance with some drumkit. Save, close, rebuild with -D_GLIBCXX_DEBUG
, open session, try to open Controls view and it breaks. It's difficult to say whether it indicates an actual issue or it's a side effect of GLIBCXX_DEBUG, because here is what gcc doc says:
Note that this flag changes the sizes and behavior of standard class templates such as std::vector, and therefore you can only link code compiled with debug mode and code compiled without debug mode if no instantiation of a container is passed between the two translation units.
If some std struct(not necessarily a vector) gets serialized in raw form during session saving, then it is may be just incompatibility of GLIBCXX_DEBUG with regular build. Or it may be passed around to non-CXX_DEBUG code, though it is unlikely because it works either way with new sessions. Maybe i really shouldn't use GLIBCXX_DEBUG, or maybe there is in fact some invalid state only detected in debug mode, i dont know:)
@treapster The Geonkick saves the UI view state into the plugin state as json format. When it loads back... it it might load incorrectly. Try to comment this
//if (m.name == "UiSettings" && m.value.IsObject())
// uiSettings->fromJsonObject(m.val
in the function
GeonkickApi::setState(const std::string &data)
{
rapidjson::Document document;
document.Parse(data.c_str());
if (!document.IsObject())
return;
for (const auto &m: document.GetObject()) {
//if (m.name == "UiSettings" && m.value.IsObject())
// uiSettings->fromJsonObject(m.value);
if (m.name == "KitState" && m.value.IsObject()) {
auto kitState = std::make_unique<KitState>();
kitState->fromJsonObject(m.value);
setKitState(std::move(kitState));
}
}
}
Okay, i figured how to reliably reproduce: build in release mode with
-D_GLIBCXX_ASSERTIONS
or without any flags, create an empty session, add geonkick instance with some drumkit. Save, close, rebuild with-D_GLIBCXX_DEBUG
, open session, try to open Controls view and it breaks. It's difficult to say whether it indicates an actual issue or it's a side effect of GLIBCXX_DEBUG, because here is what gcc doc says
I tried this, I cannot reproduce on my end. Also your session is ok too on my end. :)... could you attach the Geonkick plugin binaries?
I tried this, I cannot reproduce on my end. Also your session is ok too on my end. :)... could you attach the Geonkick plugin binaries?
Wow, what a flaky bug is that! Here are the binaries, built on manjaro with clang 16.0.6/libc 2.38/libstdc++ 13.2.1 geonkick.zip
I missed your comment about setState function, but I can say that it is not about GLIBCXX_DEBUG incompatibility: i built the plugin with
cmake -DCMAKE_CXX_FLAGS="-D_GLIBCXX_ASSERTIONS -stdlib=libc++" -DCMAKE_BUILD_TYPE=Debug -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++
And now it just segfaults when choosing e.g AVL drumkit from presets, just after launch in standalone mode. The RkObject::RkObjectImpl::inf_ptr->o_ptr is null if i check in gdb.
Gdb trace:
Here is what happens:
RkObject
runs and RkObject is destroyedo_ptr
is destroyed, it is of type unique_ptr<RkObjectImpl>
I tried removing the lines:
--- a/src/redkite/src/RkObjectImpl.cpp
+++ b/src/redkite/src/RkObjectImpl.cpp
@@ -51,9 +51,6 @@ RkObject::RkObjectImpl::~RkObjectImpl()
}
observersList.clear();
- // Remove myself from the paren object.
- if (inf_ptr->parent())
- inf_ptr->parent()->removeChild(inf_ptr);
}
But it does not help much: it apparently fixes standalone mode(edit: not quite, now segfault happens when you close the window, also because of destructors doing something naughty), but in ardour it causes a different crash(also on window close) so weird that i don't even get a stacktrace:
фев 14 04:06:51 sasus kernel: ArdourGUI[5771]: segfault at 111 ip 0000000000000111 sp 00007ffd2f533da8 error 14 in ardour-8.2.0[55fb67a79000+369000]
фев 14 04:06:51 sasus kernel: Code: Unable to access opcode bytes at RIP 0xe7.
фев 14 04:06:51 sasus systemd[1]: Created slice Slice /system/systemd-coredump.
фев 14 04:06:51 sasus systemd[1]: Started Process Core Dump (PID 5963/UID 0).
фев 14 04:06:54 sasus systemd-coredump[5964]: [🡕] Process 5771 (ArdourGUI) of user 1000 dumped core.
Stack trace of thread 5771:
#0 0x0000000000000111 n/a (n/a + 0x0)
ELF object binary architecture: AMD x86-64
So, there are probably multiple use-after-free/double-delete bugs on Redkite side which don't show up with default gcc+libstdc++ build purely by coincedence, but make demons come out of your nose if you change stdlib or add more runtime validation via LIBCXX_DEBUG.
A quick google search suggests that "singular iterator" in the error message essentially means "uninitialized". But the
mainViewChanged
is generated from macro and doesn't really do anything fancy with iterators aside from for loop over observers, so it's not clear at all how can we get invalid iterator intoRkObjectImpl::observersList
unless it itself is in some invalid state. I hope you'll be able to figure it out...
I think I understand what is the problem. The observers list is updated during the loop iteration over in one of the the observer callback (usually Redkite was not yet designed with this... but in Geonkick there is a such code introduced by ViewState... which maybe is not so useful) ... and we get a problem with the iteration... there is a need to update the code in order to take in consideration these updates... i'll come later with a fix.
void prot \
{ \
for (const auto& ob: rk__observers()) { \
RK_LOG_DEBUG("observerCallback[get]: " << RkObject::name()); \
RK_LOG_DEBUG("observerCallback[get]: " << ob.get()); \
RK_LOG_DEBUG("observerCallback[get]: " << ob.get()->object()->name()); \
auto observer = dynamic_cast<rk__observer_ ##ObseverName *>(ob.get()); \
RK_LOG_DEBUG("object[call_0]..."); \
if (observer) { \
RK_LOG_DEBUG("object[call]..."); \
RK_LOG_DEBUG("object[call] : " << ob->object()->name()); \
observer->observerCallback(val); \
RK_LOG_DEBUG("object[end] : " << ob->object()->name()); \
} \
} \
this one can introduce new observers in the list observer->observerCallback(val);
@treapster Hi, I pushed the commit into the main branch https://github.com/Geonkick-Synthesizer/geonkick/commit/e76fb2d8527e430fa082fabf5e51e6c1295ba162 this should fix the problem.
@treapster Hi, I pushed the commit into the main branch e76fb2d this should fix the problem.
Thanks, it fixed the original problem for build with gcc, but as i said above there are also bugs with RkObject destruction which make the plugin unusable if compiled with clang and -stdlib=libc++
. I'd appreciate if you look into that too:).
Here is what happens:
* the destructor of `RkObject` runs and RkObject is destroyed
RkObject [1] is the interface, and has the member unique_ptr
* Then a single member of RkObject `o_ptr` is destroyed, it is of type `unique_ptr<RkObjectImpl>` * the destructor of o_ptr sets it's own value to zero(while copying it to temporary value), and calls the destructor of RkObjectImpl on the value
I don't think I understand what you mean.
* The destructor of RkObjectImpl references the parent RkObject, which is already destroyed,
No, the parent is not destroyed because it contains the children the one of which is being destroyed... the we are talking RkObject [1].
and tries to access it's o_ptr to get the parent, but it is alreay set to zero and segfaults.
The ~RkObjectImpl() accesses the interface though inf_ptr... inf_ptr->parent() { o_ptr->parent(); }... and if you mean that at this moment the unique o_ptr is reset to to zero? and is not accessible o_ptr->parent(); ?... Hm,... it would crash every time. I don't think. But I'll try to reproduce and see what is happening. Thanks for investigations.
The ~RkObjectImpl() accesses the interface though inf_ptr... inf_ptr->parent() { o_ptr->parent(); }... and if you mean that at this moment the unique o_ptr is reset to to zero?
Yes, i think where i said "parent" i meant interface, and the implementation object references inteface object which already was destroyed, and tries to find a pointer to itself which is already set to zero. I have no idea why it works with gcc standard lib, but with clang+libc++ it crashes every time as you said.
Here is what happens:
* the destructor of `RkObject` runs and RkObject is destroyed
RkObject [1] is the interface, and has the member unique_ptr... cannot be destroyed before unique_ptr.
The destructor of RkObject(and any object)can and will be called before destructors of it's members. So, the RkObject is destroyed(its destructor is run), and then the destructor of unique_ptr runs(which calls the destrucor of contained object)
* Then a single member of RkObject `o_ptr` is destroyed, it is of type `unique_ptr<RkObjectImpl>` * the destructor of o_ptr sets it's own value to zero(while copying it to temporary value), and calls the destructor of RkObjectImpl on the value
I don't think I understand what you mean.
When o_ptr is destroyed, it first sets itself to nullptr, and only then calls the destructor of contained RkObjectImpl. Or, to be more precise, it is what happens when you build with libc++. Here is implementation of relevant unique_ptr methods in libstdc++:
~unique_ptr() noexcept
{
static_assert(__is_invocable<deleter_type&, pointer>::value,
"unique_ptr's deleter must be invocable with a pointer");
auto& __ptr = _M_t._M_ptr();
if (__ptr != nullptr)
get_deleter()(std::move(__ptr));
__ptr = pointer(); // pointer set to zero after destruction
}
And here it is from libc++:
_LIBCPP_INLINE_VISIBILITY _LIBCPP_CONSTEXPR_SINCE_CXX23 ~unique_ptr() { reset(); }
_LIBCPP_INLINE_VISIBILITY _LIBCPP_CONSTEXPR_SINCE_CXX23 void reset(pointer __p = pointer()) _NOEXCEPT {
pointer __tmp = __ptr_.first();
__ptr_.first() = __p; // first set to zero
if (__tmp)
__ptr_.second()(__tmp); // then call deleter
}
So, unlike libstdc++, libc++ protects you from accesing the object via unique_ptr when it is in the process of destruction, and that is why it crashes every time with libc++ but works with libstdc++
@treapster These guys from KDE had the same problem - https://blogs.kde.org/comment/10377#comment-10377
I am quoting:
'That event filter will be triggered during destruction of child widgets (at least for the QEvent::Destroy event, I've also seen it with QEvent::Leave events for instance). And, depending on how the event filter is written, it might use the d pointer of the widget, possibly before checking the event type. That's where it gets interesting: the libc++ implementation of unique_ptr sets it to null before calling the destructor (because it's implemented in terms of reset(nullptr);. In libstdc++ however, unique_ptr's destructor just calls the destructor, its value remains valid during destruction.
So, ok, I need to find a workaround ... I'll think what to do in the case of pImpl pattern. Thanks for the clarification and finding this issue! But, sincerely, I don't like the implementation of the libc++ in this case... because the pointer needs to behave like a raw pointer when working with it... smart pointer to remain transparent.
I suspect that the following fix should be enough:
index c5b926c..ddcd264 100644
--- a/src/redkite/src/RkObjectImpl.cpp
+++ b/src/redkite/src/RkObjectImpl.cpp
@@ -52,8 +52,8 @@ RkObject::RkObjectImpl::~RkObjectImpl()
observersList.clear();
// Remove myself from the paren object.
- if (inf_ptr->parent())
- inf_ptr->parent()->removeChild(inf_ptr);
+ if (parentObject)
+ parentObject->removeChild(inf_ptr);
}
void RkObject::RkObjectImpl::removeChildrens()
But, sincerely, I don't like the implementation of the libc++ in this case... because the pointer needs to behave like a raw pointer when working with it... smart pointer to remain transparent.
I think libc++ is doing the right thing, because in the thread you linked they say that access to library objects during destruction is UB according to standard library, and in general it is grey zone whether doing something with partly-destroyed object is legal. In our case we had an actual UB, because the code in ~RkObjectImpl tried calling parent() method on inf_ptr of already destroyed interface(not mid-destruction!) , and libc++ caught this case indirectly through setting o_ptr to 0. We prevented UB, so it's a win for libc++:)
parentObject->removeChild(inf_ptr); yes, as exception for the destructor, but this puts limitations because I might need other logic from the interface not only the parent in the destructor... but ok... for now this is the only modification... the derived impl classes seems not to have additional logic in the destructors.
parentObject->removeChild(inf_ptr); yes, as exception for the destructor, but this puts limitations because I might need other logic from the interface not only the parent in the destructor... but ok... for now this is the only modification... the derived impl classes seems not to have additional logic in the destructors.
Since the interface is already destroyed at that point, we can't get anything from it in the destructor of impl. But as you said, we don't need it either, so it should be fine. I also wonder why redkite was designed that way - if we had an actual interface/abstract class and an implementation derived from it, the derived class could be sure that it's base is alive at all times, since the base is destroyed last. Current implementation kinda looks like it was rewritten from c.
@treapster "I also wonder why redkite was designed that way"... the story is long... the version 1.0 of Geonkick was developed in Qt, and it was only the standalone version.... than people asked for the plugin, and when the plugin was done than Qt for embedded UI is a problem. I needed for a small gui toolkit that met the requirements for the embedded GUI. There were some but... I was too lazy, and didn't want to change the code of Geonkick, and developed an interface almost compatible with Qt... with the same behavior. Thus, abstract interfaces would require to changed many things in Geonkick.
But yes, pimpl not actually needed even in this case... but I was thinking maybe Redkite would be used as shared library too and there is a need export the symbols (and keep the binary compatibility). But actually for the plugins is required only linked statically... and better maybe all in the header... no libredkite.a would be needed too... but I don't want to change now this library because it worked well until now.
What I am doing now here (https://github.com/Geonkick-Synthesizer/redkite) is an upgrade of Redkite, and instead of Widgets (that are system windows) to be only graphics widgets, and only one system window. This will not load the hosts with many new system windows when the plugin is embedded, but only one window... and the rest of the widgets are graphics, the interface will remain the same, and all stuff also related to pimpl. Probably this version of Redkite I hope to start use in Geonkick in the next releases... but there is some work to be done.
@iurienistor thanks for a detailed response, i didn't know about pimpl and how it allows to preserve ABI compatibility. And the qt reimplementation part is very interesting! BTW, did you know about JUCE when starting with geonkick, and if yes, why did you stick with qt?
but I don't want to change now this library because it worked well until now.
Now that we caught the bug, i hope it will continue working just as well. I erroneously said above that there may be multiple memory ussues, but it was just my incorrect fix which introduced use-after-free:). With the latest patch using parentObject, all segfaults are gone.
What I am doing now here (https://github.com/Geonkick-Synthesizer/redkite) is an upgrade of Redkite, and instead of Widgets (that are system windows) to be only graphics widgets, and only one system window. This will not load the hosts with many new system windows when the plugin is embedded, but only one window... and the rest of the widgets are graphics, the interface will remain the same, and all stuff also related to pimpl. Probably this version of Redkite I hope to start use in Geonkick in the next releases... but there is some work to be done.
Thank you for working on that and improving Redkite further!
And the qt reimplementation part is very interesting! BTW, did you know about JUCE when starting with geonkick, and if yes, why did you stick with qt?
Qt implementation of pimpl idiom has something that also prevents creating additional memory when there are inheritance, the same added in Redkite. Geonkick first was developed as standalone, never thought to make as plugin.... even more to know issues related to plugins that are GUI embedded and need to be self-sufficient... and Qt is not sutiable for this.
@treapster
With the latest patch using parentObject, all segfaults are gone.
I pushed a commit to the main branch - https://github.com/Geonkick-Synthesizer/geonkick/commit/1d20f0069c4bbd7213609a355a727af3172d9b20
Thanks for helping to find and fix this bug!
Latest commit from main. When pressing controls button, everything freezes and crashes. Build config:
cmake -DCMAKE_CXX_FLAGS="-D_GLIBCXX_DEBUG" -DCMAKE_BUILD_TYPE=Debug -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++
Stack trace:
Demangled relevant part of the stacktrace:
A quick google search suggests that "singular iterator" in the error message essentially means "uninitialized". But the
mainViewChanged
is generated from macro and doesn't really do anything fancy with iterators aside from for loop over observers, so it's not clear at all how can we get invalid iterator intoRkObjectImpl::observersList
unless it itself is in some invalid state. I hope you'll be able to figure it out...Sidenote: consider building geonkick with
D_GLIBCXX_DEBUG
and using it for some time, maybe you'll find some more crashes before i do:)