hydrogen-music / hydrogen

The advanced drum machine for Linux, macOS, and Windows
http://www.hydrogen-music.org
GNU General Public License v2.0
1.07k stars 172 forks source link

Fix crash due to undefined order of static initialisers #2019

Closed cme closed 2 months ago

cme commented 3 months ago

Filesystem::log_file_path relies on a static-initialised value set in Filesystem::__usr_log_path. We should avoid calling this from static initializers because the order of execution of static initialisers is undefined.

Alternatively (or additionally) removing the dependency on static initializers in Filesystem::log_file_path would be benificial but ordering dependencies there are trickier

cme commented 3 months ago

Hmm, there seem to be more dependencies here. Failing to load resources on macOS.

theGreatWhiteShark commented 3 months ago

Failing to load resources on macOS.

And we did not catch it in the test pipeline since we only execute the test binary and not hydrogen itself?

Hmm... Well, we have OSC command QUIT that could be used to terminate an instance started in the pipeline. But that would require a second thread.

I could also live with a "hidden" CLI option to initiate teardown as soon as hydrogen managed to start up which is only available in debug builds.

But we should ensure the pipeline catches critical failures during startup.

cme commented 3 months ago

Hmm, since 7f30f36e5c21ed8f7b74470d761047ca6255928c we fail to find resources (on macOS and on Linux for me) because during Filesystem::bootstrap there's no QCoreApplication instance so QCoreApplication::applicationDirPath() returns an empty string. I think that's why we had a temporary QCoreApplication instance in main() previously. Complex and annoying dependencies, but I'm curious how it's worked for you since then @theGreatWhiteShark ?

theGreatWhiteShark commented 3 months ago

but I'm curious how it's worked for you since then @theGreatWhiteShark ?

Excellent question. I have no idea. It just works like a charm. But I just built an AppImage locally and using this I can reproduce it.

I also double checked the Parser implementation as well as a couple of other things later using the Windows build on a local machine. I don't know. Maybe different versions of Qt's runtime?

I think that's why we had a temporary QCoreApplication instance in main() previously.

I see. How about we do not make Parser::parse delete but return the QCoreApplication and let the calling routine handle its deletion?

cme commented 3 months ago

I think that's why we had a temporary QCoreApplication instance in main() previously.

I see. How about we do not make Parser::parse delete but return the QCoreApplication and let the calling routine handle its deletion?

Or perhaps better -- use QCoreApplication::instance() to access an existing instance to pass to the QCommandLineParser::process() call, and maybe optionally constructing a temporary one (as it currently does) if there's no current instance to use?

theGreatWhiteShark commented 3 months ago

Or perhaps better -- use QCoreApplication::instance() to access an existing instance to pass to the QCommandLineParser::process() call, and maybe optionally constructing a temporary one (as it currently does) if there's no current instance to use?

Sounds good

theGreatWhiteShark commented 3 months ago

Works fine on Linux.

I also tested in on Windows but there Reporter does not work properly (but not related to this PR).

Although handleFatalSignal in main.cpp is called with signal 11, in Reporter::on_finished m_pChild->exitStatus() reports QProcess::NormalExit. On Windows we might hand the signal over to Reporter::report() and show the popup on non-zero values regardless of the exit status.

Was there a reason for the check of exit status in Reporter::report(). It seems it only gets triggered within handleFatalSignal. Shouldn't this guarantee the presence of a crash anyway?

cme commented 2 months ago

Although handleFatalSignal in main.cpp is called with signal 11, in Reporter::on_finished m_pChild->exitStatus() reports QProcess::NormalExit. On Windows we might hand the signal over to Reporter::report() and show the popup on non-zero values regardless of the exit status.

Is that on Linux? Hm. It should report an abnormal failure because handleFatalSignal in the child re-raise()s the signal. Odd. It certainly works on macOS anyway. I'll try provoking some crashes on Linux to see why it returns as normal to the parent.

Was there a reason for the check of exit status in Reporter::report(). It seems it only gets triggered within handleFatalSignal. Shouldn't this guarantee the presence of a crash anyway?

Do you mean in Reporter::on_finished()? That should be executed in the parent process, on normal and abnormal termination, so needs to be silent if the child exits normally.

theGreatWhiteShark commented 2 months ago

Do you mean in Reporter::on_finished()?

Yes. You're right.

Is that on Linux? Hm. It should report an abnormal failure because handleFatalSignal in the child re-raise()s the signal. Odd. It certainly works on macOS anyway. I'll try provoking some crashes on Linux to see why it returns as normal to the parent.

Nope. Linux works just fine. It happens on Windows 10.

That should be executed in the parent process, on normal and abnormal termination, so needs to be silent if the child exits normally.

I see. Could we make main::handleFatalSignal set a flag in Reporter which overwrites a seemingly normal exit? Or would that cause problems on other platforms? Because for some reason the crashes are indicated by Qt as normal exits on my local Windows machine causing the Reporter to not work.

cme commented 2 months ago

Nope. Linux works just fine. It happens on Windows 10.

Aaaah. Windows. Right.

That should be executed in the parent process, on normal and abnormal termination, so needs to be silent if the child exits normally.

I see. Could we make main::handleFatalSignal set a flag in Reporter which overwrites a seemingly normal exit?

I guess the only communication between the parent and child other than the exit status is the stdout. Given that the parent already monitors the output for a crash context identifier, maybe it's reasonable to indicate a specific failure or normal exit on stdout and ignore the exit status. But first I'll have a dig into what happens on Windows.

theGreatWhiteShark commented 2 months ago

But first I'll have a dig into what happens on Windows.

But take your time. I would not consider this blocking for 1.2.4

cme commented 2 months ago

Simple answer (#2031) is: check the good old-fashioned return code as well. :D