flathub / org.darktable.Darktable

https://flathub.org/apps/details/org.darktable.Darktable

Darktable crashes and then fails to start again #54

Open Hofer-Julian opened 4 years ago

Hofer-Julian commented 4 years ago

I've noticed that darktable recently sometimes crashes and then fails to start, showing this error dialog: [screenshot]

Only a removal of /home/user/.var/app/org.darktable.Darktable lets it run again.

When I delete only the lock files in /home/user/.var/app/org.darktable.Darktable and start darktable from the terminal, it outputs:

an error occurred while trying to execute gdb. please check if gdb is installed on your system.
backtrace written to /tmp/darktable_bt_8PE3I0.txt

hfiguiere commented 4 years ago

Could you provide that backtrace?

Hofer-Julian commented 4 years ago

I just reproduced this error. After finding nothing in my /tmp dir I searched my whole system, but unfortunately this txt file does not exist.

Hofer-Julian commented 4 years ago

I forgot to mention that I run: Fedora 31, GNOME, Flatpak 1.4.4, darktable 3.0.1

Hofer-Julian commented 4 years ago

@hfiguiere I've been able to reproduce the original crash which leaves me with a locked state.

It happens every time I go to nautilus and use open-with to open a RAW file with darktable.

It does not seem to happen on every attempt, but after opening at least three files this way it crashes.

paperdigits commented 4 years ago

Is darktable already open when you use open with from the file manager?

Hofer-Julian commented 4 years ago

Is darktable already open when you use open with from the file manager?

@paperdigits, no it is not.

Hofer-Julian commented 4 years ago

I've just noticed that the crash from opening a single file is also reproducible with Fedora's non-flatpak package (3.0.1), e.g. by opening a file from the CLI with "darktable IMG_1360_QP3MHiz.CR2". But afterwards it doesn't leave me with a locked database.

hfiguiere commented 4 years ago

What's the upstream bug # ?

Hofer-Julian commented 4 years ago

There is none yet, because I thought it was a specific flatpak bug (which it still seems to be to some degree)

Hofer-Julian commented 4 years ago

I found out something new. The database is also locked for me when I kill darktable e.g. with the gnome system monitor. This should be easy to reproduce and doesn't happen with the non-flatpak package from fedora.

@paperdigits and @hfiguiere could you please try this on your machines:

  1. Start darktable
  2. End darktable process
  3. Try to start darktable again

paperdigits commented 4 years ago

My desktop environment is Plasma, so I killed darktable from Plasma's System Monitor. On reopen, I get the bug as well, both library and data dbs have a lock file.

Hofer-Julian commented 4 years ago

Okay, this makes it clear that I encountered two bugs here:

  1. darktable seems to sometimes crash when started with "open with" or with the cli + filename -> I'll have to properly reproduce this error
  2. darktable-flatpak seems to be left in a locked state if its process is ended in any way other than a proper window exit. -> this is quite a major bug, but I don't know darktable well enough to guess where the problem might be.

@paperdigits or @hfiguiere, do you have an idea? Or alternatively, do you know who to ask?

paperdigits commented 4 years ago

Is the task manager sending sigterm or sigkill... Or something else? Is flatpak interpreting the kill command as something else?

I asked in IRC, but didn't get a great answer.
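
For context on why the signal type could matter: a process can install a handler for SIGTERM and tidy up (for example, remove a lock file) before exiting, while SIGKILL cannot be caught at all. The following is only a minimal sketch of that difference and is not taken from darktable or flatpak; `sketch_noop_handler` and `sketch_can_handle` are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical no-op handler, used only to probe whether a handler can be installed. */
void sketch_noop_handler(int signum)
{
  (void)signum;
}

/* Returns true if the process is allowed to install its own handler for signum. */
bool sketch_can_handle(int signum)
{
  assert(signum > 0);
  struct sigaction probe, previous;
  probe.sa_handler = sketch_noop_handler;
  sigemptyset(&probe.sa_mask);
  probe.sa_flags = 0;
  /* sigaction() refuses to change the disposition of SIGKILL and SIGSTOP. */
  if(sigaction(signum, &probe, &previous) != 0) return false;
  sigaction(signum, &previous, NULL); /* restore the previous disposition */
  return true;
}
```

Under this sketch, `sketch_can_handle(SIGTERM)` succeeds and `sketch_can_handle(SIGKILL)` fails, so any cleanup tied to a signal handler would never run after a SIGKILL or a hard crash.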

Hofer-Julian commented 4 years ago

Is the task manager sending sigterm or sigkill... Or something else? Is flatpak interpreting the kill command as something else? I asked in IRC, but didn't get a great answer.

I don't think it is related to sigterm or sigkill, as this bug also occurs when darktable crashes :)

However, in the meantime I made progress as well. Running flatpak run --command=sh org.darktable.Darktable and inside this shell running top revealed that inside a flatpak you don't see the other processes of your PC. This explains the small PID of 2 in the first picture I've uploaded.

I then searched for the error message in darktable's source code and found this: https://github.com/darktable-org/darktable/blob/07308e94cfaad6bc6d6d3a98bb26f330389e73bb/src/common/database.c#L2161

As far as I can see the start of darktable works like this:

  1. Is there a lock file? No -> start. Yes -> go to step 2.

  2. Is the PID mentioned in the lock file still alive? No -> start. Yes -> refuse to start and pop up the dialog.

In theory, pid_is_alive would check if the pid is a darktable instance, but this seems to fail inside the flatpak.
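
The two-step check described above can be sketched roughly as follows. This is not darktable's actual code: the function names are hypothetical, and it assumes the lock file stores the owning PID as decimal text and that liveness is probed with `kill(pid, 0)`.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdbool.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper: true if a process with this PID exists in our PID namespace. */
bool sketch_pid_exists(pid_t pid)
{
  if(pid <= 0) return false;
  /* Signal 0 delivers nothing; it only runs the existence and permission checks. */
  if(kill(pid, 0) == 0) return true;
  return errno == EPERM; /* the process exists but we may not signal it */
}

/* Hypothetical helper: read the PID stored in a lock file.
   Returns -1 if the file is missing or does not start with a number. */
pid_t sketch_read_lock_pid(const char *lock_path)
{
  assert(lock_path != NULL);
  FILE *f = fopen(lock_path, "r");
  if(!f) return -1;
  long value = -1;
  if(fscanf(f, "%ld", &value) != 1) value = -1;
  fclose(f);
  return (pid_t)value;
}

/* Returns true if startup may proceed, false if the database looks locked. */
bool sketch_may_start(const char *lock_path)
{
  const pid_t owner = sketch_read_lock_pid(lock_path);
  if(owner < 0) return true;        /* step 1: no usable lock file */
  return !sketch_pid_exists(owner); /* step 2: the lock is stale if its owner is gone */
}
```

Note that a check like `sketch_pid_exists` only says that some process has that PID. It says nothing about whether that process is the one that wrote the lock file.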

I guess more could be found out by:

  1. ~Recompiling the flatpak with gdb (at least I couldn't find gdb in the flatpak)~ I just found http://docs.flatpak.org/en/latest/debugging.html
  2. Putting a breakpoint here https://github.com/darktable-org/darktable/blob/07308e94cfaad6bc6d6d3a98bb26f330389e73bb/src/common/database.c#L2080
  3. Checking why the program doesn't see that this pid is not a darktable instance.

hfiguiere commented 4 years ago

I am seeing the crash and the problem.

rrenomeron commented 3 years ago

I've run into this issue whenever the darktable flatpak crashes. I think it's a quirk of the way flatpak works, and I'm not sure there's a way to fix it without modifying darktable itself.

IIRC, flatpak sets up a pid namespace so that the sandboxed darktable always thinks it's PID 2. So when you start it up again after a crash, it sees that PID 2 "owns" the lock, and when it finds out that PID 2 exists, it complains. In an unsandboxed darktable, you're unlikely to run into this unless the crashed darktable process is a zombie or is not yet killed.

How to fix this is a tough problem. I don't think checking if the pid corresponds to a darktable instance will work -- because technically, pid 2 is a darktable instance. Revising the logic from "check if the pid is alive" to "check if the pid is alive and different from my pid" might work, but I think the implications of doing that need to be thought through to avoid the risk of messing up the database.
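
To make the proposed revision concrete, here is a minimal sketch of "alive and different from my pid". It is not a patch against darktable, all names are hypothetical, and liveness is assumed to be probed with `kill(pid, 0)`.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: true if a process with this PID exists in our PID namespace. */
bool sketch_pid_alive(pid_t pid)
{
  if(pid <= 0) return false;
  if(kill(pid, 0) == 0) return true;
  return errno == EPERM; /* the process exists but we may not signal it */
}

/* Revised check: the lock counts as held only if its PID is alive AND is not our own PID. */
bool sketch_lock_held_by_other(pid_t lock_pid, pid_t self_pid)
{
  assert(self_pid > 0);
  if(lock_pid <= 0) return false;        /* nothing usable recorded in the lock */
  if(lock_pid == self_pid) return false; /* recorded PID is ours: treat the lock as stale */
  return sketch_pid_alive(lock_pid);
}

/* Convenience wrapper using the calling process's own PID. */
bool sketch_lock_blocks_startup(pid_t lock_pid)
{
  return sketch_lock_held_by_other(lock_pid, getpid());
}
```

With the plain liveness check, a lock file recording the process's own PID always looks held, which matches the behaviour seen after a crash in the sandbox. The revised check lets startup proceed in that case. The risk mentioned above still applies: if every sandboxed instance sees itself as the same PID, a second instance started while the first is still running would presumably also treat the lock as stale.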