Closed Mark-64 closed 3 years ago
so what I am doing is not permitted, or ?
Lua is supposed to be used in lighttable only, with a couple of exceptions. You can draw a guide in the crop and rotate module, take a snapshot, query the image currently selected, and select another image. I'm not sure how the exporter being visible in darkroom mode is going to work with Lua. I was just getting ready to check when I saw this. I have a feeling it will be OK, because it's wrapped in the exporter.
The backtrace appears to show that darkroom couldn't access the module so perhaps adding a lib to darkroom doesn't work correctly. I'm guessing it was never tested since it isn't supposed to happen.
I tried this in Linux and it worked. I created a jpg from a raw, edited it in gimp, overwrote the image with the result, and it worked fine. Also edited a copy and it imported fine. Tried the shortcut and the lib.
Yes, sometimes it works in Windows too, but many times crashes when doing "edit a copy" from the lib gui in darkroom. Never happens in lighttable.
With the debugger, it is evident that at certain point _display_module_trouble_message_callback()
is called with an unitialized value of module
, 100% reproducible, but it doesn't always produces a crash and as always Linux is more forgiving in those cases. I will try under Linux too.
If you want to test, you can use this simplified script that creates one lib in both lighttable and darkroom with only one buttom "copy and import", which the failing action.
test.lua.txt
Tried in Linux and there is no issue, pretty weird.
Here's the policy on lua
The general limitation of the lua API is the lighttable view. In other word we aim at being able to do with lua everything that can be done in the lighttable, but anything that can only be done in darkroom mode is off-limit
Yes, I remember I read this already somewhere. In reality the action: <copying an image + importing the copy> is a typical lighttable action. The question is: why this fails if it is performed while being in darkroom and why only on Windows? Also it fails in a function which has been added recently if I am not mistaken, so it could be something trivial after all. I will have to investigate more.
I tried it on win7 with 3.4.0 and got no crashes. Did 10 copy and imports.
You could try 3.4.0, or 3.4.1.1 and see if it crashes. If not, do a git bisect to figure out what commit caused the problem.
No problems until 3.4.1.1. It is something done later, I will try to bisect
This issue starts as a consequence of #7966 which introduced modules trouble messages based on signals. I believe it is some sort of race condition between generation of filmstrip thumbnail (triggered by lua importing an image while in darkroom) and refresh of the center image. In fact, if I introduce a delay between the two in my lua script there is no more crash, see below
-- refresh darkroom view
if dt.gui.current_view() == dt.gui.views.darkroom then
dt.control.sleep(1500)
dt.gui.views.darkroom.display_image(new_image)
end
That would explain the weird random behaviour and why it is different under Linux.
I guess this is a case unforeseen, we are a bit border line here. @TurboGit can I ask your opinion ?
@Mark-64 does the problem occur if the module isn't visible in darkroom view and you only use the shortcut?
does the problem occur if the module isn't visible in darkroom view and you only use the shortcut?
Current script has shortcuts for the "edit" action and that is fine. However if I modify the script to associate a shortcut to "edit a copy" action, then yes, the problem occurs even if the module is not visible in darkroom. So it has nothing to do with the visibility of the module, it is related to importing a new image while in darkroom. So I remain of my idea.
I might have found the solution here, but I need the opinion of an expert dev.
_display_module_trouble_message_callback()
, where the segfault occurs, is triggered by a signal raised by dt_iop_set_module_trouble_message()
.
In turns, that can be called by
_display_wb_error() in temperature.c
_check_for_wb_issue_and_set_trouble_message() in channelmixerrgb.c
dt_iop_have_required_input_format() in imageop.c
Now, the first function has the semaphores ++darktable.gui->reset
and --darktable.gui->reset
at the beginnind and at the end, while the other two don't have.
Putting those, there is no more fault. Maybe doing this we avoid a asynch signals causing the segfault?
@wpferguson what do you think? @TurboGit you implemented the module trouble signals, I really appreciate your opinion here
@Mark-64 I had another thought (actually another test). Enable contrib/gimp, open an image in darkroom mode, then use the exporter and select "edit with GIMP", then click export. This does an export of the image, opens it in GIMP, then on close imports the image and groups it with the original, but it doesn't update the display. You could add a line, after the import, to have it do that. That way we would have a separate code path that triggers the bug.
I'll look at the code you pointed out above and see what I can figure out.
I think I might have a scenario, though I'm not sure why it doesn't happen in linux. The image you import is always an ldr (jpg, tif, i.e. non-raw). In lighttable when you import an image it's checked when generating the thumbnail and the image flags are set, which in this case identify the image as not raw so therefore no white balance will be attempted. When you import in darkroom and immediately display, the image (sometimes) hasn't been identified as a not raw, so darktable attempts to apply a white balance and then throws an error because it can't. And maybe it doesn't occur on linux because linux processes it just a bit faster, thus avoiding the problem.
EDIT: However, segfaulting when a trouble message occurs is a problem that still needs addressed. @Mark-64 I wonder if you took your test script from above and made it loop 5 or 10 times if it would cause the error on linux?
Now, the first function has the semaphores ++darktable.gui->reset and --darktable.gui->reset at the beginnind and at the end, while the other two don't have.
Why don't you put a print statement in those three functions to see which one is actually raising the trouble message? Run it until it crashes and see which one printed.
Putting those, there is no more fault. Maybe doing this we avoid a asynch signals causing the segfault?
Did you get:
no error message and crash before refresh of the central image
If you wrapped the trouble message with the ++|-- darktable.gui->reset what happened?
I tried the loop in linux. Still no crash though I did get
[dt_ioppr_transform_image_colorspace_cl] invalid conversion from 2 to 0
[_transform_matrix] invalid conversion from 2 to 0
[dt_ioppr_transform_image_colorspace] invalid conversion from 2 to 0
each time I duplicated an image and imported.
the fault happens (when it happens) in _display_module_trouble_message_callback()
, with a signal raised by dt_iop_set_module_trouble_message()
, called by dt_iop_have_required_input_format()
, called by module "gamma". No need to put print, I see from debugger.
Also dt_iop_have_required_input_format()
is often executed by another thread than the gui main thread.
Now, dt_iop_have_required_input_format()
has a buffer copy inside dt_iop_copy_image_roi()
. I suspect that this operation is long enough that, without semaphores, there is a significant chance to get another signal raised in the meantime
If you wrapped the trouble message with the ++|-- darktable.gui->reset what happened?
I get no errors in the console. I can do a PR for testing with the modification, give me a few minutes
I will do the other tests you suggested later
Are you running darktable from the build tree, or generating an installer version and installing it? If you run it under the mingw shell, lua behaves differently, since the shell is actually a bash shell and not a windows shell.
I run within my Eclipse IDE, in order to have the debugger, but I get the crash also with my regular installed version
Here's the issues I see...
The gamma module (display encoding) is somehow getting bad data. You can cause it (sometimes) by importing an image and displaying it while in darkroom mode on win 10. We haven't been able to recreate on linux. I'll try building and recreating on win 7.
When the gamma module uses dt_iop_set_module_trouble_message() and the signal is raised the callback, _display_module_trouble_message_callback(), segfaults at darkroom.c, line 224. Are one of the tests undefined?
I had another thought (actually another test). Enable contrib/gimp, open an image in darkroom mode, then use the exporter and select "edit with GIMP", then click export. This does an export of the image, opens it in GIMP, then on close imports the image and groups it with the original, but it doesn't update the display. You could add a line, after the import, to have it do that. That way we would have a separate code path that triggers the bug.
segfault also in this scenario (without PR)
I just did 29 copy and imports on the latest git on win7 with no crash. It looks like this is a Win10 "feature".
Yes and unfortunately, to more extensive tests, the PR does not solve the issue :-(
Ok, so maybe a race condition indeed. It seems that you said this issue to be related on signals and also talking about Lua. Have you tried removing luarc completely? Can you still reproduce?
Sorry if you already tried, I may have missed a message.
Can you test setting plugins/darkroom/export/visible
in darktablerc to false
?
Can you test #8273 ?
In my case I get the crash with both #8273 and plugins/darkroom/export/visible
set to false
.
In my scenario I am already in darkroom and importing an image, which I do with a lua shortcut (which forces a filmstrip refresh), triggers the issue.
It is intermittent but it's there. I can't reproduce like @chhil in #8262 by the way, but it is obviously the same thing. Shall we continue the discussione there?
closed duplicate
@Mark-64 Not related to the issue. Do you have some eclipse setup guidelines on windows for debugging that you could share?
I had setup NetBeans long time ago but got rid of it as it just took too long to start and debug. Eclipse I am familiar with as its my primary ide for day job in java, so I any guidelines setting it up would be greatly appreciated.
@chhil Sorry I have completely missed your message. Some time ago I wrote a guide for darktable in Eclipse. I will try to update it and share. It will take a day or two.
@chhil see https://github.com/Mark-64/darktable-eclipse plase let me know if the guide works or if there is anything missing / unclear
With the recent introduction of export lib in darkroom, I thought to modify my ext_editor.lua script so to have the possibility to do round trips to external programs with .tiff files while in darkroom, by using the gui (this is already possible with the current script through shortcuts).
So I modified the script this way, by adding the darkroom line
In essence, the failing branch of the script creates a physical copy of the image, calls the external program on it and import the result in the database. While this works perfectly if called from lighttable, it creates a segfault in darkroom. Here is the backtrace. darktable_bt_DESMY0.txt
With a debugger, I can see that function
_display_module_trouble_message_callback()
is called with an unitialized value ofmodule
, which meansDT_SIGNAL_TROUBLE_MESSAGE
is raised incorrectly somewhere.Perhaps importing pics in database is not allowed while in darkroom, so what I am doing is not permitted, or ? @wpferguson can you have a look ?