eclipse-platform / eclipse.platform.swt

Eclipse SWT
https://www.eclipse.org/swt/
Eclipse Public License 2.0
118 stars 139 forks source link

SIGSEGV in `Tree.cellDataProc` when calling `TreeItem.setImage` #678

Open torokati44 opened 1 year ago

torokati44 commented 1 year ago

Describe the bug

Virtual Tree widget crashes JVM (through GTK3) when setting Image onto TreeItem in Listener to SWT.SetData. This is a continuation of https://bugs.eclipse.org/bugs/show_bug.cgi?id=573090.

---------------  T H R E A D  ---------------

Current thread (0x00007ffa500284a0):  JavaThread "main" [_thread_in_native, id=112372, stack(0x00007ffa57900000,0x00007ffa57a00000)]

Stack: [0x00007ffa57900000,0x00007ffa57a00000],  sp=0x00007ffa579fc318,  free space=1008k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C  [libgobject-2.0.so.0+0x3b251]  g_type_check_instance_is_fundamentally_a+0x11
C  [libswt-pi3-gtk-4958r2.so+0x4b609]  Java_org_eclipse_swt_internal_gtk_OS_g_1object_1set__J_3BJJ+0x4a

Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
J 11988  org.eclipse.swt.internal.gtk.OS.g_object_set(J[BJJ)V (0 bytes) @ 0x00007ffa491c5883 [0x00007ffa491c5820+0x0000000000000063]
J 10921 c1 org.eclipse.swt.widgets.Tree.cellDataProc(JJJJJ)J (486 bytes) @ 0x00007ffa419b8bf4 [0x00007ffa419b81e0+0x0000000000000a14]
J 10920 c1 org.eclipse.swt.widgets.Display.cellDataProc(JJJJJ)J (29 bytes) @ 0x00007ffa41f4ac5c [0x00007ffa41f4aa60+0x00000000000001fc]
v  ~StubRoutines::call_stub
J 11619  org.eclipse.swt.internal.gtk3.GTK3.gtk_main_iteration_do(Z)Z (0 bytes) @ 0x00007ffa49380e69 [0x00007ffa49380e20+0x0000000000000049]
J 11623 c1 org.eclipse.swt.widgets.Display.readAndDispatch()Z (88 bytes) @ 0x00007ffa423e622c [0x00007ffa423e6060+0x00000000000001cc]

To Reproduce

package swt_tree_crash;

import org.eclipse.swt.SWT;
import org.eclipse.swt.graphics.Image;
import org.eclipse.swt.layout.FillLayout;
import org.eclipse.swt.widgets.Display;
import org.eclipse.swt.widgets.Event;
import org.eclipse.swt.widgets.Listener;
import org.eclipse.swt.widgets.Shell;
import org.eclipse.swt.widgets.Tree;
import org.eclipse.swt.widgets.TreeItem;

public class Main {
    public static void main(String[] args) {
        Display display = new Display();
        Shell shell = new Shell(display);
        shell.setLayout(new FillLayout());

        Tree tree = new Tree(shell, SWT.VIRTUAL);
        tree.addListener(SWT.SetData, new Listener() {
            public void handleEvent(final Event e) {
                TreeItem item = (TreeItem)e.item;
                item.setText(0, "A");
                item.setImage(new Image(display, 20, 20)); // <-- this is the critical line!
            }
        });
        tree.setItemCount(1);

        shell.setSize(400, 300);
        shell.open();

        while (!shell.isDisposed()) {
            if (!display.readAndDispatch()) {
                display.sleep();
            }
        }
        display.dispose();
    }
}

Expected behavior

The application always shows a window with a tree widget in it, containing one item, with a small white icon.

Screenshots

Not applicable.

Environment:

  1. Select the platform(s) on which the behavior is seen:

      • [ ] All OS
      • [ ] Windows
      • [X] Linux
      • [ ] macOS
  2. Additional OS info (e.g. OS version, Linux Desktop, etc)

Operating System: Fedora Linux 38 KDE Plasma Version: 5.27.4 KDE Frameworks Version: 5.105.0 Qt Version: 5.15.9 Kernel Version: 6.2.15-300.fc38.x86_64 (64-bit) Graphics Platform: Wayland Processors: 16 × AMD Ryzen 7 3800X 8-Core Processor Memory: 31.2 GiB of RAM Graphics Processor: AMD Radeon RX 550 Series

org.eclipse.swt.internal.gtk.version=3.24.37

  1. JRE/JDK version

JRE version: OpenJDK Runtime Environment Temurin-17.0.7+7 (17.0.7+7) (build 17.0.7+7) Java VM: OpenJDK 64-Bit Server VM Temurin-17.0.7+7 (17.0.7+7, mixed mode, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64)

Version since

Tested on Eclipse 4.27 (SWT 4.958).

Workaround (or) Additional context

The crash doesn't always happen, but fairly often. Removing the TreeItem.setImage call avoids the crash entirely.

SyntevoAlex commented 1 year ago

Sorry about the waiting time, SWT currently has few active maintainers. I'm generally interested in JVM crashes, and plan to have a look at this eventually. Unfortunately the waiting time could be around a month.

torokati44 commented 1 year ago

Hello, just a ping on this issue: We are still often encountering this crash on an almost daily basis... :/

SyntevoAlex commented 1 year ago

SWT is currently low on active contributors. It would be nice if you can debug the problem yourself and prepare a fix. Otherwise, alas, you'll have to wait more, and there's no ETA.

torokati44 commented 1 year ago

Otherwise, alas, you'll have to wait more, and there's no ETA.

Of course, I understand, FOSS is what it is...

It would be nice if you can debug the problem yourself and prepare a fix.

Actually we ended up tossing the problematic pieces of code (in this one and #697) into a Display.asyncExec callback, so they can be done at a time when Gtk is hopefully no longer confused. It's dumb, but appears to be working so far.

Mailaender commented 1 year ago

I found a way to reproduce all the time in a @flatpak runtime, where also all workarounds fail.

SyntevoAlex commented 12 months ago

It would be useful if you describe how exactly do you reproduce.

Mailaender commented 12 months ago

Sorry, I thought that it won't be that helpful because you need to build a package to test.

Here it is https://github.com/Mailaender/flathub/compare/chemclipse Check the README for the commands to build it.

Click Demo on the green intro screen to trigger the crash.

Use

runtime-version: 20.08

and rebuild to see how the application looked before the crash was introduced.

SyntevoAlex commented 12 months ago

Oh, I didn't expect a whole other application. Indeed I don't have the time to debug it. Can you please post your hs_err_pid.log crash reports?

Mailaender commented 12 months ago

https://gist.github.com/Mailaender/24f56492ded2586edfecf2e4df6d5232

Mailaender commented 11 months ago

You can also indirectly trigger this with

public class MyLabelProvider extends LabelProvider implements ILabelProvider {

    @Override
    public Image getImage(Object element) {
        // return something other then null
    }
}

This is sometimes difficult to reproduce. I had my best chances when there was only one unexpanded item in the tree viewer.

Mailaender commented 9 months ago

I can't reproduce this in a development environment so this might be already accidentally fixed?

grafik

torokati44 commented 9 months ago

We still have the Display.asyncExec workaround in place, so I can't comment on it, as we haven't encountered it since. It's possible it got fixed, but I can't verify it due to lack of time, sorry.

Mailaender commented 9 months ago

There are occasions where the Display.asyncExec either does not work or only lessens the problem.

I just copied your snippet into my 2023-03 project and it crashed. It does not when I try it here.

The instructions at https://www.eclipse.org/swt/git.php are somehow outdated or not very precise, but this should work:

Install git-lfs. Git clone this repository. Only import the repositories (plus snippets) and rename or copy

I just ignored the API baseline error and got a test environment where the problem is not reproducible or hopefully already fixed.

Mailaender commented 9 months ago

Updated consecutively up to Eclipse 2023-12 and the crash is still there.

Mailaender commented 9 months ago

While trying to work around it, I found another way to crash more reliable:

public class MyLabelProvider extends LabelProvider implements ILabelProvider {

    @Override
    public Image getImage(Object element) {
        try {
            Thread.sleep(100);
        } catch(InterruptedException e) {
            logger.warn(e);
        }
        new Image(display, 16, 16))
    }
}
Mailaender commented 8 months ago

I have no real fix yet, and the TreeViewer API does not allow for a simple override of the setImage function. However, I found out that only the first setImage will crash, all subsequent won't. So setImage(null) will make all following calls safe.

Mailaender commented 8 months ago

I also don't understand why this crash does not happen with the Eclipse IDE. The PDE Project Explorer and Java Package Explorer as well as the Git Repositories all use trees with icons.

iloveeclipse commented 8 months ago

Just an observation: running original snippet from the ticket description I see no crash but following GTK errors on GTK 3.22 / RHEL 7.9

(SWT:67615): GLib-GObject-CRITICAL **: 14:43:43.462: g_object_notify_queue_thaw: assertion 'nqueue->freeze_count > 0' failed

(SWT:67615): GLib-CRITICAL **: 14:43:43.462: g_hash_table_foreach: assertion 'version == hash_table->version' failed

and following on GTK 3.24 / RHEL 9.2:

(SWT:1512449): GLib-GObject-CRITICAL **: 14:47:50.210: g_object_set: assertion 'G_IS_OBJECT (object)' failed

(SWT:1512449): GLib-GObject-WARNING **: 14:47:50.210: invalid unclassed pointer in cast to 'GObject'

(SWT:1512449): GLib-GObject-CRITICAL **: 14:47:50.210: g_object_thaw_notify: assertion 'G_IS_OBJECT (object)' failed

(SWT:1512449): GLib-CRITICAL **: 14:47:50.210: g_hash_table_foreach: assertion 'version == hash_table->version' failed
Mailaender commented 2 months ago

However, I found out that only the first setImage will crash, all subsequent won't. So setImage(null) will make all following calls safe.

Let's say it makes most calls safe. My workaround only reduces the occurrence of the crash.

Mailaender commented 2 months ago

I think this was introduced in GTK version jump between Ubuntu 22.04 → Ubuntu 24.04.

akurtakov commented 2 months ago

I think this was introduced in GTK version jump between Ubuntu 22.04 → Ubuntu 24.04.

You mean it doesn't crash on 22.04 but crashes on 24.04 ?

Mailaender commented 2 months ago

I updated our Eclipse RCPTT UI testing server recently and it also started randomly crashing. However, it does not preserve the stack trace and is not reproducible between runs.

Mailaender commented 2 months ago

OpenJDK from Ubuntu or Temurin from Eclipse, JVM does not seem to matter here.

Mailaender commented 2 months ago

I think the difference between Eclipse IDE and Eclipse ChemClipse is that the Data Explorer (file viewer) halts the whole application while it waits for I/O related things. It either crashes right at the beginning when folders are traversed or not at all, as in it never crashes except during startup. The Eclipse IDE has a way more responsive UI while it is still loading things.

basilevs commented 5 days ago

Bug678.txt reproduces the failure very reliably on my Ubuntu 24 instance.

The key is to repeat the test and take into account background event processing:

@Test
// https://github.com/eclipse-platform/eclipse.platform.swt/issues/678
public void eclipse() {
    Shell shell = new Shell();
    waitForInactivity();
    try {
        Image image = new Image(getDisplay(), 20, 20);
        try {
            shell.setSize(200, 200);
            shell.setLayout(new FillLayout());
            shell.open();
            for (int i = 0; i < 100; i++) {
                Tree tree = new Tree(shell, SWT.VIRTUAL);
                tree.addListener(SWT.SetData, new Listener() {
                    public void handleEvent(final Event e) {
                        TreeItem item = (TreeItem) e.item;
                        item.setText(0, "A");
                        item.setImage(image); // <-- this is the critical line!
                    }
                });
                tree.setItemCount(1);

                waitForInactivity();
                tree.dispose();
            }
        } finally {
            image.dispose();
        }
    } finally {
        shell.dispose();
    }
}

See definition for waitForInactivity() in the attached file. The failure is reproduced both with and without GDK_BACKEND=x11 environment variable. hs_err.txt

basilevs commented 5 days ago

I've prepared a reproducer that can be started on any Docker capable host (architecture and OS do not matter).

Prerequisites

Docker is installed and ready to use

Steps to reproduce

Expected result

test-1 exited with code 0

Actual result

test-1  | qemu: uncaught target signal 6 (Aborted) - core dumped
test-1  | Aborted
test-1 exited with code 134

To try another version of SWT, replace JAR files in bug_678/swt directory (file names are ignored, so it is safe to use any version)

Environments

Ubuntu 22 (Jammy) is not affected Ubuntu 24 (Noble) is affected Java versions 17, 21, 23 are affected

Workaround

e.display.asyncExec(() -> {
    item.setImage(image); 
});
basilevs commented 1 day ago

The problem goes away if the following line in TreeItem.setImage():

                parent.createRenderers(column, modelIndex, check, parent.style);

is replaced with

                getDisplay().asyncExec(() -> {
                    parent.createRenderers(column, modelIndex, check, parent.style);
                });
basilevs commented 1 day ago

@ericwill why does TreeItem.setImage() recreate renderer for a single column? Should not all columns be affected?

basilevs commented 1 day ago

@SyntevoAlex could you please suggest a way to detect GTK warnings from Junit tests?

GTK has been desperately trying to warn us about this problem, but we always ignore its printout to standard error:

My Application Name:4909): GLib-CRITICAL **: 14:43:25.864: g_hash_table_foreach: assertion 'version == hash_table->version' failed

(My Application Name:4909): GLib-GObject-WARNING **: 14:43:25.897: g_object_notify_queue_thaw: property-changed notification for GtkCellRendererText(0x7f06dded6990) is not frozen
ericwill commented 1 day ago

@ericwill why does TreeItem.setImage() recreate renderer for a single column? Should not all columns be affected?

You're asking me about code I haven't touched in ~4 years, so I might be wrong. If memory serves correctly the method in question modifies only one column at a time as per SWT API. Also the method in question involves SWT.VIRTUAL trees which are performance sensitive (lazy loaded) so we are not looking to make broad changes as it could incur a performance hit.

basilevs commented 1 day ago

You're asking me about code I haven't touched in ~4 years, so I might be wrong. If memory serves correctly the method in question modifies only one column at a time as per SWT API. Also the method in question involves SWT.VIRTUAL trees which are performance sensitive (lazy loaded) so we are not looking to make broad changes as it could incur a performance hit.

It was trying to solve cropping of images by using actual size in gtk_cell_renderer_set_fixed_size(renderer, x, y) call, however, it uses Tree-global values, so was worried about other columns, where this call does not happen and leaves renderer configuration unsynchronized with global setting, potentially leaving invalid cropping for all but one column.

This is fine, neither of us probably have time to write a test anyway 🤷