Kl4rry / simp

🖼️ Simp is a fast and simple GPU-accelerated image manipulation program.
Apache License 2.0

v3.6.1 coredumps on close #34

Closed 0323pin closed 5 months ago

0323pin commented 5 months ago

Hi,

Aiming to update the NetBSD package, I built simp-3.6.1 from source within the pkgsrc framework. The resulting binary dumps core when the application is closed. This is consistent and happens every time. The gdb backtrace doesn't look particularly useful to me but, in case it makes sense to you, here it is.

gdb simp simp.core
GNU gdb (GDB) 13.2
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64--netbsd".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from simp...
(No debugging symbols found in simp)
[New process 3260]
[New process 2419]
[New process 2418]
[New process 3452]
Core was generated by `simp'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007856c70ae407 in ?? ()
[Current thread is 1 (process 3260)]
(gdb) bt
#0  0x00007856c70ae407 in ?? ()
#1  0x00007856c875821f in ?? ()
#2  0x0000000001c703b8 in ?? ()
#3  0x0000000001c703b8 in ?? ()
#4  0x0000000000000000 in ?? ()

Additional information:

~> uname -rsv
NetBSD 10.99.10 NetBSD 10.99.10 (GENERIC) #0: Sat Jun  1 15:54:40 UTC 2024  mkrepro@mkrepro.NetBSD.org:/usr/src/sys/arch/amd64/compile/GENERIC
~> rustc --version
rustc 1.78.0 (9b00956e5 2024-04-29) (built from a source tarball)

If you don't mind a separate question, I've been having issues with Ctrl+O or File -> Open not doing anything. Launching it from the command line with "path to file" works fine, but the major issue is that the same happens when trying to save changes: File -> Save as doesn't open the Gtk dialogue, making it impossible to save changes.

This behavior was unfortunately already present in v3.5.3. I know it used to work fine prior to that; I just can't remember at which exact version it stopped working.

Thanks for any insights.

0323pin commented 5 months ago

I've just locally reverted to v3.5.2 and the File -> Open and Ctrl+O are working fine.

Conclusion:

Hope this helps pin down the causes.

Kl4rry commented 5 months ago

I can't really do anything about the coredump without a backtrace. Could you try building and running with debug symbols?

Kl4rry commented 5 months ago

The file picker issue might be related to switching from gtk to xdg-portals on Linux. But it should not really affect NetBSD, as I only switched on Linux. Does NetBSD have support for xdg-portal file pickers, or is that Linux-specific?

0323pin commented 5 months ago

The file picker issue might be related to switching from gtk to xdg-portals on Linux. But it should not really affect NetBSD, as I only switched on Linux. Does NetBSD have support for xdg-portal file pickers, or is that Linux-specific?

Maybe some conditional got messed up. Can you point me to where these are defined? As said, it works on 3.5.2 but not on 3.5.3.

As for xdg-portal see this problem

What do I need to do in order to build with debug symbols on?

Kl4rry commented 5 months ago

I think if you just build without the --release flag you get debug symbols.

Kl4rry commented 5 months ago

As for the gtk file picker issue, the only thing I can think of is that it is some upstream issue with rfd. You could try downgrading the rfd version to 0.13 in the Cargo.toml and running it with the old one.

0323pin commented 5 months ago

Downgrading rfd might be a temporary solution. I can try that, when I rebuild 3.6.1 with debug symbols. If this is the case, I can apply a patch for now and open an issue with rfd.

Let's just not close this, yet.

0323pin commented 5 months ago

@Kl4rry Built from git-HEAD with cargo build,

gdb /usr/local/bin/simp simp.core
Reading symbols from /usr/local/bin/simp...
[New process 7137]
[New process 5990]
Cannot access memory at address 0x1b
Cannot access memory at address 0x13
Cannot access memory at address 0x13
Core was generated by `simp'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x000070a293fd9407 in ?? ()
[Current thread is 1 (process 7137)]
warning: Unsupported auto-load script at offset 0 in section .debug_gdb_scripts
of file /usr/local/bin/simp.
Use `info auto-load python-scripts [REGEXP]' to list them.
(gdb) bt
#0  0x000070a293fd9407 in ?? ()
#1  0x000070a2956e421f in ?? ()
#2  0x0000000002234f78 in ?? ()
#3  0x0000000002234f78 in ?? ()
#4  0x0000000000000000 in ?? ()

0323pin commented 5 months ago

As for downgrading rfd, it doesn't seem like I'm allowed to do that,

error: failed to run custom build command for `simp v3.6.1 (/home/pin/simp)`

Caused by:
  process didn't exit successfully: `/home/pin/simp/target/release/build/simp-14689108e1328d47/build-script-build` (exit status: 101)
  --- stdout
  cargo:rustc-env=GIT_HASH=cd5d8296f9acc44113282525087381aa246652af

  cargo:rerun-if-changed=Cargo.toml

  --- stderr
  2024-06-10 21:01:48.518494225 +00:00:00 [WARN] the definition for crate/cratesio/async-signal/0.2.8 has not been harvested
  2024-06-10 21:01:48.889761682 +00:00:00 [WARN] the definition for crate/cratesio/cc/1.0.99 has not been harvested
  2024-06-10 21:01:49.287125579 +00:00:00 [WARN] the definition for crate/cratesio/clap/4.5.7 has not been harvested
  2024-06-10 21:01:49.28825536 +00:00:00 [WARN] the definition for crate/cratesio/clap_builder/4.5.7 has not been harvested
  2024-06-10 21:01:49.289340753 +00:00:00 [WARN] the definition for crate/cratesio/clap_lex/0.7.1 has not been harvested
  2024-06-10 21:01:49.674365831 +00:00:00 [WARN] the definition for crate/cratesio/enumflags2/0.7.10 has not been harvested
  2024-06-10 21:01:49.674475224 +00:00:00 [WARN] the definition for crate/cratesio/enumflags2_derive/0.7.10 has not been harvested
  2024-06-10 21:01:50.680113537 +00:00:00 [WARN] the definition for crate/cratesio/icu_normalizer/1.5.0 has not been harvested
  2024-06-10 21:01:50.680227771 +00:00:00 [WARN] the definition for crate/cratesio/icu_normalizer_data/1.5.0 has not been harvested
  2024-06-10 21:01:50.680769702 +00:00:00 [WARN] the definition for crate/cratesio/idna/1.0.0 has not been harvested
  2024-06-10 21:01:52.786627821 +00:00:00 [WARN] the definition for crate/cratesio/regex-automata/0.4.7 has not been harvested
  2024-06-10 21:01:52.786735377 +00:00:00 [WARN] the definition for crate/cratesio/regex-syntax/0.8.4 has not been harvested
  2024-06-10 21:01:52.786827197 +00:00:00 [WARN] the definition for crate/cratesio/regex/1.10.5 has not been harvested
  2024-06-10 21:01:53.698749845 +00:00:00 [WARN] the definition for crate/cratesio/toml/0.8.14 has not been harvested
  2024-06-10 21:01:53.699601155 +00:00:00 [WARN] the definition for crate/cratesio/toml_edit/0.22.14 has not been harvested
  2024-06-10 21:01:53.982607127 +00:00:00 [WARN] the definition for crate/cratesio/unicode-width/0.1.13 has not been harvested
  2024-06-10 21:01:53.983965989 +00:00:00 [WARN] the definition for crate/cratesio/url/2.5.1 has not been harvested
  2024-06-10 21:01:54.977866602 +00:00:00 [WARN] the definition for crate/cratesio/windows-result/0.1.2 has not been harvested
  2024-06-10 21:01:55.618665334 +00:00:00 [WARN] the definition for crate/cratesio/winnow/0.6.13 has not been harvested
  2024-06-10 21:01:55.898880389 +00:00:00 [WARN] the definition for crate/cratesio/xdg-home/1.2.0 has not been harvested
  2024-06-10 21:01:55.900102021 +00:00:00 [WARN] the definition for crate/cratesio/xkeysym/0.2.1 has not been harvested
  2024-06-10 21:02:03.857792422 +00:00:00 [WARN] LicenseRef-UFL-1.0 has no license file for crate 'epaint 0.27.2'
  error: failed to satisfy license requirements
     ┌─ /home/pin/.cargo/registry/src/index.crates.io-6f17d22bba15001f/icu_collections-1.5.0/Cargo.toml:32:12
     │
  32 │ license = "Unicode-3.0"
     │            -----------
[...]
  error: failed to satisfy license requirements
     ┌─ /home/pin/.cargo/registry/src/index.crates.io-6f17d22bba15001f/zerovec-derive-0.10.2/Cargo.toml:32:12
     │
  32 │ license = "Unicode-3.0"
     │            -----------

  2024-06-10 21:02:03.87405731 +00:00:00 [ERROR] encountered 19 errors resolving licenses, unable to generate output
  thread 'main' panicked at build.rs:58:9:
  assertion failed: exit_status.success()
  note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
warning: build failed, waiting for other jobs to finish...

0323pin commented 5 months ago

@Kl4rry Please feel free to add anything you might find relevant, https://github.com/PolyMeilex/rfd/issues/201

0323pin commented 5 months ago

@Kl4rry Ok. One thing at a time ...

Fixed simp-3.5.3 now, http://mail-index.netbsd.org/pkgsrc-changes/2024/06/11/msg301701.html 😃

This patch and a few crate dependency changes were enough.

$NetBSD: patch-Cargo.toml,v 1.1 2024/06/11 08:36:49 pin Exp $

RFD crate changed the default features from Gtk to xdg-desktop-portal.
This breaks functionality on NetBSD. Restore the Gtk feature for now.

--- Cargo.toml.orig 2024-06-11 06:29:01.483993198 +0000
+++ Cargo.toml
@@ -48,7 +48,7 @@ wgpu = { version = "0.19.1", features = 
 winit = { version = "0.29.10", features = ["rwh_05"] }

 [target.'cfg(not(linux))'.dependencies]
-rfd = "0.14.0"
+rfd = { version = "0.14.0", default-features = false, features = ["gtk3"] }

 [target.'cfg(linux)'.dependencies]
 rfd = { version = "0.14.0", default-features = false, features = ["xdg-portal"] }

Now, let's see if applying this to 3.6.1 will help ..

Btw, what's the MSRV for 3.6.1? I see 1.78.0 for the CI but, what's actually needed to compile?

Kl4rry commented 5 months ago

The only official MSRV is the latest stable at the time a release was made.

Kl4rry commented 5 months ago

I have added the patch in 4903e2b

Kl4rry commented 5 months ago

I exposed features for controlling the rfd backend. If you set default-features = false and enable the gtk3 feature, the file picker should work again.

0323pin commented 5 months ago

@Kl4rry Thanks. This should do it for now. I was about to try zenity, as suggested on the rfd issue, and just build with the xdg-portal backend. But, unfortunately, my file system is corrupted after a system freeze 😞

It will take me quite some time to get my system back to a working state, given that I build everything from source. Sorry!

After this, I need to check if the coredumps are still there.

0323pin commented 5 months ago

@Kl4rry My system is up-and-running again.

Just built simp from git-HEAD and the Gtk file picker is working fine. Thanks! But, the coredumps are still happening on closing the program.

~> gdb /usr/local/bin/simp simp.core
Reading symbols from /usr/local/bin/simp...
[New process 15907]
[New process 15226]
[New process 12136]
[New process 17287]
[New process 14777]
[New process 18815]
[New process 17466]
[New process 7740]
Core was generated by `simp'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007cdde8428407 in ?? ()
[Current thread is 1 (process 15907)]
warning: Unsupported auto-load script at offset 0 in section .debug_gdb_scripts
of file /usr/local/bin/simp.
Use `info auto-load python-scripts [REGEXP]' to list them.
(gdb) bt
#0  0x00007cdde8428407 in ?? ()
#1  0x00007cddea79141f in ?? ()
#2  0x00000000023d0c18 in ?? ()
#3  0x00000000023d0c18 in ?? ()
#4  0x0000000000000000 in ?? ()

Kl4rry commented 5 months ago

I don't really know what to do about this segfault as it does not happen on any other platform as far as I know. My only guess is that it does not originate in Rust code as it should have a proper backtrace then.

Kl4rry commented 5 months ago

Are you running on x11 or wayland?

0323pin commented 5 months ago

Are you running on x11 or wayland?

x11

Kl4rry commented 5 months ago

Is it possible to test it on wayland on NetBSD?

0323pin commented 5 months ago

Nope, not really but, I can test on x11 on Linux, Void musl (non-GNU).

Wayland on NetBSD has no wlroots, it's swc based and only works with US keyboard and the velox window manager.

Kl4rry commented 5 months ago

I think the only real option is to do a git bisect to figure out which commit added the bug.

0323pin commented 5 months ago

I don't really know what to do about this segfault as it does not happen on any other platform as far as I know.

Tested on Void musl and I don't get any coredump. But then I remembered that, I think, Linux has coredumps disabled by default these days. So it could be that we are just not seeing them.

I think the only real option is to do a git bisect to figure out which commit added the bug.

Guess you mean building a binary from every commit between v3.5.3 and v3.6.0. This is doable but will take some time to go through.

Kl4rry commented 5 months ago

Bisect is a bit smarter than building every commit. You do a binary search from a known good commit to a known bad commit. It requires far fewer rebuilds.
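
To illustrate why the search needs so few rebuilds, here is a minimal sketch of the idea in Rust. It is not how git implements bisect; the function name `first_bad`, the 64-commit history, and the regression at index 41 are all made up for the example. It assumes the first commit is good, the last is bad, and every commit after the regression is bad too.

```rust
/// Binary search for the first bad commit.
/// Assumes commits[0] is known good, the last commit is known bad,
/// and that once a commit is bad all later commits are bad as well.
/// Returns (index of first bad commit, number of builds tested).
fn first_bad<T>(commits: &[T], mut is_bad: impl FnMut(&T) -> bool) -> (usize, usize) {
    assert!(commits.len() >= 2, "need at least one good and one bad commit");
    let mut good = 0; // index of a known-good commit
    let mut bad = commits.len() - 1; // index of a known-bad commit
    let mut builds = 0;
    while bad - good > 1 {
        let mid = good + (bad - good) / 2;
        builds += 1; // each probe stands for one rebuild plus one manual test
        if is_bad(&commits[mid]) {
            bad = mid;
        } else {
            good = mid;
        }
    }
    (bad, builds)
}

fn main() {
    // Hypothetical history: 64 commits, regression introduced at index 41.
    let commits: Vec<u32> = (0..64).collect();
    let (culprit, builds) = first_bad(&commits, |&c| c >= 41);
    println!("first bad commit: {culprit}, builds tested: {builds}");
}
```

With 64 commits this tests roughly log2(64) = 6 builds instead of up to 62, which is why bisecting is practical even when each build is slow.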

0323pin commented 5 months ago

@Kl4rry Sorry for the slight delay. I've done git bisect and this is the result:

[Screenshot of the git bisect result: 2024-06-19-152205_1366x768_scrot]

Reverting d5413331b3ea520c67dcd9cccc4acd39a0063069 on the current git-HEAD yields a compiled binary that does not core dump.

Kl4rry commented 5 months ago

Interesting that exiting the event loop causes a segfault.

0323pin commented 5 months ago

Thank you!