openscopeproject / TrguiNG

Remote GUI for Transmission torrent daemon
GNU Affero General Public License v3.0

build error #105

Closed username227 closed 8 months ago

username227 commented 9 months ago

Hi,

I'm not able to build this anymore. When I run "npm run build" it gives me an error compiling the appimage. When I run it with the verbose option, then I find that the problem is the following: ERROR: Could not find dependency: libicui18n.so.72

Research into this library shows that it is part of the icu package. The package has already been upgraded to 73, which may be what's causing the error. Unfortunately, I cannot simply downgrade the package because that would break lots of dependencies. Is anybody else getting this problem? Thanks.

username227 commented 9 months ago

Update: Even the AppImage won't run. It gives me the following output:

Gtk-Message: 23:09:07.752: Failed to load module "xapp-gtk3-module"

(trgui-ng:83998): Gtk-WARNING **: 23:09:07.766: Failed to parse /home/jerry/.config/gtk-3.0/settings.ini: Key file does not have group “Settings”

Gtk-Message: 23:09:08.040: Failed to load module "xapp-gtk3-module"

(WebKitWebProcess:84027): Gtk-WARNING **: 23:09:08.045: Failed to parse /home/jerry/.config/gtk-3.0/settings.ini: Key file does not have group “Settings”

I wonder if this is somehow related to the webkit update?

qu1ck commented 9 months ago

What is your OS? I recall it's Arch but you should specify.

See if updating your nodejs fixes the libicui18n problem. The AppImage is not likely to work on systems other than Ubuntu. But the error about the xapp module may be fixed by reinstalling gir1.2-xapp or xapp-gtk or whatever the relevant package is called on Arch.

username227 commented 9 months ago

Yes, it is Arch. Sorry. OK, so I got it to build properly after some trial and error and some hacky symlinks that I found on another repository. But it still won't open. The errors are similar but not quite the same. The xapp warning seems to have gone away. But I'm getting:

(trgui-ng:105997): Gtk-WARNING **: 23:55:55.888: Failed to parse /home/jerry/.config/gtk-3.0/settings.ini: Key file does not have group “Settings”

(WebKitWebProcess:106010): Gtk-WARNING **: 23:55:55.960: Failed to parse /home/jerry/.config/gtk-3.0/settings.ini: Key file does not have group “Settings”

[1] 105997 segmentation fault (core dumped) ./trgui-ng

I don't quite understand about the appimage. I thought those were supposed to be designed to be universally workable across all linux systems.

username227 commented 9 months ago

OK, so the core dump doesn't seem to be related to the GTK warnings. If I get rid of those config files completely, the GTK warnings go away and the core is still dumped. Therefore, I'm left with the following: a successful build that crashes with a "segmentation fault (core dumped)".

qu1ck commented 9 months ago

If you had to make symlinks you don't have a successful build, you are just tricking the system into thinking it has built and then you find out at runtime that it doesn't work and get a crash.

To debug it you can make a debug build and launch that. Either you get a Rust panic, which is more informative and tells you what went wrong, or you get the same silent segfault. The second means that the crash happens in C code and you'll have to run it under gdb to figure out where it crashes.
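
As a rough aid for telling these two outcomes apart, the exit status can be checked from the shell. This is a minimal sketch and not something from the thread: `classify_exit` is a hypothetical helper name, and it assumes the usual conventions that a Rust panic in the main thread exits with status 101 and that the shell reports death by signal N as status 128 + N.

```shell
# classify_exit is a hypothetical helper: run a command and describe how it ended.
# Usage: classify_exit <program> [args...]
classify_exit() {
    status=0
    "$@" || status=$?
    if [ "$status" -eq 101 ]; then
        # Default exit status of a Rust program whose main thread panicked.
        echo "exit 101: likely a Rust panic"
    elif [ "$status" -gt 128 ]; then
        # Shell convention: 128 + signal number; kill -l maps the number to a name.
        echo "killed by signal $(kill -l "$((status - 128))")"
    else
        echo "exit $status"
    fi
}
```

For the silent segfault described here, `classify_exit src-tauri/target/debug/trgui-ng` would be expected to report `killed by signal SEGV`, which points at native code rather than a Rust panic.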

I don't quite understand about the appimage. I thought those were supposed to be designed to be universally workable across all linux systems.

In theory yes, in practice it only works well if the app has no requirements for new libs outside of what appimage decides to put in the image.

See https://github.com/openscopeproject/TrguiNG/issues/91#issuecomment-1756718538

qu1ck commented 9 months ago

One more thing, if you don't need the appimage and can run native build, you can skip building appimage by running npm run build -- -b

username227 commented 9 months ago

How do I make a debug build?

qu1ck commented 9 months ago

npm run webpack-serve
npm run tauri-dev

Run these in parallel. But before that you need to undo whatever symlinks you did.

username227 commented 9 months ago

OK, I'm not 100% sure I used gdb correctly, but I got this message:

Reading symbols from ./trgui-ng...
warning: Missing auto-load script at offset 0 in section .debug_gdb_scripts
of file /home/jerry/TrguiNG/src-tauri/target/debug/trgui-ng.
Use `info auto-load python-scripts [REGEXP]' to list them.

Does this mean anything to you?

qu1ck commented 9 months ago

You did not run the program and didn't trigger the crash, so that output is not useful. Google gdb commands (run and backtrace should be all you need to know). Before trying gdb, just run the debug build normally, trigger the crash and check the terminal output. Rust has decent built-in crash messaging.
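
The two gdb commands mentioned above can also be run non-interactively. The sketch below wraps them in a hypothetical helper, `crash_backtrace`; the `GDB` variable is an assumption added here so the debugger binary can be overridden. `-batch`, `-ex` and `--args` are standard gdb options.

```shell
# crash_backtrace is a hypothetical helper, not part of TrguiNG.
# Usage: crash_backtrace <program> [program args...]
crash_backtrace() {
    # -batch: exit once all commands have run.
    # -ex run: start the program and wait until it stops (here: on the crash).
    # -ex backtrace: print the stack of the thread that received the signal.
    # -ex 'thread apply all backtrace': print every other thread's stack too.
    "${GDB:-gdb}" -batch \
        -ex run \
        -ex backtrace \
        -ex 'thread apply all backtrace' \
        --args "$@"
}
```

Something like `crash_backtrace src-tauri/target/debug/trgui-ng 2>&1 | tee gdb-session.txt` keeps the whole session in one file that can be attached to the issue; `gdb-session.txt` is just an example name.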

username227 commented 9 months ago

OK, so the debug build just gave me the same output as the regular build (and since I got rid of the symlinks, I can't build the regular build any more).

I think I wasn't using gdb correctly, though. I got a few things:

Thread 1 "trgui-ng" received signal SIGSEGV, Segmentation fault.
0x00007ffff0361db4 in pthread_mutex_lock () from /usr/lib/libc.so.6

With auto-downloading debuginfo, I got the following:

Thread 1 "trgui-ng" received signal SIGSEGV, Segmentation fault.
___pthread_mutex_lock (mutex=0x0) at pthread_mutex_lock.c:80
Downloading source file /usr/src/debug/glibc/glibc/nptl/pthread_mutex_lock.c
80 unsigned int type = PTHREAD_MUTEX_TYPE_ELISION (mutex);

qu1ck commented 9 months ago

After the crash you should switch to that thread and run backtrace command in gdb, show me the output.

But I think you are chasing wrong thing here.

  1. Did you remove your symlinks?
  2. If you are able to build debug build then build is not your issue. Did you try npm run build -- -b to get release build without appimage?

username227 commented 9 months ago

this is the output of the backtrace command:

___pthread_mutex_lock (mutex=0x0) at pthread_mutex_lock.c:80
80 unsigned int type = PTHREAD_MUTEX_TYPE_ELISION (mutex);
(gdb) backtrace
#0 ___pthread_mutex_lock (mutex=0x0) at pthread_mutex_lock.c:80
#1 0x00007fffe2fc66cc in _dbus_platform_cmutex_lock (mutex=)
    at /usr/src/debug/dbus/dbus/dbus/dbus-sysdeps-pthread.c:153
#2 _dbus_lock (lock=_DBUS_LOCK_bus)
    at /usr/src/debug/dbus/dbus/dbus/dbus-threads.c:348
#3 internal_bus_get (type=DBUS_BUS_SYSTEM, private=0, error=0x7ffffffe9250)
    at /usr/src/debug/dbus/dbus/dbus/dbus-bus.c:431
#4 0x00007fff769a6cdf in () at /usr/lib/libnvidia-eglcore.so.535.113.01
#5 0x00007fff769a6f91 in () at /usr/lib/libnvidia-eglcore.so.535.113.01

You're right regarding the build. Without the appimage building, it does build.

username227 commented 9 months ago

I take it back. There's more. A lot more. I'll attach a text file.

username227 commented 9 months ago

backtrace.txt

qu1ck commented 9 months ago

Thanks, this helps. Unfortunately that backtrace shows that crash is coming from nvidia driver :(

You said there was webkit update? Try downgrading that to previous version and check if that helps. Also try running other webkit dependent apps and see if they work.

If you send this backtrace to webkitgtk folks on their bugtracker they may be able to help you or offer another workaround like with that env variable from the blank screen issue.

qu1ck commented 9 months ago

Actually more details show that this may be related to the same DMA buffer issue as before. Did you try running with and without the WEBKIT_DISABLE_DMABUF_RENDERER=1 var? Does it crash in both cases? Is the backtrace different?

username227 commented 9 months ago

Yes, it crashes in both cases, but there does seem to be less on the backtrace. I'm attaching it here. backtrace2.txt

qu1ck commented 9 months ago

The second backtrace is from the wrong thread I think, not the one that crashed. Did you forget to switch?

username227 commented 9 months ago

backtrace2.txt

What about this? I'm not sure what I did differently...

qu1ck commented 9 months ago

It's the same. Can you show your whole interaction with gdb, not just backtrace result? When crash happens it tells you which thread crashed, I'm guessing you are running backtrace on wrong one but can't tell without full output.

username227 commented 9 months ago

gdb trace.txt

qu1ck commented 9 months ago

Ah, you are running it on release build, of course it will look different. You should do it on debug build.

username227 commented 9 months ago

what about this?

gdb trace.txt

qu1ck commented 9 months ago

This one is more interesting. I don't see anything related to webkitgtk or the nvidia driver here, so this may be an actual crash from the app or one of its libs.

What happens if you run this in terminal (not in gdb):

RUST_BACKTRACE=1 /home/jerry/TrguiNG/src-tauri/target/debug/trgui-ng

username227 commented 9 months ago

Nothing, really:

[1] 66029 segmentation fault (core dumped) RUST_BACKTRACE=1 /home/jerry/TrguiNG/src-tauri/target/debug/trgui-ng

username227 commented 9 months ago

I have a backup from 10/26/23, when I believe the app was still working. In theory, I could restore this backup if necessary to track down whether an update to one of the dependencies is causing the crash. However, as this would be a system-wide restore, it might be more helpful if you could narrow it down to a particular lib, and I can then downgrade specific libs by trial and error.

qu1ck commented 9 months ago

That means the crash is coming from c++, not rust, so the app code is not the issue. It may be one of the app libs or something in system libs.

Don't restore system backup.

First thing to try to eliminate the app issue is revert back to known working release (likely v0.9.0) and try to build that.

git checkout v0.9.0
npm run build

If that does not work then you know it's something in your system and not the app. If it works then move forward commit by commit to try to find one that breaks the app. I suspect that if it is the app code then it will likely be 0f34896aa7090c0bb9b47d827574c29fc9cf1420 because it brought in new dependency but would like to confirm.

username227 commented 9 months ago

OK, yes, it did work. How do I move forward by each commit?

username227 commented 9 months ago

also, what's the new dependency it requires? Since I manually built it, there's no guarantee the dependency is on my system.

qu1ck commented 9 months ago

OK, yes, it did work.

By work you mean the app works, not just builds, right?

How do I move forward by each commit?

Look at the commit log with git log --oneline --decorate, and for each commit after v0.9.0 run git checkout <hash>, then build and test the app as usual. For example, the command for the first commit after v0.9.0 would be git checkout c37e831

also, what's the new dependency it requires? Since I manually built it, there's no guarantee the dependency is on my system.

libfontconfig. You would not be able to build if you didn't have it.

username227 commented 9 months ago

yes, it worked perfectly on 0.9.0. I'll let you know when I find the offending commit. :-)

username227 commented 9 months ago

OK, so the commit that broke it was 71a3859 Add "Open folder" item in torrent/file row menu.

Fascinating.

qu1ck commented 9 months ago

Ugh, yeah, that one also added dependency on libdbus. What is your version of that lib?

username227 commented 9 months ago

libdbus is part of the dbus package, according to the arch website: https://archlinux.org/packages/core/x86_64/dbus/

The package is version 1.14.10-1.

qu1ck commented 9 months ago

Yeah, same version on debian 12 and it works just fine here. I'll report this to the lib I use which relies on dbus, maybe they have some clue why this may crash.

username227 commented 8 months ago

Hmm... no response yet from them. Do you think this is something in the program code itself, or something in one of the program's dependencies (or their runtime libraries) that is missing or set up differently?

qu1ck commented 8 months ago

This is not an issue in my code because

  1. It works on other linux machines
  2. It crashes for you on startup, not when app invokes related code

So this is either a bug in the lib where I opened the issue, a bug in the Rust libdbus wrapper that the lib uses to make dbus calls, or something is screwed up on your system.

qu1ck commented 8 months ago

If they don't respond for another few days then I can make a patch for you to disable the functionality relying on dbus. But you will have to maintain (rebase) it yourself going forward.

username227 commented 8 months ago

I believe you are correct. I just tried to reproduce it on a backup install of Manjaro, which I rarely use but which is based on Arch. I was unable to reproduce the bug there. The most significant difference between the two installs was the kernel, so I tried Arch's LTS kernel and still got the error. There must be something messed up about dbus, but I have no idea how to troubleshoot it.

I am posting the backtrace to the Arch people on reddit to see if they have any ideas. Thanks so much for your help, BTW.

username227 commented 8 months ago

OK so I ran the program with dbus debug enabled and this is the output that I got:

#signal time=1698956492.965024 sender=:1.133 -> destination=(null destination) serial=1335 path=/org/freedesktop/systemd1; interface=org.freedesktop.systemd1.Manager; member=UnitRemoved
   string "systemd-coredump@3-7250-0.service"
   object path "/org/freedesktop/systemd1/unit/systemd_2dcoredump_403_2d7250_2d0_2eservice"

Any idea what this might mean?

qu1ck commented 8 months ago

Probably something messaging some other thing about a crashed program. Not really helpful to understand why it crashed.

username227 commented 8 months ago

OK, so I accidentally crashed my system while troubleshooting this further. Typically, that only happens when I do something really stupid, and this was no exception. After reinstalling Arch, master now works just fine. Therefore, I assume that something was wrong with my previous installation. As I continue to personalize it to my specifications, I'll keep an eye on the program to see if the crash comes back, and if so, perhaps I'll figure out what's going on.

Meanwhile, though, I'll close the issue. (BTW, I appreciate the offer to fix it for me; most devs wouldn't even consider something like that.)