hoehermann / purple-presage

Pidgin/libpurple plug-in for Signal messenger using presage.
GNU General Public License v3.0

Won't load on Windows #3

Open EionRobb opened 1 month ago

EionRobb commented 1 month ago

When trying to load the plugin on Windows, there's some kind of memory corruption issue. Either the debug window will show

(07:53:30) plugins: probing C:\Program Files (x86)\Pidgin\plugins\libpresage.dll
(07:53:30) plugins: C:\Program Files (x86)\Pidgin\plugins\libpresage.dll is not loadable: 'C:\Program Files (x86)\Pidgin\plugins\libpresage.dll': Invalid access to memory location.

or when loading with gdb open, will just segfault:

(07:56:27) plugins: probing C:\Program Files (x86)\Pidgin\plugins\libpresage.dll

Program received signal SIGSEGV, Segmentation fault.
0x770b90f1 in ntdll!LdrProcessRelocationBlockEx () from C:\WINDOWS\SYSTEM32\ntdll.dll
(gdb) bt
#0  0x770b90f1 in ntdll!LdrProcessRelocationBlockEx () from C:\WINDOWS\SYSTEM32\ntdll.dll
#1  0x770a8011 in ntdll!LdrResolveDelayLoadsFromDll () from C:\WINDOWS\SYSTEM32\ntdll.dll
#2  0x7706130a in ntdll!RtlSetControlSecurityDescriptor () from C:\WINDOWS\SYSTEM32\ntdll.dll
#3  0x7703175c in ntdll!EtwEventUnregister () from C:\WINDOWS\SYSTEM32\ntdll.dll
#4  0x770313a1 in ntdll!EtwEventUnregister () from C:\WINDOWS\SYSTEM32\ntdll.dll
#5  0x7702e1a3 in ntdll!RtlAppendUnicodeStringToString () from C:\WINDOWS\SYSTEM32\ntdll.dll
#6  0x770261a8 in ntdll!TpCallbackMayRunLong () from C:\WINDOWS\SYSTEM32\ntdll.dll
#7  0x77037afd in ntdll!LdrLoadDll () from C:\WINDOWS\SYSTEM32\ntdll.dll
#8  0x770332e9 in ntdll!RtlInterlockedFlushSList () from C:\WINDOWS\SYSTEM32\ntdll.dll
#9  0x7707beb1 in ntdll!LdrHotPatchNotify () from C:\WINDOWS\SYSTEM32\ntdll.dll
#10 0x770379fa in ntdll!LdrLoadDll () from C:\WINDOWS\SYSTEM32\ntdll.dll
#11 0x74e18ef3 in LoadLibraryExW () from C:\WINDOWS\System32\KernelBase.dll
#12 0x74e3e8d1 in LoadLibraryW () from C:\WINDOWS\System32\KernelBase.dll
#13 0x66e41d91 in g_module_open () from C:\Program Files (x86)\Pidgin\Gtk\bin\libgmodule-2.0-0.dll
#14 0x61f58c77 in purple_plugin_probe (
    filename=filename@entry=0x33d97c8 "C:\\Program Files (x86)\\Pidgin\\plugins\\libpresage.dll") at plugin.c:257
#15 0x61f59463 in purple_plugins_probe (ext=ext@entry=0x61f9be98 <__PRETTY_FUNCTION__.46689+152> "dll")
    at plugin.c:1385
#16 0x61f4575d in purple_core_init (ui=ui@entry=0x62996c39 <__PRETTY_FUNCTION__.80152+1265> "gtk-gaim") at core.c:149
#17 0x6293c395 in pidgin_main (hint=0xc90000, argc=6, argv=0x15be470) at gtkmain.c:826
#18 0x00c928b1 in ?? ()
#19 0x015be4e8 in ?? ()
#20 0x72676f72 in ?? ()
#21 0x46206d61 in ?? ()
#22 0x73656c69 in ?? ()
#23 0x38782820 in ?? ()
#24 0x505c2936 in ?? ()
#25 0x69676469 in ?? ()
#26 0x69705c6e in ?? ()
Backtrace stopped: Cannot access memory at address 0x505c3a47

Not quite sure how to debug this one, as I've never seen this error before

hoehermann commented 1 month ago

Thank you for the report. I did not check the Windows build lately. I just assumed "if it builds, it works". So much for that… Since the problem manifests during the probing and not the actual execution, I suspect issues somewhere in the build process (compilation, linking,…). I hope I can look into it next week.

hoehermann commented 1 month ago

Unfortunately, I am unable to reproduce this behaviour. I tested the nightly offered by github and a local build in Pidgin 2.14.12 and 2.14.13 on my private Windows 10 installation and my place of work. It loaded just fine.

EionRobb commented 1 month ago

So very odd.

I downloaded a fresh copy from https://nightly.link/hoehermann/purple-presage/workflows/build/master/libpresage.dll.zip and ran it under gdb, which gave the same backtrace. This is on Windows 11 22631.3958, with libpurple 2.14.13.

I tried to compare it with other protocol plugins, but the only thing that jumped out was that the dependency scanner shows libpurple functions being looked up by ordinal instead of by name: [image]

EionRobb commented 1 month ago

Since you mentioned that you've tested with the same version of libpurple, but different versions of Windows, I went through the system dll imports and saw that ws2_32 had a lot of ordinals imported too: [image]

I don't know much about ordinals vs named imports, but could it potentially be that it's trying to import a function that's moved to a different ordinal in a different version of the dll?
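
To make the ordinal-vs-name distinction concrete, here is a minimal sketch of how one entry of a PE import lookup table can be classified. The flag values come from the PE format; the function name `describe_import_entry` and the two sample entries are hypothetical and are not taken from libpresage.dll. It illustrates the mechanism only and says nothing about whether ordinals are the cause here.

```python
# Ordinal flag: the top bit of an import lookup table entry (PE format).
ORDINAL_FLAG32 = 0x80000000          # IMAGE_ORDINAL_FLAG32, 32-bit images
ORDINAL_FLAG64 = 0x8000000000000000  # IMAGE_ORDINAL_FLAG64, 64-bit images


def describe_import_entry(entry: int, is_pe32_plus: bool = False) -> str:
    """Classify one import lookup table entry as ordinal or by-name.

    Hypothetical helper for illustration; `entry` is the raw integer
    read from the import lookup table.
    """
    flag = ORDINAL_FLAG64 if is_pe32_plus else ORDINAL_FLAG32
    if entry & flag:
        # Import by ordinal: the low 16 bits hold the ordinal number.
        return f"ordinal #{entry & 0xFFFF}"
    # Import by name: the low 31 bits are the RVA of a hint/name record.
    return f"by name, hint/name table RVA 0x{entry & 0x7FFFFFFF:x}"


if __name__ == "__main__":
    # Made-up sample entries, not read from any real DLL.
    print(describe_import_entry(0x80000073))
    print(describe_import_entry(0x0001A2B4))
```

An ordinal import is resolved purely by number, so if the exporting DLL assigned that number to a different function in another build, the import would silently bind to the wrong function. A by-name import does not have that dependency.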

hoehermann commented 1 month ago

Today, I learned that ordinal imports exist. Thank you. :slightly_smiling_face: Curiously, WS2_32.dll seems to be a popular example for ordinal imports (see this article). So I guess that is okay and just the way it is.

As for using ordinals when linking libpurple.dll, that is probably on me and the way I generate the .lib file here. I should look into this technique.

Unfortunately, I am still not able to reproduce the issue. I just tried the current nightly on Windows 11 23H2 Build 22631.2506 (I currently do not have anything more recent available). Works for me there, too. :slightly_frowning_face:

EionRobb commented 1 month ago

After trying to read about the problem some more, I've seen Stack Overflow posts talking about the order of library loading in C++ code being a potential trigger (searching for "LoadLibrary error 998" seemed to bring up a few more results). With that in mind, I tried renaming the dll to aalibpresage.dll and it loads. I then binary searched with file renames... so this works

(08:33:55) plugins: probing C:\Program Files (x86)\Pidgin\plugins\iconaway.dll
(08:33:55) plugins: probing C:\Program Files (x86)\Pidgin\plugins\iconaway_libpresage.dll
(08:33:56) plugins: probing C:\Program Files (x86)\Pidgin\plugins\icon_override.dll

but this does not

(08:36:12) plugins: probing C:\Program Files (x86)\Pidgin\plugins\iconaway.dll
(08:36:12) plugins: probing C:\Program Files (x86)\Pidgin\plugins\icon_override.dll
(08:36:12) plugins: probing C:\Program Files (x86)\Pidgin\plugins\icon_override_libpresage.dll

Program received signal SIGSEGV, Segmentation fault.

Disabling icon_override.dll (from https://github.com/EionRobb/pidgin-icon-override/releases/download/0.1/icon_override.dll ) allows the library to load, but it then crashes again once a further rename back to libpresage.dll is tried. Another binary search found that libnetnexus.dll also caused it to crash.

By this stage I was a bit nervous that adding new plugins in the future could suddenly break the loading of this plugin, so renamed it zzlibpresage.dll and went through the list of all dlls that caused it to crash if they were loaded before it:

I kinda gave up at this point as it was seeming like more libs than not were starting to conflict - my theory was that it was trying to load the dll into a memory address that was being used by another plugin? But that's a wild guess and I honestly have no idea.
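
For context on that theory: the top frame of the backtrace is `LdrProcessRelocationBlockEx`, which suggests the crash happened while the loader was applying base relocations, something it only needs to do when a DLL ends up at an address other than its preferred base. The following is a simplified sketch of what processing one relocation block means for a 32-bit image. The function `apply_relocation_block`, the addresses, and the tiny fake image are all hypothetical; this is not the Windows loader's code and not a diagnosis of this issue.

```python
import struct

IMAGE_REL_BASED_ABSOLUTE = 0  # padding entry, skipped
IMAGE_REL_BASED_HIGHLOW = 3   # add the full 32-bit delta to the target


def apply_relocation_block(image: bytearray, block: bytes, delta: int) -> None:
    """Apply one base relocation block to an in-memory image (32-bit only).

    Block layout per the PE format: page RVA (4 bytes), block size
    (4 bytes), then 2-byte entries with the type in the top 4 bits and
    the offset within the page in the low 12 bits.
    """
    page_rva, block_size = struct.unpack_from("<II", block, 0)
    for pos in range(8, block_size, 2):
        (entry,) = struct.unpack_from("<H", block, pos)
        reloc_type, offset = entry >> 12, entry & 0x0FFF
        if reloc_type == IMAGE_REL_BASED_ABSOLUTE:
            continue
        if reloc_type != IMAGE_REL_BASED_HIGHLOW:
            raise ValueError(f"unsupported relocation type {reloc_type}")
        target = page_rva + offset
        (value,) = struct.unpack_from("<I", image, target)
        struct.pack_into("<I", image, target, (value + delta) & 0xFFFFFFFF)


def demo() -> int:
    # Assumed addresses for illustration only.
    preferred_base, actual_base = 0x10000000, 0x20000000
    image = bytearray(0x20)
    # An absolute pointer stored at offset 0x10, valid for the preferred base.
    struct.pack_into("<I", image, 0x10, preferred_base + 0x1234)
    # One block: page RVA 0, size 12, one HIGHLOW entry plus one padding entry.
    block = struct.pack("<IIHH", 0x0, 12, (IMAGE_REL_BASED_HIGHLOW << 12) | 0x10, 0)
    apply_relocation_block(image, block, actual_base - preferred_base)
    return struct.unpack_from("<I", image, 0x10)[0]


if __name__ == "__main__":
    print(hex(demo()))
```

If relocation is indeed involved, it would fit the load-order dependence described above, since which addresses are already occupied depends on what was loaded earlier. That remains a guess; the sketch only shows the mechanism.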

So as a workaround, naming it aa_libpresage.dll gets it to load. Just makes me a bit nervous :)

hoehermann commented 1 month ago

Thank you for your wonderful research. The findings are interesting and horrific. 😱

I spent a couple of hours installing Windows 11 in a virtual machine to have a pristine environment. Alas, I did not manage to replicate the behaviour described.

I did, however, notice some linker warnings. I updated the build instructions for more static linking so the Visual C++ Runtime dll is no longer needed. Unfortunately, this seems to work well only for Release configurations. Maybe this solves this issue, too: libpresage.dll.zip

I am glad you found a workaround. Do you have feedback regarding the actual user experience?