Closed Neustradamus closed 5 years ago
It is linked to libotr/libgcrypt, it is possible to have help? @tehnick has tried to debug it without success.
@jeroen, @jkivilin, @dd9jn, @dkg, @dtzWill, @mstorsjo, @smuellerDD @arlolra, @dgoulet @claucece @DrWhax @juniorz
Thanks in advance
Mmm.. interesting. I don't use Windows.. but if I find a way to reproduce this.. I'll let you know :)
I don't use Windows.. but if I find a way to reproduce this..
Personally I prefer to debug MS Windows related bugs in WINE first. And this bug is 100% reproducible in 64-bit version of WINE.
Some additional info:
posix
threads and seh
exception handling. In this case this bug is 100% reproducible.sjlj
exception handling and have faced with completely another bug: generation of new key never ends. But in this case I am not sure that problem is not in our plugin. (We use QtConcurrentRun
for calling of related functions from libotr
.)I'm not familiar with Psi, is otr plugin packaged with the psi bundle(https://psi-im.org/download/)?
Are you using mingw-w64?
is otr plugin packaged with the psi bundle
Yes.
The most recent builds of Psi+ are here. From time to time I make Psi IM builds from master branch and upload them here. Difference between Psi+ and Psi IM and our development model are documented here.
Are you using mingw-w64?
Yes. I use MXE + Sibuserv projects for cross-compilation for MS Windows. Build scripts are here.
But I do not think that there is need in debugging of full Psi+ program + OTR plugin: we may prepare more simple example for reproducing of the bug based on small piece of code from plugin.
BTW, our builds using MSVC had similar bug as far as I remember. But they were prepared and tested by other people, so I am not sure.
@tehnick Thanks for this! I'll check it out tomorrow.
BTW, there is a weird ASLR bug (with a workaround its fixable!) with mingw-w64 that renders ASLR.. useless: https://insights.sei.cmu.edu/cert/2018/08/when-aslr-is-not-really-aslr---the-case-of-incorrect-assumptions-and-bad-defaults.html
But that might need a seperate ticket..
@DrWhax Thanks for info. I have added workaround proposed in this article. This will be enough I hope.
@DrWhax: Have you checked? Have you detected the problem? What is your point of view?
Hey @tehnick !
I saw that you have a workaround.. is that enough? It seems like this is mostly a libgrcrypt problem. If you try to install that library, does it succeed?
Thanks!
I saw that you have a workaround.. is that enough?
What workaround are you talking about?
It seems like this is mostly a libgrcrypt problem.
We not sure if this a libotr or libgrcrypt problem, because libotr may use libgrcrypt in a wrong way.
If you try to install that library, does it succeed?
Sorry, I didn't understand this. Could you explain what do you mead in more details?
I saw that you have a workaround.. is that enough?
If you are talking about https://github.com/psi-im/plugins/issues/45#issuecomment-455918370 that was completely another issue, not related to OTR. Sorry for off-topic comments here.
Hey,
What workaround are you talking about?
This one: "I have added workaround proposed in this article. This will be enough I hope."
We not sure if this a libotr or libgrcrypt problem, because libotr may use libgrcrypt in a wrong way.
Ok, let me clarify, there is still a crash when generating new keys or starting a conversation, right? In windows64, right?
If you are talking about #45 (comment) that was completely another issue, not related to OTR. Sorry for off-topic comments here.
No problem.
If you have some debug logs of when the crash happens, paste them here. It can be a problem within threads of the UI and the generation of the keys.
Thanks!
Ok, so I have looked at the code. For me, this can be a problem with threads. I see that on the main thread 'otrl_privkey_generate_start' and 'otrl_privkey_generate_finish' are called. In the back, 'otrl_privkey_generate_calculate' is called. Or at least that it how it should behave, according to the documentation in libotr. if this is related to libgcrypt, the place where it can fail is: https://github.com/psi-im/plugins/blob/master/generic/otrplugin/src/otrinternal.cpp#L851, as 'otrl_privkey_generate_calculate' is the only one that calles libgcrypt funcs. What can be tried is:
@tehnick if you can send me the log of when it crashes, it will be super useful.
Thanks!
Ok, let me clarify, there is still a crash when generating new keys or starting a conversation, right?
When interlocutor tries to start OTR session with you and you do not have a key for current account special dialog window will be opened with suggestion to generate a key. So the exactly the same code is called in both cases.
In windows64, right?
Yes, the problem is reproducible only in 64-bit MS Windows executables. And possibly only in MinGW builds.
If you have some debug logs of when the crash happens, paste them here. It can be a problem within threads of the UI and the generation of the keys.
Unfortunately currently I have debug build of program with debug symbols only for Psi+ and its plugins, but not for libraries which it uses. So the backtrace is not enough imformative. You may look on it here.
Unfortunately currently I have debug build of program with debug symbols only for Psi+ and its plugins, but not for libraries which it uses. So the backtrace is not enough imformative. You may look on it here.
It is still very useful. I'll check if something is strange there.
Thanks!
Ok. I have reviewed the log. The important parts are:
No symbol table info available.
#6 0x000000001eee130f in OtrInternal::create_privkey (
this=this@entry=0x1ba6ee60,
accountname=0x1c000708 "{4d70ff68-ac36-4c21-9872-448d352d1e19}",
protocol=protocol@entry=0x1eef68a0 "prpl-jabber")
at /home/boradmin/Tmp/Psi+/psi-plus-snapshots/src/plugins/generic/otrplugin/
src/otrinternal.cpp:854
qMB = <incomplete type>
newkeyp = 0x1c0294e0
loop = <incomplete type>
watcher = {<QFutureWatcherBase> = {<No data fields>}, m_future = {
d = {<QFutureInterfaceBase> = {<No data fields>}, <No data fields>}}
}
future = {
d = {<QFutureInterfaceBase> = {<No data fields>}, <No data fields>}}
fingerprint = "\020}T\033\000\000\000\000Т\020wf\000\000\000\000\bnX\033
\000\000\000\000'ьvf\000\000\000\000\000\001\000\000\000\000\000\000>cШ\025"
#7 0x000000001eee24ff in OtrInternal::generateKey (this=0x1ba6ee60,
account=...)
at /home/boradmin/Tmp/Psi+/psi-plus-snapshots/src/plugins/generic/otrplugin/
src/otrinternal.cpp:773
It fails at loop.exec();
. Unsure why; but this makes me think it is not a problem from libgcrypt or libotr.
Sorry for delay with reply: I was busy with other tasks in the project.
It fails at loop.exec();. Unsure why; but this makes me think it is not a problem from libgcrypt or libotr.
Sorry, but your assumption is wrong. Probably you do not know how QtConcurrent
works...
This is quite simple layer which allows to call a functions in a separate thread. Race conditions and other thread specific errors are excluded, because all functions in that thread are called in series according to the pool of "events" (functions and their arguments) prepared before start of that thread.
Debug note about loop.exec();
means exactly that a separate thread was created and QtConcurrent
began to call functions from his event loop.
Segfault had happened in that new thread. The most important debug notes are:
[Switching to Thread 2944.0x554]
_gcry_mpih_lshift () at mpih-lshift-asm.S:46
46 mpih-lshift-asm.S: No such file or directory.
(gdb) thr a all bt full
Thread 27 (Thread 2944.0x554):
#0 _gcry_mpih_lshift () at mpih-lshift-asm.S:46
No locals.
#1 0x000000001bf6d7e0 in ?? ()
No symbol table info available.
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
I have prepared a minimal reproducible example for you as promised: https://github.com/tehnick/example-of-libotr-crash
BTW it is written in plain C
...
Here are statically linked 32- and 64-bit executables for MS Windows: https://tehnick.net/tmp/test-libotr-1.0_x64.exe https://tehnick.net/tmp/test-libotr-1.0.exe Unfortunately, they are still without debug symbols...
gdb log for test-libotr-1.0.exe
: https://susepaste.org/view/raw/47951425
gdb log for test-libotr-1.0_x64.exe
: https://susepaste.org/view/raw/85746535
All executables are built using MXE project.
Hey!
Sorry, but your assumption is wrong. Probably you do not know how
QtConcurrent
works...
Sure. I'm not a developer of this project or C++.
As the debug says:
[Switching to Thread 2944.0x554]
_gcry_mpih_lshift () at mpih-lshift-asm.S:46
46 mpih-lshift-asm.S: No such file or directory.
You are missing the mpih-lshift-asm.S file. Can you check if:
config.status: linking mpi/amd64/mpih-lshift.S to mpi/mpih-lshift-asm.S
. As libotr does not come with libgcrypt, you should be installing it to build it. Maybe you just need to upgrade the version for libgcrypt. Or if not, submit an issue to the MINGW-packages: https://github.com/msys2/MINGW-packages/issues or libgcrypt: https://dev.gnupg.org/
That is why I said:
It seems like this is mostly a libgrcrypt problem. If you try to install that library, does it succeed?
Have you try on only builiding libcrypt (or libotr) and checking if the tests pass? Have you try building them from source?
We not sure if this a libotr or libgrcrypt problem, because libotr may use libgrcrypt in a wrong way.
If libotr can't find a file, then this looks like a libgcrypt linking problem.
And, if you manage to install libgrcrypt and build it, can you paste the build output?
As I noted above:
All executables are built using MXE project.
This project allows to easily prepare SDK with subset of libraries you need for building of applications.
Current versions of libraries:
libgcrypt | 1.8.4
libgpg_error | 1.35
libotr | 4.1.1
All libraries are built from sources taken from upstreams. Here are build rules for them:
You are missing the mpih-lshift-asm.S file.
Hmm, are you sure it is required for release (not debug) version of library?
It exists in the libgrcrypt library version you are using
I do not know how to check this in a binary file. (I completely forgot everything about ASM and low-level programming...)
Is been linked correctly: something like
config.status: linking mpi/amd64/mpih-lshift.S to mpi/mpih-lshift-asm.S
.
Ok, I'll look at build logs.
And, if you manage to install libgrcrypt and build it, can you paste the build output?
Yes, sure:
Also I see the additional patch for libgcrypt
in MXE project: libgcrypt-1-fixes.patch.
Also I see the additional patch for
libgcrypt
in MXE project: libgcrypt-1-fixes.patch.
Wow! I have found the root of problem: if I drop ATTR_ABI related part of patch then segfault is not happen.
@claucece Thanks a lot for your help!
Hola Sofía.
I'm just curious. Is it you
https://mobile.twitter.com/i/web/status/985890984939671552 ?
I'm going to implement secure dtls based based file transfer for Psi. Do you want to join?
Wow! I have found the root of problem: if I drop ATTR_ABI related part of patch then segfault is not happen.
Nice @tehnick ! Everything working now?
Hola Sofía. I'm just curious. Is it you https://mobile.twitter.com/i/web/status/985890984939671552 ?
Yeah, that is me.
I'm going to implement secure dtls based based file transfer for Psi. Do you want to join ?
Interesting. I don't have much time; but if can collaborate on something, let me know. I'll be definetly interested in joining.
Interesting. I don't have much time; but if can collaborate on something, let me know. I'll be definetly interested in joining.
Yeah. Time is always a problem when we are talking about free projects :) Btw, applying otv4 would nice too. If you could investigate a possibility of migration to the new protocol, that would be great. Probably we could propose required changed to XSF then. Btw do you speak Spanish tambien?
Anyway join our xmpp conference at psi-dev@conference.jabber.ru
Nice @tehnick ! Everything working now?
Yes. I'll close this issue once my PR in MXE will be merged.
Thanks a lot to @claucece for this help about this "big" bug in libgcrypt from @mxe and not only in @psi-im / @psi-plus.
And @tehnick for all tests requested since a very long time (more one year), and for changes in the @mxe libgcrypt-1-fixes.patch.
@ri0n I have already requested for OTRv4 in Psi (in more than OTRv3 of course) ah ah For have not only the Pidgin client: https://github.com/otrv4/.
Yeah. Time is always a problem when we are talking about free projects :) Btw, applying otv4 would nice too. If you could investigate a possibility of migration to the new protocol, that would be great. Probably we could propose required changed to XSF then.
Yes! That would we awesome! Once we are done with the design of OTRv4, we can start implementing :) I'll check what is needed.
Btw do you speak Spanish tambien?
Pero, claro! :)
Anyway join our xmpp conference at psi-dev@conference.jabber.ru
I will :)
Thanks a lot to @claucece for this help about this "big" bug in libgcrypt from @mxe and not only in @psi-im / @psi-plus.
Np :)
@Ri0n I have already requested for OTRv4 in Psi (in more than OTRv3 of course) ah ah For have not only the Pidgin client: https://github.com/otrv4/.
Yes! :)
Since a long time, there is a problem with OTR on Windows x64.
It is impossible to:
No problem on Windows x86.