Closed TinoDidriksen closed 6 years ago
Hi Tino Didriksen
I think it might be due to this change, where stod might be causing the segfault for wide strings: https://github.com/apertium/lttoolbox/commit/e2135c7fccadb19becff2aeb637920d1737c7fd5#diff-415c5fbb5f00468526da3b5270538172L60
On Sun 5 Aug, 2018, 1:52 PM Tino Didriksen, notifications@github.com wrote:
There is a massive segfault / memory leak somewhere in the weight code. After upgrading to it, translations started randomly overloading, with some part of lttoolbox eating the APy machine's whole 32 + 64 GB RAM in seconds and then dying. Haven't taken the time to isolate it yet - for now, I've rolled back the install.
[29857716.421446] lt-proc[30665]: segfault at 824100 ip 00007f728d000bbc sp 00007fffe8c3f6b0 error 4 in liblttoolbox3-3.4.so.1.0.0[7f728cfa8000+6c000]
(ping @Techievena https://github.com/Techievena)
No, stod() is well-defined for wstring. Also, the segfault happens at runtime during translation for old, already-built pairs. New weights are not part of the run.
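For reference, a minimal sketch of the wide-string overload in use. The helper name parse_weight is hypothetical and is not taken from the lttoolbox source; only std::stod itself is standard.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper (not from lttoolbox): parse a weight from wide text.
// std::stod is overloaded for std::wstring, so no narrowing is needed.
double parse_weight(const std::wstring& text) {
  std::size_t consumed = 0;
  // Throws std::invalid_argument if no number can be parsed at all.
  const double value = std::stod(text, &consumed);
  // Reject trailing characters so "0.5abc" is not silently accepted.
  if (consumed != text.size()) {
    throw std::invalid_argument("trailing characters after weight");
  }
  return value;
}
```

Bad input surfaces as an exception here rather than as a crash, which is consistent with the segfault coming from somewhere else.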
Given an unweighted pre-compiled bin, a segfault happens in https://github.com/apertium/lttoolbox/blob/master/lttoolbox/trans_exe.cc#L120 because https://github.com/apertium/lttoolbox/blob/master/lttoolbox/fst_processor.cc#L830 unconditionally tells it to read weights even if there are no weights in the input file.
Well @Techievena, looks like I do need your help, 'cause I don't know how lttoolbox is supposed to detect that the input file is an old unweighted bin. I assume you store a flag in the new files that won't be present in the old, but can't find that in the code.
No @TinoDidriksen, I am sorry, there is no such flag. I didn't know we had to accept pre-compiled binary files as input. I thought the dictionary files had to be compiled first every time, so the default value, i.e. 0.0000, would be written to the binary files even if the dictionary is unweighted.
Ok then, we need to add such a flag. We absolutely cannot require that every deployment recompiles their data files - lttoolbox must be able to load both old and new files.
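One way such a flag could work, as a rough sketch only: new files start with a short marker plus a feature byte, and files without the marker are treated as old unweighted bins. The names kMagic, kFeatureWeights, Header, read_header and write_header are all hypothetical and not taken from the lttoolbox source. The sketch also assumes the marker bytes can never appear at the start of an old file.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical 4-byte marker written at the start of new-format files.
static const char kMagic[4] = {'L', 'T', 'T', 'B'};
// Hypothetical feature bit meaning "this file stores weights".
static const std::uint8_t kFeatureWeights = 0x01;

struct Header {
  bool has_magic = false;    // false => old file with no header at all
  bool has_weights = false;  // true only if the feature bit is set
};

// Inspect the start of the stream. For an old file the stream is rewound,
// so the caller can parse it from the first byte exactly as before.
Header read_header(std::istream& in) {
  Header h;
  const std::istream::pos_type start = in.tellg();
  char buf[4] = {0, 0, 0, 0};
  in.read(buf, sizeof buf);
  if (in.gcount() == 4 && std::memcmp(buf, kMagic, sizeof kMagic) == 0) {
    h.has_magic = true;
    char features = 0;
    if (in.get(features)) {
      h.has_weights =
          (static_cast<std::uint8_t>(features) & kFeatureWeights) != 0;
    }
    return h;
  }
  in.clear();       // a short old file may have set eofbit/failbit
  in.seekg(start);  // rewind: nothing is consumed from an old file
  return h;
}

// Writer side: emit the marker and the feature byte before the payload.
void write_header(std::ostream& out, bool with_weights) {
  out.write(kMagic, sizeof kMagic);
  out.put(static_cast<char>(with_weights ? kFeatureWeights : 0));
}
```

The loader would then read weights only when has_weights is true, instead of unconditionally, so existing deployments would not need to recompile their data files.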