Closed tpgxyz closed 8 years ago
Does it still segfault if you configure it without SIMD?
What version of nasm are you using? nasm 2.11.09 is known to generate crashing code.
We are running nasm-2.12. Based on some comments on our ML, culprit can be -flto which we are using by default.
I'm also having this problem with only certain input files; I've attached an example.
I recently updated OS X from Yosemite to El Capitan and rebuilt mozjpeg. I'm not sure if that's related to the cause but the timing feels fishy.
Tried these two versions of NASM:
$ nasm -version
NASM version 2.11.04 compiled on Apr 1 2016
$ nasm -version
NASM version 2.12.01 compiled on Apr 21 2016
Here's as far as I got debugging:
$ lldb cmozjpeg
(lldb) target create "cmozjpeg"
Current executable set to 'cmozjpeg' (x86_64).
(lldb) r -outfile logo.moz.jpg logo.weird.jpg
Process 77095 launched: '/Users/nidoyle/Prefix/bin/cmozjpeg' (x86_64)
Process 77095 stopped
* thread #1: tid = 0x87bd29, 0x000000010004e086 libjpeg.62.dylib`jsimd_ycc_rgb_convert_sse2.rowloop + 6, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0xef90080af98)
frame #0: 0x000000010004e086 libjpeg.62.dylib`jsimd_ycc_rgb_convert_sse2.rowloop + 6
libjpeg.62.dylib`jsimd_ycc_rgb_convert_sse2.rowloop:
-> 0x10004e086 <+6>: movq (%rsi), %rsi
0x10004e089 <+9>: movq (%rbx), %rbx
0x10004e08c <+12>: movq (%rdx), %rdx
0x10004e08f <+15>: movq (%rdi), %rdi
(lldb) bt
* thread #1: tid = 0x87bd29, 0x000000010004e086 libjpeg.62.dylib`jsimd_ycc_rgb_convert_sse2.rowloop + 6, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0xef90080af98)
* frame #0: 0x000000010004e086 libjpeg.62.dylib`jsimd_ycc_rgb_convert_sse2.rowloop + 6
frame #1: 0x0000001d00096027
frame #2: 0x000000010003bc2c libjpeg.62.dylib`sep_upsample + 268
frame #3: 0x0000000100033a3f libjpeg.62.dylib`process_data_context_main + 543
frame #4: 0x000000010002b623 libjpeg.62.dylib`jpeg_read_scanlines + 115
frame #5: 0x00000001000011ca cmozjpeg`main + 954
frame #6: 0x00007fff94b0a5ad libdyld.dylib`start + 1
./configure --without-simd
. If that avoids crashes, then problem may be in integration between nasm and the compiler.The problem doesn't happen on a freshly built libjpeg-turbo.
$ djpeg -version
libjpeg-turbo version 1.4.90 (build 20160422)
$ clang --version
Apple LLVM version 7.3.0 (clang-703.0.29)
Target: x86_64-apple-darwin15.4.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
$ nasm -version
NASM version 2.12.01 compiled on Apr 21 2016
The problem only happens with mozjpeg.
Seems to be an issue with JPEG decoding. Using --without-simd
does make the segfault go away but that is a workaround as opposed to a fix.
Likely this is the fix: https://github.com/libjpeg-turbo/libjpeg-turbo/commit/498d9bc92fcf39124b6f08e57326944dedd2ddd6
It because mozjpeg does not merge from libjpeg-turbo much, and lacks all the bug fixes it gets.
Great. I'll prepare merge from upstream.
That patch applied relatively cleanly for me and solved the problem. Looking forward to seeing it in mozjpeg in the future.
I can confirm that the change from libjpeg-turbo/libjpeg-turbo#20 is applying fine to mozjpeg 3.1 and fixes a segfault in jsimd_ycc_rgb_convert_sse2.rowloop
I was seeing in vips 8.3.0 on Mac OS X 10.11.4:
Thread 5 Crashed:: worker
0 libjpeg.8.dylib 0x000000011097aca6 jsimd_ycc_rgb_convert_sse2.rowloop + 6
1 ??? 0x00007fc8040635a8 0 + 140497037702568
2 libjpeg.8.dylib 0x000000011096b048 sep_upsample + 261
3 libjpeg.8.dylib 0x0000000110963967 process_data_context_main + 533
4 libjpeg.8.dylib 0x000000011095d570 jpeg_read_scanlines + 115
5 libvips.42.dylib 0x0000000110523042 read_jpeg_generate + 244
I'm going to prepare a patch to the mozjpeg homebrew formula until a new release gets out.
@pornel Any chance of getting a new version tagged once the patch is merged to master? Thanks!
I've merged libjpeg-turbo 1.4.x branch into mozjpeg.
https://github.com/pornel/mozjpeg/tree/libjpeg-turbo
The relevant diff:
Have you tested the merged branch?
I've build mozjpeg with that patch mentioned by @dwbuiten Looks like it work on production system.
I'd like somebody to test the branch at:
https://github.com/pornel/mozjpeg/tree/libjpeg-turbo
which is not just that one patch, but brings mozjpeg up to date with libjpeg-turbo, so it's a major change.
I can check it out, probably not today though. I'll report back here when I'm done.
@pornel Will do that on Monday after my holidays :)
@pornel I compared your branch and mozjpeg-3.1 with patch from libjpeg-turbo/libjpeg-turbo@498d9bc with the following test performed with the ruby vips8 gem against libvips master linked to either version:
All the generated files where exactly the same. I also measured the total execution time and it was roughly the same.
Btw. if you're wondering why I'm storing RGBA images as CMYK jpegs (without colorspace conversion), it's because no other current image format I know of comes close to jpegs performance and it supports lower resolution decoding. I wonder why no one ever added a RGBA colorspace to the JPEG standard to do just that. Blender was using that technique to store textures with alpha channel 14 years ago, so it's not exactly a novel idea. Instead lot's of different image formats where invented that compressed slightly better than jpeg at the expense of much higher encoding and memory costs…
Thanks @felixbuenemann!
BTW: JPEG now officially supports alpha via JPEG-XT extensions, which are sneaked in in APP markers for backwards compatibility.
Looks like this has been resolved with the recent libjpeg-turbo 1.4.2 merge.
Hi,
we have noticed that mozjpeg compiled with clang-3.8.0 causes many segfaults i.e. firefox, dolphin. https://issues.openmandriva.org/show_bug.cgi?id=1560 https://issues.openmandriva.org/show_bug.cgi?id=1559
Sources for uour mozjpeg package can be found here https://github.com/OpenMandrivaAssociation/mozjpeg