ImageOptim / mozjpeg-rust

Safe Rust wrapper for the MozJPEG library
https://lib.rs/mozjpeg

Encoding fails on macOS arm64 (bus error) #21

Closed bvibber closed 3 years ago

bvibber commented 3 years ago

Hi, I'm working on an HDR-to-SDR conversion utility and using mozjpeg to write the output JPEGs. It works well on most systems I've tested, including macOS x86_64 and Linux arm64, but fails with a "bus error" at runtime on macOS arm64 (MacBook Air with M1 processor, running macOS 11.3.1).

Here's my calling code, which seems fairly straightforward:

// mozjpeg is much faster than image crate's encoder
std::panic::catch_unwind(|| {
    use mozjpeg::{Compress, ColorSpace};
    let mut c = Compress::new(ColorSpace::JCS_EXT_RGB);
    c.set_size(data.width, data.height);
    c.set_quality(95.0);
    c.set_mem_dest(); // can't write direct to file?
    c.start_compress();
    if !c.write_scanlines(data.bytes()) {
        panic!("error writing scanlines");
    }
    c.finish_compress();
    let mut writer = File::create(filename).expect("error creating output file");
    let data = c.data_as_mut_slice().expect("error accessing JPEG output buffer");
    writer.write_all(data).expect("error writing output file");
}).map_err(|_| JpegWriteFailure)

For now I'm working around it by using the image crate's much slower encoder if building on macOS or iOS arm64, but this isn't ideal. :)
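
A minimal sketch of that kind of target-based fallback, not the actual hdrfix code: the names `JpegBackend`, `backend_for`, and `pick_backend` are hypothetical, and the check uses the standard library's `std::env::consts::ARCH` and `std::env::consts::OS` constants, which describe the target the binary was compiled for.

```rust
use std::env::consts::{ARCH, OS};

/// Hypothetical enum naming the two encoders mentioned in this thread.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum JpegBackend {
    /// Fast path: the mozjpeg crate.
    MozJpeg,
    /// Slow path: the `image` crate's built-in JPEG encoder.
    ImageCrate,
}

/// Choose a backend for a given architecture / OS pair.
/// Only Apple arm64 (macOS or iOS on aarch64) gets the slow path;
/// macOS x86_64 and Linux arm64 were reported as working.
fn backend_for(arch: &str, os: &str) -> JpegBackend {
    let apple_os = os == "macos" || os == "ios";
    if arch == "aarch64" && apple_os {
        JpegBackend::ImageCrate
    } else {
        JpegBackend::MozJpeg
    }
}

/// Choose a backend for the target this binary was built for.
fn pick_backend() -> JpegBackend {
    backend_for(ARCH, OS)
}
```

Calling `pick_backend()` at the top of the JPEG-writing function and matching on the result keeps the platform check in one place, so it is easy to delete once the crash is fixed upstream.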

kornelski commented 3 years ago

Can you run it under lldb and get more info about it? What's the backtrace? Does it crash because of an unsupported instruction in the SIMD assembly?

bvibber commented 3 years ago

Ok, I got a backtrace:

% lldb target/release/hdrfix
(lldb) target create "target/release/hdrfix"
Current executable set to '/Users/brion/src/png/hdrfix/target/release/hdrfix' (arm64).
(lldb) run '--auto-exposure=99.9%' '--hdr-max=99.9%' samples/sunrise-hdr.jxr samples/sunrise-sdr.jpg
Process 36767 launched: '/Users/brion/src/png/hdrfix/target/release/hdrfix' (arm64)
samples/sunrise-hdr.jxr -> samples/sunrise-sdr.jpg
read_input in 282.043 ms
input histogram in 67.052 ms
hdr_to_sdr in 112.04899999999999 ms
output mapping in 138.83 ms
Process 36767 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=2, address=0x10016b9e0)
    frame #0: 0x000000010016b9e0 hdrfix`jsimd_extrgb_ycc_convert_neon
hdrfix`jsimd_extrgb_ycc_convert_neon:
->  0x10016b9e0 <+0>:  adrp   x13, -2
    0x10016b9e4 <+4>:  add    x13, x13, #0xad0          ; =0xad0 
    0x10016b9e8 <+8>:  ld1.8h { v0, v1 }, [x13]
    0x10016b9ec <+12>: ldr    x5, [x2]
Target 0: (hdrfix) stopped.
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=2, address=0x10016b9e0)
  * frame #0: 0x000000010016b9e0 hdrfix`jsimd_extrgb_ycc_convert_neon
    frame #1: 0x0000000100047b9c hdrfix`pre_process_data + 172
    frame #2: 0x000000010003ece0 hdrfix`process_data_simple_main + 152
    frame #3: 0x000000010002c01c hdrfix`jpeg_write_scanlines + 168
    frame #4: 0x000000010002b5a8 hdrfix`mozjpeg::compress::Compress::write_scanlines::h1290f651faff200f + 200
    frame #5: 0x000000010001512c hdrfix`hdrfix::write_jpeg::_$u7b$$u7b$closure$u7d$$u7d$::h65f4ae8a0aa7b268 + 116
    frame #6: 0x00000001000078dc hdrfix`hdrfix::hdrfix::h4a9b9ed18ba6bae1 + 6348
    frame #7: 0x0000000100008b2c hdrfix`hdrfix::main::ha02037256bca89c3 + 2252
    frame #8: 0x000000010001105c hdrfix`std::sys_common::backtrace::__rust_begin_short_backtrace::h0913107a508c6b04 + 12
    frame #9: 0x0000000100011078 hdrfix`std::rt::lang_start::_$u7b$$u7b$closure$u7d$$u7d$::h83c60022b43a8ea0 + 12
    frame #10: 0x00000001000f51b0 hdrfix`std::rt::lang_start_internal::h0b39a9399182a96d [inlined] core::ops::function::impls::_$LT$impl$u20$core..ops..function..FnOnce$LT$A$GT$$u20$for$u20$$RF$F$GT$::call_once::h34504deaac765d35 at function.rs:259:13 [opt]
    frame #11: 0x00000001000f51a4 hdrfix`std::rt::lang_start_internal::h0b39a9399182a96d [inlined] std::panicking::try::do_call::hb1a7f41a59b76986 at panicking.rs:379 [opt]
    frame #12: 0x00000001000f51a4 hdrfix`std::rt::lang_start_internal::h0b39a9399182a96d [inlined] std::panicking::try::h356bd4e8503e0810 at panicking.rs:343 [opt]
    frame #13: 0x00000001000f51a4 hdrfix`std::rt::lang_start_internal::h0b39a9399182a96d [inlined] std::panic::catch_unwind::h4fbf2cb7aea9bf26 at panic.rs:431 [opt]
    frame #14: 0x00000001000f51a4 hdrfix`std::rt::lang_start_internal::h0b39a9399182a96d at rt.rs:51 [opt]
    frame #15: 0x0000000100009464 hdrfix`main + 44
    frame #16: 0x0000000189018420 libdyld.dylib`start + 4

(lldb) register read
General Purpose Registers:
        x0 = 0x0000000000000d70
        x1 = 0x000000016fdfe200
        x2 = 0x0000000101011cd0
        x3 = 0x0000000000000000
        x4 = 0x0000000000000002
        x5 = 0x000000010016b9e0  hdrfix`jsimd_extrgb_ycc_convert_neon
        x6 = 0x0000000000000008
        x7 = 0x0000000000000000
        x8 = 0x00000000ffffffff
        x9 = 0x000000010004c6ec  hdrfix`jsimd_rgb_ycc_convert
       x10 = 0x000000016fdfe200
       x11 = 0x0000000000000021
       x12 = 0x00000001ffc8fea6  
       x13 = 0x000000000000002a
       x14 = 0x0000000000000881
       x15 = 0x000000000000000c
       x16 = 0x0000000189043560  libsystem_platform.dylib`_platform_memset_pattern16
       x17 = 0x0000000100229f24  dyld`ImageLoaderMachO::getLazyBindingInfo(unsigned int&, unsigned char const*, unsigned char const*, unsigned char*, unsigned long*, int*, char const**, bool*) + 148
       x18 = 0x0000000000000000
       x19 = 0x0000000101011cc0
       x20 = 0x0000000101012494
       x21 = 0x0000000000000002
       x22 = 0x000000016fdfe2f8
       x23 = 0x0000000000000008
       x24 = 0x000000016fdfe1cc
       x25 = 0x0000000101012480
       x26 = 0x000000016fdfe200
       x27 = 0x0000000000028500
       x28 = 0x000000010010dc94  hdrfix`core::fmt::float::_$LT$impl$u20$core..fmt..Display$u20$for$u20$f64$GT$::fmt::h63c909e74b75b640 at float.rs:165
        fp = 0x000000016fdfe160
        lr = 0x0000000100047b9c  hdrfix`pre_process_data + 172
        sp = 0x000000016fdfe0e0
        pc = 0x000000010016b9e0  hdrfix`jsimd_extrgb_ycc_convert_neon
      cpsr = 0xa0000000

The code is the current version of https://github.com/brion/hdrfix, with the config hack in the write_jpeg function (which switches to the image crate on Darwin/arm64) disabled, built in release mode with cargo.

kornelski commented 3 years ago

Yup, looks like an incompatibility in the ARM assembly. You can probably work around this by disabling the default with_simd feature: include mozjpeg with the default-features = false option.
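
As a sketch, the dependency entry in Cargo.toml could look like the fragment below. The version requirement is a placeholder rather than a recommendation, and turning off default features also turns off any other defaults the crate defines, not just with_simd.

```toml
[dependencies]
# "*" is a placeholder; pin this to the mozjpeg release you actually build against.
# default-features = false drops the default with_simd feature (and any other defaults).
mozjpeg = { version = "*", default-features = false }
```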

libjpeg-turbo is currently rewriting the old assembly files to use ARM intrinsics, so this is probably going to be fixed the next time I sync mozjpeg with libjpeg-turbo.

bvibber commented 3 years ago

Thanks, I'll try that as a workaround for now!

kornelski commented 3 years ago

I've disabled SIMD on Apple ARM in 0.12.4.
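
For illustration only, here is one way a build script could make that kind of decision. This is not the crate's actual build code; `simd_enabled` and `simd_enabled_from_env` are hypothetical names. The sketch relies on environment variables Cargo sets for build scripts: `CARGO_CFG_TARGET_ARCH`, `CARGO_CFG_TARGET_VENDOR`, and `CARGO_FEATURE_WITH_SIMD` (present when the with_simd feature is on).

```rust
use std::env;

/// Decide whether SIMD code should be built for a target.
/// `arch` and `vendor` use the values Cargo passes to build scripts,
/// e.g. "aarch64" and "apple" for an M1 Mac.
fn simd_enabled(feature_requested: bool, arch: &str, vendor: &str) -> bool {
    // Assumption for this sketch: only Apple aarch64 targets are excluded.
    let apple_arm = vendor == "apple" && arch == "aarch64";
    feature_requested && !apple_arm
}

/// Read the inputs from the environment Cargo provides to a build script.
/// Missing variables are treated as empty strings / feature off.
fn simd_enabled_from_env() -> bool {
    let requested = env::var_os("CARGO_FEATURE_WITH_SIMD").is_some();
    let arch = env::var("CARGO_CFG_TARGET_ARCH").unwrap_or_default();
    let vendor = env::var("CARGO_CFG_TARGET_VENDOR").unwrap_or_default();
    simd_enabled(requested, &arch, &vendor)
}
```

Reading the target from `CARGO_CFG_*` variables instead of `cfg!` matters in a build script, because `cfg!` there describes the host running the script, not the target being compiled for.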