Chen-tao / webm

Automatically exported from code.google.com/p/webm
0 stars 0 forks source link

Compiling vp8 with llvm 5.0 (ios7) with optimizations produce garbled video #603

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
Hi,
I'm compiling libvpx for iOS7 latest beta (6). Since GCC is removed from the 
iOS SDK, I converted the configure.sh and configure scripts to use clang, and 
Xcode5 toolchain (see diff file).

When I compile normally, the decoder produce garbled output (see screenshot). 
If I remove optimizations (--disable-optimizations) and use -O0 or -O1, it 
works fine. Only -O2 and up creates this behavior.
The encoder works fine in both cases.

I'm pretty sure it's a bug with Clang and not your implementation, but since 
you know the code better, I think it will be more productive if you handle this 
matter with LLVM.

Original issue reported on code.google.com by sagiil...@gmail.com on 22 Aug 2013 at 1:46

Attachments:

GoogleCodeExporter commented 8 years ago
Is this a joke?

        880 +            if ((intptr_t)&tmp_row_min == 0x42)
        881 +              *(char*)0x1 = 1;

I can't see how this would work !

>O >O

Original comment by gbi.linp...@gmail.com on 1 Apr 2014 at 9:08

GoogleCodeExporter commented 8 years ago
Re: #51: it works because it forces the variable to be aliased and either works 
around a yet-unidentified clang bug or hides a yet-unidentified bug in libvpx.  
This is a work-around, not a fix.

Original comment by fischman@chromium.org on 1 Apr 2014 at 4:54

GoogleCodeExporter commented 8 years ago
I see this is still open.  Are you waiting on us or do you have some other work 
around?

Original comment by jimbankoski@google.com on 17 Apr 2014 at 4:26

GoogleCodeExporter commented 8 years ago
Jim: definitely waiting on you!

Original comment by fischman@chromium.org on 17 Apr 2014 at 4:36

GoogleCodeExporter commented 8 years ago

Original comment by jimbankoski@google.com on 17 Apr 2014 at 6:58

GoogleCodeExporter commented 8 years ago
Is there a fix available now? I wish to enable -03 for my libvpx build.

Original comment by shri...@bluejeansnet.com on 22 Apr 2014 at 5:26

GoogleCodeExporter commented 8 years ago
I am working on it now. I built and installed 
out_ios/Debug-iphoneos/AppRTCDemo.app on iphone. It worked the first time, but 
failed after that with error message:"Application lost focus, connection 
broken." Do you have any idea on how to make it work all the time?

Also, what debugging tool do you use? Thanks.

Original comment by yunqingw...@google.com on 23 Apr 2014 at 6:18

GoogleCodeExporter commented 8 years ago
Note that you'll have to build & use Release-iphoneos (instead of 
Debug-iphoneos) to witness the bug mentioned here.
The "application lost focus" message should be dismissable and then the app 
work again when you enter a new room's ID.  B/c of the way apprtc works you're 
best off using a new room every time (I just choose a prefix du jour and 
increment by one each time).

Original comment by fischman@chromium.org on 23 Apr 2014 at 9:02

GoogleCodeExporter commented 8 years ago
The only debugging tool I know how to use is lldb, launched using ios-deploy 
from https://github.com/phonegap/ios-deploy

Original comment by fischman@chromium.org on 23 Apr 2014 at 9:03

GoogleCodeExporter commented 8 years ago
Ami,

Thank you. It seems a new room's ID won't make the app work again.

I tried, and the app on iOS simulator(both debug & release mode) worked, which 
uses x86 code. I suspect the problem is in ARM code. A simple way to narrow the 
problem is to disable neon functions(or maybe also armv6 ones if we could not 
find the cause), and enable them one by one to see the result. The code is in 
/vp8/common/rtcd_defs.pl.

Original comment by yunqingw...@google.com on 24 Apr 2014 at 4:27

GoogleCodeExporter commented 8 years ago
Ami,

Now the fix was already checked in:
http://git.chromium.org/gitweb/?p=webm/libvpx.git;a=commit;h=33df6d1fc1d268b4901
b74b4141f83594266f041

Thanks for verifying that.

Original comment by yunqingw...@google.com on 29 Apr 2014 at 10:04

GoogleCodeExporter commented 8 years ago

Original comment by yunqingw...@google.com on 2 May 2014 at 1:05

GoogleCodeExporter commented 8 years ago
Hello,

Coming back again for a few comments on this resolution.
I was able to have the crash AND corruption fixed on all devices except for the 
iPhone 4. All other devices seem to work flawlessly with commit 
f1761a74cd7eac4deed88d672e23e8079a9c1f54  and with `-03 -fno-strict-aliasing`

Here's the backtrace I get for the iPhone4:

* thread #16: tid = 0x1a81, 0x0046709e linphone`vp8_decode + 530, stop reason = 
EXC_BAD_ACCESS (code=1, address=0x2228)
  * frame #0: 0x0046709e linphone`vp8_decode + 530
    frame #1: 0x004425fa linphone`vpx_codec_decode + 50
    frame #2: 0x003c6ef6 linphone`dec_process(f=<unavailable>) + 318 at vp8.c:719
    frame #3: 0x003a7086 linphone`ms_filter_process(f=0x05e0c790) + 30 at msfilter.c:320
    frame #4: 0x003a7fda linphone`run_graph [inlined] call_process(f=0x05e0c790) + 44 at msticker.c:228
    frame #5: 0x003a7fae linphone`run_graph(f=0x05e0c790, s=0x05ee0ed0, unschedulable=0x046a5f24, force_schedule='\0') + 130 at msticker.c:242
    frame #6: 0x003a7fa2 linphone`run_graph(f=0x05e0c930, s=0x05ee0ed0, unschedulable=0x046a5f24, force_schedule='\0') + 118 at msticker.c:247
    frame #7: 0x003a7ec4 linphone`run_graphs(s=0x05ee0ed0, execution_list=<unavailable>, force_schedule='\0') + 36 at msticker.c:261
    frame #8: 0x003a7d70 linphone`ms_ticker_run(arg=0x05ee0ed0) + 456 at msticker.c:440
    frame #9: 0x3a0b7918 libsystem_pthread.dylib`_pthread_body + 140
    frame #10: 0x3a0b788a libsystem_pthread.dylib`_pthread_start + 102

Any idea?

Original comment by gbi.linp...@gmail.com on 12 May 2014 at 8:59

GoogleCodeExporter commented 8 years ago
After investigation, it seems that the crash occurs when we feed the decoder a 
P-Frame after missing the first I-Frame.
This is actually reproductible on all iOS devices, provided that you don't feed 
the decoder an I-Frame at first iteration.

Note that I tested it on the master also.

Should I open a new bug for this?

Original comment by gbi.linp...@gmail.com on 13 May 2014 at 2:18

GoogleCodeExporter commented 8 years ago
Yes. Please file a new bug. Also, please provide detailed information on how to 
reproduce the bug. Thanks.

Original comment by yunqingw...@google.com on 13 May 2014 at 3:44

GoogleCodeExporter commented 8 years ago
I'm still seeing this with libvpx v1.3.0 with -Os and -fno-strict-aliasing 
(same with -O3 and -fno-strict-aliasing)

Removing the static keyword on decode_macroblock appears to fix the problem, as 
noted in #2. Haven't seen any crashes so far.

I've attached the disassembly with (bad.s) and without (good.s) the static on 
decode_macroblock. As alluded to many times in this thread, the problem appears 
to be with vpx_memset.

In bad.s, we have:
000010a0    f7feefae    blx _vp8_dequantize_b_neon
000010a4    f8db0b04    ldr.w   r0, [r11, #2820]
000010a8        9926    ldr r1, [sp, #0x98]
000010aa    f7feefaa    blx _vp8_short_inv_walsh4x4_neon
000010ae    f8db0b00    ldr.w   r0, [r11, #2816]
000010b2    f900aa0f    vst1.8  {d10, d11}, [r0]
000010b6        3010    adds    r0, #0x10
000010b8    f900aa0f    vst1.8  {d10, d11}, [r0]

In good.s, we have:
0000030e    f7ffee78    blx _vp8_dequantize_b_neon
00000312    f8d60b04    ldr.w   r0, [r6, #2820]
00000316    f50671c0    add.w   r1, r6, #0x180
0000031a    f7ffee72    blx _vp8_short_inv_walsh4x4_neon
0000031e    efc00050    vmov.i32    q8, #0x0
00000322    f8d60b00    ldr.w   r0, [r6, #2816]
00000326    f9400a0f    vst1.8  {d16, d17}, [r0]
0000032a        3010    adds    r0, #0x10
0000032c    f9400a0f    vst1.8  {d16, d17}, [r0]

I think it actually works completely by luck. In good.s, it's using d16 and d17 
to store the 0's (I'm guessing because the extra function call introduced 
constrains the available registers), while in bad.s, it's using d10 and d11 to 
store the 0's. In v1.3.0, even after the patch from #61, there still appears to 
be many places where the registers that must be preserved (q4-q7) are used in 
NEON functions without being saved off, e.g. in dequantizeb_neon.asm and 
dequant_idct_neon.asm.

I'm seeing 2 options: find all the places where q4-q7 are used without being 
saved and change them to either save and restore those registers or use 
different registers, or upgrade to a later version of libvpx.

I haven't verified the latest master in libvpx, but it appears that at least 
the 2 problematic NEON assembly files mentioned above have been replaced with 
intrinsics (to support ARMv8 according to the git commit messages).

Original comment by daiwe...@suitabletech.com on 17 Sep 2014 at 5:21

Attachments:

GoogleCodeExporter commented 8 years ago
Could you file a new bug since this one was already closed? Thanks.

Original comment by yunqingw...@google.com on 17 Sep 2014 at 4:00

GoogleCodeExporter commented 8 years ago
I've been able to work around it by applying all the patches from James Yu 
(swapping out NEON asm implementations with intrinsics implementations) in the 
libvpx git repo (git log --author="James Yu"). I haven't tried the latest 
version of libvpx, but I would expect the bug listed here to no longer be 
present based on what I've seen.

Do you know when the next official version of libvpx will be released? Seems 
like a patch or minor version (as described here: 
http://www.webmproject.org/code/releases/) would be nice to help out users on 
iOS.

Original comment by daiwe...@suitabletech.com on 17 Sep 2014 at 8:38

GoogleCodeExporter commented 8 years ago
+Restrict-AddIssueComment-Commit

This issue is closed and marked fix. Please create new issue.

Original comment by tomfine...@google.com on 17 Sep 2014 at 8:59