dinhminhquoi / webm

Automatically exported from code.google.com/p/webm
0 stars 0 forks source link

Compiling vp8 with llvm 5.0 (ios7) with optimizations produce garbled video #603

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
Hi,
I'm compiling libvpx for iOS7 latest beta (6). Since GCC is removed from the 
iOS SDK, I converted the configure.sh and configure scripts to use clang, and 
Xcode5 toolchain (see diff file).

When I compile normally, the decoder produce garbled output (see screenshot). 
If I remove optimizations (--disable-optimizations) and use -O0 or -O1, it 
works fine. Only -O2 and up creates this behavior.
The encoder works fine in both cases.

I'm pretty sure it's a bug with Clang and not your implementation, but since 
you know the code better, I think it will be more productive if you handle this 
matter with LLVM.

Original issue reported on code.google.com by sagiil...@gmail.com on 22 Aug 2013 at 1:46

Attachments:

GoogleCodeExporter commented 9 years ago
Jut a small note about the patches, they're reversed, I called diff with the 
wrong order. use patch -R to apply

Original comment by sagiil...@gmail.com on 22 Aug 2013 at 1:55

GoogleCodeExporter commented 9 years ago
Hi, I am actually running into the same problem on iOS using the iOS 7 SDK. 
It's often not so bad as the posted screenshot but it does show many black and 
white boxes on top of otherwise good video.

I have compiled side-by-side the same version of libvpx with LLVM (Xcode 5.0) 
and Gcc (Xcode 4.5), and the LLVM version has this problem while the GCC 
version does not.

Additionally, I can confirm that using LLVM compiled vp8 with only 
"vp8/decodframe.c.o" copied from the GCC build works correctly. I will report 
back if I learn more after debugging the decoder.

Original comment by patr...@imo.im on 24 Sep 2013 at 12:01

GoogleCodeExporter commented 9 years ago
Just an update. I found that removing the static keyword on decode_macroblock 
in vp8/decodframe.c is sufficient to "solve" the issue even when building with 
-O3. (Patch attached)

It's still a question why optimizations cause such a problem here.

Original comment by patr...@imo.im on 24 Sep 2013 at 9:47

Attachments:

GoogleCodeExporter commented 9 years ago
Just an update for those who may have tried the patch. Despite the fact that 
the black/white spots appeared to be gone with my patch on -O3, I noticed 
random crashes in vp8 decode_macroblock. They would happen on some devices 
after decoding 2-minute or longer video files (but it might just be random).

When built with -O1, neither the crash, nor the black/white spots appear to 
happen. So you can essentially ignore my last two posts if you wish to have a 
stable iOS app.

I will also try to see if there is a difference when using the webrtc gyp-based 
build system (which includes a version of libvpx and builds using clang by 
default, on iOS). I will report back if this tells me anything useful.

Original comment by patr...@imo.im on 30 Sep 2013 at 8:08

GoogleCodeExporter commented 9 years ago
Hey man,
I'm seeing this issue as well, and removing the static linkage for 
decode_macroblock didn'ty solve the issue.
How do you call the configure script to force -O1?

    ./configure [...] --disable-optimizations CFLAGS="-O1" 

?

Original comment by gbi.linp...@gmail.com on 30 Sep 2013 at 9:02

GoogleCodeExporter commented 9 years ago
#5 
this worked for me:
./configure --target=armv7-darwin-gcc --disable-optimizations 
--extra-cflags="-O1"

Original comment by jo...@iamyellow.net on 12 Oct 2013 at 6:34

GoogleCodeExporter commented 9 years ago
XCode doesn't include the opt tool, so I can't test on Apple LLVM, but I was 
able to reproduce on a build with a LLVM 3.3 tarball using a clang/opt 
pipeline: I had to write a script in front of the compiler (attached) and hack 
configure.sh to use this script instead of 'cc' for the darwin target
clang -emit-llvm -c -o - <clang opts> | opt - -o - <pass 1 args> | opt - -o - 
<pass 2 args> | clang -x ir -c - <clang opts>

(I got the opt pass arguments by using 'llvm-as < /dev/null | opt -O3 
-disable-output -debug-pass=Arguments')

I tracked down the bad optimization flag to the -gvn flag (I also removed the 
redundant -memdep pass). I'm not so familiar with this, but it could be caused 
by aliasing in libvpx; or it could be solved in a newer release of llvm. I will 
try some stuff including LLVM from svn, and -debug-aa and other alias-related 
flags, but for those who need an extra bit of performance without the crashes, 
this might help out.

Original comment by patr...@imo.im on 14 Oct 2013 at 10:12

Attachments:

GoogleCodeExporter commented 9 years ago

Original comment by ya...@google.com on 17 Oct 2013 at 10:32

GoogleCodeExporter commented 9 years ago
Ami suggested it's due to aliasing and I should investigate our use of (int *).

Original comment by johannkoenig@google.com on 17 Oct 2013 at 10:49

GoogleCodeExporter commented 9 years ago
Enabling optimization works for me as long as I specify --extra-cflags="-O3 
-fno-strict-aliasing"

Original comment by pierre.s...@gmail.com on 18 Oct 2013 at 8:39

GoogleCodeExporter commented 9 years ago
Issue webrtc:2533 has been merged into this issue.

Original comment by fischman@webrtc.org on 21 Oct 2013 at 5:58

GoogleCodeExporter commented 9 years ago
https://gerrit.chromium.org/gerrit/67657

Temporary workaround. Probably won't fix iOS builds yet since it looks like 
it's calling 'cc' instead of clang?

Also will be adding something similar to libvpx.gyp for chromium/webrtc

Original comment by johannkoenig@google.com on 28 Oct 2013 at 10:52

GoogleCodeExporter commented 9 years ago
Johann: chromium already (always) builds with -fno-strict-aliasing: 
https://code.google.com/p/chromium/codesearch#search/&q=f:gyp%20strict-aliasing&
sq=package:chromium&type=cs

Original comment by fischman@chromium.org on 28 Oct 2013 at 11:04

GoogleCodeExporter commented 9 years ago
Then how did the issue surface for WebRTC? I guess 
https://code.google.com/p/webrtc/source/browse/trunk/third_party/libvpx/libvpx.g
yp?spec=svn3377&r=3377#214

needs another entry.

Original comment by johannkoenig@google.com on 28 Oct 2013 at 11:12

GoogleCodeExporter commented 9 years ago
Looked a bit more; looks like chromium's build/common.gypi only specifies
-fno-strict-aliasing for:
- OS==mac; and
- os_posix==1 and OS!="mac" and OS!="ios"
I suspect this is a latent long-standing bug in build/chromium.gypi and
that making the bit that applies to mac also apply to ios will fix this.

Original comment by fischman@chromium.org on 28 Oct 2013 at 11:31

GoogleCodeExporter commented 9 years ago
sagiiiltus, I think we fixed the underlying issue: 
https://gerrit.chromium.org/gerrit/#/c/67650/

but I'm having some trouble building the vp8 encoder with Xcode 5. There seem 
to be some issues with asm_offsets. Do you have any trouble building the 
encoder?

I'm working from your patches in #1

Original comment by johannkoenig@google.com on 8 Nov 2013 at 5:58

GoogleCodeExporter commented 9 years ago
Johann, 
Thanks for the fix! I'll check it out. Thanks all the rest for investigating 
this matter!
I had no problems compiling with Xcode5 for iOS (see comment #6 for 
configuration example). I'm guessing your trying to build for OSX, and you 
don't have YASM. It fails when it fallbacks to NASM and tries to compile some 
optimizations, so either disable the optimizations or put yasm in your path 
(you'll probably will have to compile it as well, or use macports).

Original comment by sagiil...@gmail.com on 11 Nov 2013 at 1:10

GoogleCodeExporter commented 9 years ago
I'm also struggling with garbled video at times (testing Johann's last fix 
now). However, I  wanted to post my patch for a cleaner ios7/xcode5 compilation.

The patch uses only xcode-select and xcrun to discover tool paths. So there's 
enough support baked in that hypothetically compiling for the iPhoneSimulator 
should be straightforward if someone wants to go there, for e.g.

I'm getting linker errors with vp9, but since I'm not using it, it's just 
disabled for now.

With this patch, I'm configuring as follows:
# ../foo/configure --target=armv7-darwin-gcc --ios-platform=iPhoneOS 
--enable-vp8 --disable-vp9 --enable-examples --disable-optimizations 
--extra-cflags="-O1 -fno-strict-aliasing"

The optimizations stuff hopefully won't be needed with the above fix.

Original comment by armando....@gmail.com on 11 Nov 2013 at 7:55

Attachments:

GoogleCodeExporter commented 9 years ago
So none of you run into this issue:
[AS] vpx_scale/arm/neon/vp8_vpxyv12_copyframe_func_neon.asm.s.o
Apple Inc version cctools-846.2.4, GNU assembler version 1.38
vpx_scale/arm/neon/vp8_vpxyv12_copyframe_func_neon.asm.s:101:unsupported 
relocation on symbol yv12_buffer_config_y_buffer

It appears that it doesn't like the #include directive, and isn't managing the 
defines the way I want it to. They should be set in vpx_scale_asm_offsets:
.set yv12_buffer_config_y_buffer ,  52
and included in the assembly:
    #include  "vpx_scale_asm_offsets.asm"
and used:
    ldr             r3, [r1, #yv12_buffer_config_y_buffer]        @dstptr1

but neither of you are seeing this issue? I'm trying to find documentation on 
the #include and .set directives in clang

Original comment by johannkoenig@google.com on 12 Nov 2013 at 6:33

GoogleCodeExporter commented 9 years ago
Argh, I'm trying to upstream ads2gas_apple.pl from Chrome at the same time. For 
some reason it converts .include to #include. This might be part of the as -> 
clang + --no-integrated-as change though.

And I'm having some very strange issues with 
vp9/common/arm/neon/vp9_short_idct32x32_1_add_neon.asm

Anyway, thanks for the help. I think I'm headed in the right direction now.

Original comment by johannkoenig@google.com on 12 Nov 2013 at 6:48

GoogleCodeExporter commented 9 years ago
https://gerrit.chromium.org/gerrit/#/c/67802/
https://gerrit.chromium.org/gerrit/#/c/67805/

Now trying to figure out what I can pull from:
https://chromium.googlesource.com/chromium/deps/libvpx/+/master/source/libvpx/bu
ild/make/ads2gas_apple.pl

to keep these in sync. Also looking at armando's patch.

Original comment by johannkoenig@google.com on 12 Nov 2013 at 8:53

GoogleCodeExporter commented 9 years ago
https://gerrit.chromium.org/gerrit/67817
armando.dicianno, if you'd prefer I can set you as the author but it will 
include your email. I changed it a little bit. Are there situations where xcrun 
isn't in $PATH?

Also, it looks like iphonesimulator doesn't work for me. We have some 
arm-specific things gonig on there so it would probably require a bit more 
reworking.

Original comment by johannkoenig@google.com on 13 Nov 2013 at 10:27

GoogleCodeExporter commented 9 years ago
@johann,
Some of those xcrun calls will only be available with XCode 5 installed, not 
sure if that counts as a problem or not.
I saw that your patches landed on the webrtc trunk (I'm seeing libvpx from 
chromium @ 232686), but I was not able to see a change in the problematic 
behavior. Is there more to come here or are you seeing this bug as fixed?

Original comment by char...@tokbox.com on 13 Nov 2013 at 10:38

GoogleCodeExporter commented 9 years ago
@charley if you're trying to use libvpx separately from WebRTC I'd suggest 
tracking libvpx in http://git.chromium.org/webm/libvpx.git

What is checked in to WebRTC is only intended to work in WebRTC. The problem 
there is that they use a different version of clang which, for some reason, 
will not check the include paths (or even find files in the same directory) 
when using .include instead of #include:
http://llvm.org/bugs/show_bug.cgi?id=16022
to cope with that, WebRTC has a different version of 
build/make/ads2gas_apple.pl. I'm working on converging those.

On the other side, the clang in Xcode will not recognize #include, so thus far 
we haven't upstreamed those changes.

Long story short - I'm trying to make the code bases the same but for now 
they're not quite.

Original comment by johannkoenig@google.com on 13 Nov 2013 at 10:45

GoogleCodeExporter commented 9 years ago
And WRT Xcode 5 - I expect most people who build for iOS to run the latest 
version of XCode. Supporting multiple versions can be a huge pain so I've 
generally only gone for the latest.

Original comment by johannkoenig@google.com on 13 Nov 2013 at 10:47

GoogleCodeExporter commented 9 years ago
Ah, good to know, thanks. We are building from WebRTC and seeing this and some 
other crashes on iOS with clang. I'll experiment with building directly from 
the repo you mentioned, and otherwise hold fast for more updates.

Original comment by char...@tokbox.com on 13 Nov 2013 at 10:52

GoogleCodeExporter commented 9 years ago
To allow the Chromium changes and the Xcode conversion to coexist, I've added a 
-chromium option:
https://gerrit.chromium.org/gerrit/67819

This can be paired with a change to chromium libvpx.gyp:
Index: libvpx.gyp
===================================================================
--- libvpx.gyp  (revision 232686)
+++ libvpx.gyp  (working copy)
@@ -223,7 +223,7 @@
               'action': [
                 'bash',
                 '-c',
-                'cat <(RULE_INPUT_PATH) | perl 
<(shared_generated_dir)/<(ads2gas_script) > 
<(shared_generated_dir)/<(RULE_INPUT_ROOT).S',
+                'cat <(RULE_INPUT_PATH) | perl 
<(shared_generated_dir)/<(ads2gas_script) -chromium> 
<(shared_generated_dir)/<(RULE_INPUT_ROOT).S',
               ],
               'process_outputs_as_sources': 1,
               'message': 'Convert libvpx asm file for ARM <(RULE_INPUT_PATH).',
Index: source/libvpx/build/make/ads2gas_apple.pl

I'll submit that once these changes are merged upstream.

Original comment by johannkoenig@google.com on 13 Nov 2013 at 11:50

GoogleCodeExporter commented 9 years ago
Johann,
The effort is much appreciated. Did you manage to solve the compilation errors 
of the assembly code? I didn't experience it probably due to using a forked 
webRTC version, which is a few months old already.. We'll definitely will wait 
for the fixed gyp file. Many thanks again!

Original comment by sagiil...@gmail.com on 15 Nov 2013 at 11:11

GoogleCodeExporter commented 9 years ago
I think so. You mean these errors:
vpx_scale/arm/neon/vp8_vpxyv12_copyframe_func_neon.asm.s:101:unsupported 
relocation on symbol yv12_buffer_config_y_buffer
?

Also, I think the current architecture won't allow for building ios simulator 
code. Since we give the target arm-darwin-gcc we make all sorts of arm-biased 
decisions. if we wanted to build for the simulator, we'd need to change that to 
x86-darwin-gcc (and differentiate that from x86-darwin[9-14]-gcc). Maybe add a 
new ios target - arm-ios-gcc and x86-ios-gcc. I haven't seen any requests for 
this sort of configuration but I'd be happy to help anyone who is interested in 
working on it.

Original comment by johannkoenig@google.com on 15 Nov 2013 at 4:32

GoogleCodeExporter commented 9 years ago
This issue was closed by revision 5d0c33b8e51d.

Original comment by johannkoenig@google.com on 15 Nov 2013 at 5:23

GoogleCodeExporter commented 9 years ago
Fantastic - glad to see this fixed. To answer the earlier question, xcrun 
should definitely be available, if xcode is installed. You could indirect 
things a bit and use `xcode-select --print-path` + /usr/bin, but that's 
probably only useful if you have two XCode's installed (Developer Preview for 
e.g.) Thanks for the shout-out in the ChangeLog, too. :-)

Original comment by armando....@gmail.com on 15 Nov 2013 at 7:03

GoogleCodeExporter commented 9 years ago
For VP8 1.3 the problem seems to have arisen again.
With --extra-cflags="-O3 -fno-strict-aliasing" the output is still garbled in 
iOS, whereas in 1.2 it was OK.

Any idea on what causes this?

Original comment by gbi.linp...@gmail.com on 11 Feb 2014 at 3:26

GoogleCodeExporter commented 9 years ago
Unfortunately currently VP8 on 1.3 is not buildable in a working state on any 
version of Xcode. Building with Xcode 5 produces the black and white squares as 
noted above. With Xcode 4.6 although xcrun exists, the build is broken with 
"xcrun: error: unrecognized option: --show-sdk-path".

Original comment by tor...@vsee.com on 12 Feb 2014 at 6:37

GoogleCodeExporter commented 9 years ago
[deleted comment]
GoogleCodeExporter commented 9 years ago
The problem is still visible in 1.3.0 with no special build configuration. It 
is even still visible with -fno-strict-aliasing set. It is however not visible 
with -O1 and --disable-optimizations.

Original comment by robert.s...@gmail.com on 21 Feb 2014 at 8:15

GoogleCodeExporter commented 9 years ago
Issue webrtc:2803 has been merged into this issue.

Original comment by fischman@webrtc.org on 25 Feb 2014 at 9:18

GoogleCodeExporter commented 9 years ago
Issue webrtc:2803 has been merged into this issue.

Original comment by fischman@webrtc.org on 25 Feb 2014 at 6:26

GoogleCodeExporter commented 9 years ago
gbi.linphone, torrey@vsee.com, robert.swain,
If any of you have simple repro steps, would you please bisect? Unfortunately I 
don't have a way to test this. I can help direct you through the bisect process 
if you need.

Original comment by johannkoenig@google.com on 25 Feb 2014 at 10:16

GoogleCodeExporter commented 9 years ago
Issue webrtc:2440 has been merged into this issue.

Original comment by fischman@webrtc.org on 25 Feb 2014 at 10:57

GoogleCodeExporter commented 9 years ago
Corruption repros when building AppRTCDemo for Release-iphoneos per 
https://code.google.com/p/webrtc/source/browse/trunk/talk/app/webrtc/objc/README
 with https://webrtc-codereview.appspot.com/8479005/ patched in.
I tried bisecting the version of libvpx used by that webrtc sample app but too 
much seems to have changed - I get either the checkerboard corruption reported 
in this bug or uninformative crashes (probably due to mix/matching new webrtc 
code with old libvpx code).

Johann: is there a reference sample app or test suite for libvpx that's 
runnable on iOS?

Original comment by fischman@chromium.org on 27 Feb 2014 at 3:49

GoogleCodeExporter commented 9 years ago
A couple years ago I had inherited something for local testing but I can't find 
it at the moment.

It's unfortunate (in a sense) that this does not reproduce on desktop or x86 
builds because it would be a lot easier to debug.

Original comment by johannkoenig@google.com on 27 Feb 2014 at 4:15

GoogleCodeExporter commented 9 years ago
Progress!

- Bisected the repro from #40 by applying -O0 (instead of -Os) to clumps of 
files in libvpx.ninja and isolated the problem to vp8/decoder/decodeframe.c; 
everything else can be built with -Os and that one file with -O0 and no 
corruption results.  
- Removing "static" from the linkage of decodeframe.c:decode_macroblock() 
removes the corruption (found by binary searching the effect of removing 
"static" after first moving all static helpers to another compilation unit 
intending to bisect decodeframe.c's contents under different compiler options).
- Comparing the compiled decodeframe.o with decode_macroblock as static and as 
not shows that a big difference is that decode_macroblock is inlined with 
static (giving corrupted images) and not inlined without static (giving good 
images).
- Binary searching by moving the contents of decode_macroblock in/out-of-line 
narrows the problem down to 
https://code.google.com/p/chromium/codesearch#chromium/src/third_party/libvpx/so
urce/libvpx/vp8/decoder/decodeframe.c&l=247.  If I move:
  vp8_dequantize_b(b, xd->dequant_y2);
  vp8_short_inv_walsh4x4(&b->dqcoeff[0], xd->qcoeff);
  vpx_memset(b->qcoeff, 0, 16 * sizeof(b->qcoeff[0]));
into a helper function and force it to not be inlined (by making the helper not 
static, but leaving decode_macroblock static) then the corruption is removed.  
If I allow it to be inlined, then the corruption returns.

Things I tried that don't make a difference:
- s/vpx_memset/memset/ (just in case)
- append _c to vp8_dequantize_b and vp8_short_inv_walsh4x4 to make sure the 
problem isn't hand-crafted NEON ASM failing to maintain clean frames

Johann/Marco: I'm hoping this is enough for you to be able to spot the bug by 
inspection.  

Original comment by fischman@chromium.org on 10 Mar 2014 at 5:17

GoogleCodeExporter commented 9 years ago
Can you send me the good/bad object files? If you're building with 
no-strict-aliasing then this is mostly a shot in the dark, but can you change:
  vp8_short_inv_walsh4x4(&b->dqcoeff[0], xd->qcoeff);
to
  vp8_short_inv_walsh4x4(b->dqcoeff, xd->qcoeff);

Otherwise, I'll try and reproduce your build on my mac and see what kind of 
tweaks i can make to get the obj file like your good one. When I'm back I'll 
have a device.

Original comment by johannkoenig@google.com on 14 Mar 2014 at 3:48

GoogleCodeExporter commented 9 years ago
@johannkoenig: suggestion from #43 didn't make a difference (.o file is 
identical).

Attached are the object files (ORIG & OutOfLine for the original and the 
version with the 3 calls from #42 out-of-lined, resp.) and the out-of-lined 
version of decodeframe.c that yielded the -OutOfLine object file.

Original comment by fischman@chromium.org on 16 Mar 2014 at 5:30

Attachments:

GoogleCodeExporter commented 9 years ago
Proposed hack workaround until we figure out the right thing: 
https://gerrit.chromium.org/gerrit/#/c/69343/

Original comment by fischman@chromium.org on 25 Mar 2014 at 8:52

GoogleCodeExporter commented 9 years ago
I'm not sure this fixes the problem. 
I still see garbles with this fix and "-fno-strict-aliasing". 

My commandline is such for armv7; I verified that your bzero is in my 
decodframe.c:

/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/u
sr/bin/clang  -arch armv7 -isysroot 
/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer
/SDKs/iPhoneOS7.0.sdk -mtune=cortex-a8 -O3 -fno-strict-aliasing -Wall 
-Wdeclaration-after-statement -Wdisabled-optimization -Wpointer-arith 
-Wtype-limits -Wcast-qual -Wvla -Wimplicit-function-declaration -Wuninitialized 
-Wunused-variable -fno-strict-aliasing -Wno-unused-function -I. 
-I"/Users/buildbot/linphone-iphone/submodules/build/..//externals/libvpx" -M 
/Users/buildbot/linphone-iphone/submodules/build/..//externals/libvpx/vp8/decode
r/decodframe.c | sed -e 's;^\([a-zA-Z0-9_]*\)\.o;vp8/decoder/decodframe.c.o 
vp8/decoder/decodframe.c.d;' > vp8/decoder/decodframe.c.d

Note that I'm using libvpx1.3.0 (not the current master, for which your patch 
applies), so my file is decodframe.c instead of decodeframe.c

Original comment by gbi.linp...@gmail.com on 28 Mar 2014 at 12:01

GoogleCodeExporter commented 9 years ago
Can you verify the fix in master?

Original comment by fischman@chromium.org on 28 Mar 2014 at 3:00

GoogleCodeExporter commented 9 years ago
Sadly... no. Because of 
https://code.google.com/p/webm/issues/detail?id=737&colspec=ID%20Pri%20mstone%20
ReleaseBlock%20Type%20Component%20Status%20Owner%20Summary I can't compile 
libvpx without it crashing under the last XCode versions :(

I'll try your suggestion on XCode 5.0 when I get ahold of one.

Original comment by gbi.linp...@gmail.com on 31 Mar 2014 at 9:38

GoogleCodeExporter commented 9 years ago
Ok, compiling with XCode 5.0 and from master (commit 
1bc32afd3cfa7ade99d105700dfebf11f0e86729 ) still shows the bug with this fix 
and only "-fn-strict-aliasing". So the compilation is performed with '-O3'.

Original comment by gbi.linp...@gmail.com on 31 Mar 2014 at 12:15

GoogleCodeExporter commented 9 years ago
Please re-verify with the workaround 
https://code.google.com/p/webrtc/issues/detail?id=3038 comment #23 

Original comment by fischman@chromium.org on 31 Mar 2014 at 5:53