dinhminhquoi / webm

Automatically exported from code.google.com/p/webm

stack overflow in vp9_sub_pixel_variance16x16_sse2 #808

Closed. GoogleCodeExporter closed this issue 9 years ago.

GoogleCodeExporter commented 9 years ago
After upgrading libvpx to 1.3.0 in the OpenBSD ports tree, which introduced
VP9 support, I received a report from a user of a stack overflow on
OpenBSD/i386. It appears to be in the SSE2 code. Encoding essentially any
source to VP9 seems to trigger the crash.

I will try to get access to an i386 system to reproduce the issue and, if I
can, build with debug symbols to provide a more complete backtrace.

Steps to reproduce:

ffmpeg -y \
    -i input.mp4 \
    -c:v libvpx-vp9 \
    -c:a libopus \
    -strict experimental \
    output.webm

(gdb) bt
#0  0x079bd3e1 in kill () at <stdin>:2
#1  0x079f94c1 in __stack_smash_handler (func=0x2db944c0 "vp9_sub_pixel_variance16x16_sse2", damaged=1760) at /usr/src/lib/libc/sys/stack_protector.c:61
#2  0x0dc6589e in vp9_sub_pixel_variance16x16_sse2 () from /usr/local/lib/libvpx.so.5.0
#3  0x0dc2c467 in vp9_find_best_sub_pixel_tree () from /usr/local/lib/libvpx.so.5.0
#4  0x0dc489c9 in vp9_rd_pick_inter_mode_sb () from /usr/local/lib/libvpx.so.5.0
#5  0x0dc1cb98 in pick_sb_modes () from /usr/local/lib/libvpx.so.5.0
#6  0x0dc1a14c in rd_pick_partition () from /usr/local/lib/libvpx.so.5.0
#7  0x0dc1a868 in rd_pick_partition () from /usr/local/lib/libvpx.so.5.0
#8  0x0dc1a868 in rd_pick_partition () from /usr/local/lib/libvpx.so.5.0
#9  0x0dc16136 in encode_frame_internal () from /usr/local/lib/libvpx.so.5.0
#10 0x0dc147f0 in vp9_encode_frame () from /usr/local/lib/libvpx.so.5.0
#11 0x0dc40107 in encode_frame_to_data_rate () from /usr/local/lib/libvpx.so.5.0
#12 0x0dc3ee17 in vp9_get_compressed_data () from /usr/local/lib/libvpx.so.5.0
#13 0x0dc06895 in vp9e_encode () from /usr/local/lib/libvpx.so.5.0
#14 0x0db81fa3 in vpx_codec_encode () from /usr/local/lib/libvpx.so.5.0
#15 0x0e307342 in vp8_encode () from /usr/local/lib/libavcodec.so.20.0

Original issue reported on code.google.com by brad.ope...@gmail.com on 10 Jun 2014 at 3:20

GoogleCodeExporter commented 9 years ago
I was able to reproduce the crash on another system:

(gdb) bt
#0  0x0fb843e1 in kill () at <stdin>:2
#1  0x0fbc04c1 in __stack_smash_handler (func=0x2817dfb0 "vp9_sub_pixel_variance16x16_sse2", damaged=1760) at /usr/src/lib/libc/sys/stack_protector.c:61
#2  0x0822a9d1 in vp9_sub_pixel_variance16x16_sse2 (
    src=0x88805cbf '\"' <repeats 91 times>, "!!!   \037\037\037\036\036\036", '\035' <repeats 20 times>, "\034\034\034\033\033\033\032\032\032\031\031\031", '\030' <repeats 65 times>..., src_stride=1760, 
    x_offset=8, y_offset=0, 
    dst=0x8e36ecc0 "\037\035\034\"%&! $!\030\030\030\"+#\031\031\034!\032\027\030\034\036\"\037\034\037\034\031\027\034\023\027\036!\036\033\032\032\035 \"! \037\037  !\"(\"\037\"\035\033\033 \035\034\035 \034\" \032\031\032\027\032\035\036\031\034\"$ \030\036\034\036 ! \037\036!'$\031\035\035\036\036\035\036\034 \"$&\037\034 \037\034\030\031\034\037\030\032\034\034\034\034\034\033$ \034\032\027\023\031\"\036\033\030\027\027\021\024\023\026\031\026\021\025\026\030\024\024\023\023\025\024\032\034\031\024\020\026\035\031\017\016\021\024\033\031\025\025\027\023\025\027\027\030\032\031\031\023\025\026\024\023\025\026\030\031\032\032\027\025\032\032\026\022\016\027\037\030\025\022\023\026\030"..., dst_stride=1760, sse_ptr=0xcfbc4a04) at vp9/encoder/x86/vp9_variance_sse2.c:416
#3  0x081f8b11 in vp9_find_best_sub_pixel_tree (x=0x809de020, bestmv=<optimized 
out>, ref_mv=<optimized out>, allow_hp=0, error_per_bit=<optimized out>, 
vfp=<optimized out>, forced_stop=<optimized out>, 
    iters_per_step=<optimized out>, mvjcost=<optimized out>, mvcost=0xcfbc4a88, distortion=<optimized out>, sse1=<optimized out>) at vp9/encoder/vp9_mcomp.c:422
#4  0x0821c287 in single_motion_search (cpi=<optimized out>, x=<optimized out>, 
tile=<optimized out>, bsize=<optimized out>, mi_row=<optimized out>, 
mi_col=<optimized out>, tmp_mv=<optimized out>, 
    rate_mv=<optimized out>) at vp9/encoder/vp9_rdopt.c:2478
#5  0x08213bba in handle_inter_mode (cpi=<optimized out>, x=<optimized out>, 
tile=<optimized out>, bsize=<optimized out>, txfm_cache=<optimized out>, 
rate2=<optimized out>, distortion=<optimized out>, 
    skippable=<optimized out>, rate_y=<optimized out>, distortion_y=<optimized out>, rate_uv=<optimized out>, distortion_uv=<optimized out>, mode_excluded=<optimized out>, disable_skip=<optimized out>, 
    best_filter=<optimized out>, mode_mv=<optimized out>, mi_row=<optimized out>, mi_col=<optimized out>, single_newmv=<optimized out>, psse=<optimized out>, ref_best_rd=<optimized out>)
    at vp9/encoder/vp9_rdopt.c:2703
#6  0x0821139c in vp9_rd_pick_inter_mode_sb (cpi=<optimized out>, x=0x0, 
tile=<optimized out>, mi_row=<optimized out>, mi_col=<optimized out>, 
returnrate=<optimized out>, returndistortion=0x9, 
    bsize=BLOCK_8X8, ctx=0x9, best_rd_so_far=-3477763796385366704) at vp9/encoder/vp9_rdopt.c:3496
#7  0x081ea96a in pick_sb_modes (cpi=<optimized out>, tile=<optimized out>, 
mi_row=<optimized out>, mi_col=<optimized out>, totalrate=<optimized out>, 
totaldist=<optimized out>, bsize=<optimized out>, 
    ctx=<optimized out>, best_rd=<optimized out>) at vp9/encoder/vp9_encodeframe.c:666
#8  0x081e8bbe in rd_pick_partition (cpi=<optimized out>, tile=<optimized out>, 
tp=<optimized out>, mi_row=<optimized out>, mi_col=<optimized out>, 
bsize=BLOCK_16X16, rate=<optimized out>, 
    dist=<optimized out>, do_recon=<optimized out>, best_rd=-8198984815251842476) at vp9/encoder/vp9_encodeframe.c:1557
#9  0x081e90c8 in rd_pick_partition (cpi=<optimized out>, tile=<optimized out>, 
tp=<optimized out>, mi_row=<optimized out>, mi_col=<optimized out>, 
bsize=<optimized out>, rate=<optimized out>, 
    dist=<optimized out>, do_recon=672635476, best_rd=-9174303791466964396) at vp9/encoder/vp9_encodeframe.c:1611
#10 0x081e90c8 in rd_pick_partition (cpi=<optimized out>, tile=<optimized out>, 
tp=<optimized out>, mi_row=<optimized out>, mi_col=<optimized out>, 
bsize=<optimized out>, rate=<optimized out>, 
    dist=<optimized out>, do_recon=-2135764730, best_rd=-9174303789980397306) at vp9/encoder/vp9_encodeframe.c:1611
#11 0x081e636f in encode_sb_row (cpi=<optimized out>, tile=<optimized out>, 
mi_row=<optimized out>, tp=<optimized out>) at 
vp9/encoder/vp9_encodeframe.c:1855
#12 0x081e54e4 in encode_frame_internal (cpi=<optimized out>) at 
vp9/encoder/vp9_encodeframe.c:2012
#13 0x081e4d21 in vp9_encode_frame (cpi=0x809d6020) at 
vp9/encoder/vp9_encodeframe.c:2284
#14 0x0820a55e in encode_frame_to_data_rate (cpi=0x0, size=0x0, dest=<optimized 
out>, frame_flags=<optimized out>) at vp9/encoder/vp9_onyx_if.c:3195
#15 0x08209e26 in vp9_get_compressed_data (ptr=0x809d6020, 
frame_flags=<optimized out>, size=<optimized out>, dest=<optimized out>, 
time_stamp=<optimized out>, time_end=<optimized out>, 
    flush=<optimized out>) at vp9/encoder/vp9_onyx_if.c:3985
#16 0x081da635 in vp9e_encode (ctx=0x7c14a000, img=<optimized out>, 
pts=-3477755412609202200, duration=<optimized out>, flags=<optimized out>, 
deadline=<optimized out>) at vp9/vp9_cx_iface.c:757
#17 0x0816ace3 in vpx_codec_encode (ctx=0x7a0c9804, img=<optimized out>, 
pts=25, duration=<optimized out>, flags=<optimized out>, deadline=<optimized 
out>) at vpx/src/vpx_encoder.c:226
#18 0x00822bd4 in vp8_encode (avctx=0x83b7ac00, pkt=0xcfbc8be8, 
frame=0x7bcb7c00, got_packet=0xcfbc8c30) at libavcodec/libvpxenc.c:708
#19 0x00985da5 in avcodec_encode_video2 (avctx=0x83b7ac00, avpkt=0xcfbc8be8, 
frame=0xcfbc8c30, got_packet_ptr=0x246cbd2c) at libavcodec/utils.c:1905
#20 0x15302936 in do_video_out (in_picture=<optimized out>, ost=<optimized 
out>, s=0x79bf1800) at ffmpeg.c:997
#21 reap_filters () at ffmpeg.c:1157
#22 0x15305d11 in transcode_step () at ffmpeg.c:3398
#23 0x153084ae in transcode () at ffmpeg.c:3441
#24 0x1530894c in main (argc=11, argv=0xcfbcd9e8) at ffmpeg.c:3621

Original comment by brad.ope...@gmail.com on 11 Jun 2014 at 8:48

GoogleCodeExporter commented 9 years ago
Any chance of having this looked into? It would be nice to have a VP9 encoder 
that does not crash.

Original comment by brad.ope...@gmail.com on 2 Jul 2014 at 8:23

GoogleCodeExporter commented 9 years ago
Were you able to reproduce with vpxenc? Can you share input.mp4? Does it crash 
in gtest:
./test_libvpx --gtest_filter=\*VP9SubpelVariance\*

Original comment by johannkoenig@google.com on 15 Jul 2014 at 6:36

GoogleCodeExporter commented 9 years ago
Yes.

$ vpxenc test.yuv -o test.webm --codec=vp9 --width=256 --height=240
Pass 1/2 frame  136/137    20824B    1224b/f   36748b/s 1062586 us (127.99 fps)
Pass 2/2 frame   26/1       2788B 2963394 us 8.77 fps [ETA  0:06:41] Abort trap (core dumped)

(gdb) bt full
#0  0x0e2623e1 in kill () at <stdin>:2
No locals.
#1  0x0e29e4c1 in __stack_smash_handler (func=0x28870fb0 "vp9_sub_pixel_variance16x16_sse2", damaged=576) at /usr/src/lib/libc/sys/stack_protector.c:61
        sdata = {log_file = -1, connected = 0, opened = 1, log_stat = 0, log_tag = 0x365f5e40 "vpxenc", log_fac = 8, log_mask = 255}
        message = "stack overflow in function %s"
        sa = {__sigaction_u = {__sa_handler = 0, __sa_sigaction = 0}, sa_mask = 0, sa_flags = 0}
        mask = 4294967263
#2  0x0891d9d1 in vp9_sub_pixel_variance16x16_sse2 (
    src=0x77fdc73f "\230\230\231\232\234\235\236\236\236\236\236\236\237?????????????????", '?' <repeats 21 times>, "?????????????????????????????????????\222\206{ma[VROQQRSUVWWUSPKFA><86420/--002479:<652114797?IPV^hp\214\222\234?????????????????????", '?' <repeats 16 times>, "???????????????"..., src_stride=576, x_offset=8, y_offset=0, 
    dst=0x842b98c0 "\200\201\202\204\205\206\207\211~~~~\204\214\227\237??????\237\236???????????????????????????????????????????????????????????????????????????????????????????????????\237\236??????????????\236\232\227\222\217\216\215\213\212\211\210\206\205\217\217\217\217\217\217\217\217\222\217\213\210\207\210\211\211\200\202\205\207\213\215\220\222\223\224\225\227\230\231\232\234??????????????\236\234"..., dst_stride=576, sse_ptr=Variable "sse_ptr" is not available.
) at vp9/encoder/x86/vp9_variance_sse2.c:416
        sse = 93935
        sse = 93935
        sse = 93935
        se = Variable "se" is not available.

The input does not matter. I can reproduce if I use an AVI with XVid, an MP4 
with H.264 or WebM with VP8 or anything else.

Also when trying an H.264 file as input to FFmpeg I noticed a crash in 
vp9_sub_pixel_variance4x4_sse().

(gdb) bt full
#0  0x08a333e1 in kill () at <stdin>:2
No locals.
#1  0x08a6f4c1 in __stack_smash_handler (func=0x2c532080 "vp9_sub_pixel_variance4x4_sse", damaged=800) at /usr/src/lib/libc/sys/stack_protector.c:61
        sdata = {log_file = -1, connected = 0, opened = 1, log_stat = 0, log_tag = 0x36411620 "ffmpeg", log_fac = 8, log_mask = 255}
        message = "stack overflow in function %s"
        sa = {__sigaction_u = {__sa_handler = 0, __sa_sigaction = 0}, sa_mask = 0, sa_flags = 0}
        mask = 4294967263
#2  0x0c5ded91 in vp9_sub_pixel_variance4x4_sse (src=0x7e2fa55f '\020' <repeats 200 times>..., src_stride=800, x_offset=8, y_offset=0,
    dst=0x772b1560 '\024' <repeats 14 times>, "\022\022", '\023' <repeats 28 times>, '\024' <repeats 13 times>, "\023\022\021", '\020' <repeats 36 times>, "\022\022\022\022\022\022\022\022\023\023\023\023", '\024' <repeats 44 times>, '\023' <repeats 20 times>, '\024' <repeats 28 times>..., dst_stride=800, sse_ptr=Variable "sse_ptr" is not available.
) at vp9/encoder/x86/vp9_variance_sse2.c:416
        sse = 256
        sse = 256
        sse = 256
        se = Variable "se" is not available.

I'll get back to you regarding the use of test_libvpx.

Original comment by brad.ope...@gmail.com on 18 Jul 2014 at 1:02

GoogleCodeExporter commented 9 years ago
$ ./test_libvpx --gtest_filter=\*VP9SubpelVariance\*
Note: Google Test filter = *VP9SubpelVariance*:-SSSE3/*:-SSE4_1/*:-AVX/*:-AVX2/*
[==========] Running 0 tests from 0 test cases.
[==========] 0 tests from 0 test cases ran. (1 ms total)
[  PASSED  ] 0 tests.

Original comment by brad.ope...@gmail.com on 18 Jul 2014 at 2:58

GoogleCodeExporter commented 9 years ago
Does this occur in git? I've tried with a couple files but have been unable to 
repro. Is this with a specific test.yuv or all input? Compiler version?

Original comment by johannkoenig@google.com on 18 Jul 2014 at 9:00

GoogleCodeExporter commented 9 years ago
I haven't been able to reproduce this under emulation, but this seems to be 
related:
https://gerrit.chromium.org/gerrit/#/c/70940/

Try this patch to see if it helps, if you get a chance.

Original comment by jz...@google.com on 25 Jul 2014 at 2:38

GoogleCodeExporter commented 9 years ago
[deleted comment]
GoogleCodeExporter commented 9 years ago
I can't build git due to an unresolved issue with Clang + our linker with C++ 
code.

AFAIK this is with all input.

The -mstackrealign patch did not help at all.

Original comment by brad.ope...@gmail.com on 26 Jul 2014 at 10:08

GoogleCodeExporter commented 9 years ago
Can you try 'configure ... --extra-cflags=-mstackrealign'? It seems this flag
should be used more broadly. I have my doubts though, as the default alignment
should be 16 bytes in most cases.
What version of clang are you using? Tip of tree, or is there a version in ports?

Original comment by jz...@chromium.org on 27 Jul 2014 at 3:27
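
For context on the alignment question above: SSE2 aligned loads and stores (for example `movdqa`) fault on addresses that are not 16-byte aligned, and `-mstackrealign` asks the compiler to realign the stack on function entry if necessary instead of trusting the caller. What follows is only a minimal C11 sketch for checking whether a buffer is 16-byte aligned. `is_aligned` and `demo_aligned_local` are hypothetical helpers written for this illustration, not libvpx functions.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns nonzero when p is a multiple of `alignment` bytes.
 * Hypothetical helper for illustration only. */
static int is_aligned(const void *p, size_t alignment) {
  return ((uintptr_t)p % alignment) == 0;
}

/* A local buffer declared with C11 _Alignas(16) is 16-byte aligned
 * whatever alignment the caller's stack had; one byte further in,
 * the address can no longer be 16-byte aligned. */
static int demo_aligned_local(void) {
  _Alignas(16) unsigned char block[16 * 16];
  return is_aligned(block, 16) && !is_aligned(block + 1, 16);
}
```

A misaligned access would normally show up as a fault inside the SIMD routine rather than as a stack-protector abort, which may be one reason the flag made no difference here.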

GoogleCodeExporter commented 9 years ago
I will try that.

I'm using the LLVM we have in our ports tree (ports/devel/llvm) which is a 3.5 
snapshot.

$ pkg_info llvm
Information for inst:llvm-3.5.20140228p8

Original comment by brad.ope...@gmail.com on 27 Jul 2014 at 3:31

GoogleCodeExporter commented 9 years ago
Using '--extra-cflags=-mstackrealign' did not help either.

Original comment by brad.ope...@gmail.com on 27 Jul 2014 at 11:33

GoogleCodeExporter commented 9 years ago
I checked out the latest code and was able to build the code on amd64. I 
*think* the issue was having the unit test framework enabled; but I don't need 
that to do the testing I was doing. When I get access to the i386 system I was 
testing on I'll try building the code and see if I can reproduce the crashing 
with the latest code.

Original comment by brad.ope...@gmail.com on 15 Aug 2014 at 4:23

GoogleCodeExporter commented 9 years ago
Still broken with latest git code.

Original comment by brad.ope...@gmail.com on 14 Sep 2014 at 12:22

GoogleCodeExporter commented 9 years ago

Original comment by renganat...@google.com on 9 Oct 2014 at 10:19

GoogleCodeExporter commented 9 years ago
Punt it from the next release if it's still not possible to repro.

Original comment by johannkoenig@google.com on 6 Nov 2014 at 1:00

GoogleCodeExporter commented 9 years ago
Using the latest code as of yesterday this issue appears to be resolved. Is 
there going to be a release anytime soon-ish so I have something to use for our 
port/package that is not broken?

Original comment by brad.ope...@gmail.com on 6 Dec 2014 at 3:33

GoogleCodeExporter commented 9 years ago
It's good that it's working, but I'm not sure if there was a fix put in 
necessarily. From your earlier comment (#14 / 9/13) you may have already picked 
up the last change (asm -> intrinsics) which was merged on 9/7 
(v1.3.0-4199-gbb4f2eb).

Johann was working on the Hotlist (IndianRunnerDuck) for 1.4.0, though that may 
slip past the end of the year.

Original comment by jz...@google.com on 6 Dec 2014 at 5:06

GoogleCodeExporter commented 9 years ago
Whether a fix was put in specifically intended for this bug or not really isn't 
the issue. It's just that something went in that ultimately results in the 
issue going away. Back in 9/13 the code definitely didn't work as is from the 
repo. I'd hope there is a new release before the end of January.

Original comment by brad.ope...@gmail.com on 7 Dec 2014 at 6:53

GoogleCodeExporter commented 9 years ago
If you happen to be bored and would be willing to git bisect to get an idea of 
what fixed this, I would really appreciate it. Is it possible that an updated 
toolchain fixed the issue? Happy to hear it's not happening anymore.

Original comment by johannkoenig@google.com on 8 Dec 2014 at 10:48

GoogleCodeExporter commented 9 years ago
No, nothing has changed with the toolchain. If anything has fixed this it is 
within the libvpx codebase. The 1.3.0 code built with the same compiler is just 
as broken now as it was when I filed the bug report and building the latest 
code with the same compiler is working fine.

Original comment by brad.ope...@gmail.com on 9 Dec 2014 at 6:42

GoogleCodeExporter commented 9 years ago
On my machine, the problem still occurs. However, instead of the Abort trap
seen with the 1.3.0 source code, with libvpx compiled from commit ccffe318ffc
ffmpeg crashes with Illegal instruction.

Original comment by mikolaj....@gmail.com on 10 Jan 2015 at 3:41

GoogleCodeExporter commented 9 years ago
Also, the behavior described above changes in libvpx (SIGABRT vs SIGILL) with
commit 9f268611472.

Original comment by mikolaj....@gmail.com on 10 Jan 2015 at 11:31

GoogleCodeExporter commented 9 years ago
OK, sorry. The SIGILL occurs only on my i386 KVM-based virtual machine. When I
copy ffmpeg and libvpx to an Atom-based i386 machine I still get SIGABRT.

This is with libvpx from commit ccffe318ffc90ae584c33e254b3a35e9142ecc20

(gdb) bt
#0  0x07224481 in kill () at <stdin>:2
#1  0x07233b2c in __stack_smash_handler (func=0x2c3b5470 "vp9_sub_pixel_variance16x16_ssse3", damaged=1760)
    at /usr/src/lib/libc/sys/stack_protector.c:61
#2  0x0c4ac13e in vp9_sub_pixel_variance16x16_ssse3 () from /usr/local/lib/libvpx.so.5.0
#3  0x0c46e930 in vp9_find_best_sub_pixel_tree () from /usr/local/lib/libvpx.so.5.0
#4  0x0c487486 in vp9_rd_pick_inter_mode_sb () from /usr/local/lib/libvpx.so.5.0
#5  0x0c44640c in rd_pick_sb_modes () from /usr/local/lib/libvpx.so.5.0
#6  0x0c44467a in rd_pick_partition () from /usr/local/lib/libvpx.so.5.0
#7  0x0c444c7a in rd_pick_partition () from /usr/local/lib/libvpx.so.5.0
#8  0x0c444c7a in rd_pick_partition () from /usr/local/lib/libvpx.so.5.0
#9  0x0c43e797 in vp9_encode_tile () from /usr/local/lib/libvpx.so.5.0
#10 0x0c44098e in encode_frame_internal () from /usr/local/lib/libvpx.so.5.0
#11 0x0c43fade in vp9_encode_frame () from /usr/local/lib/libvpx.so.5.0
#12 0x0c47aa81 in encode_frame_to_data_rate () from /usr/local/lib/libvpx.so.5.0
#13 0x0c478ba9 in vp9_get_compressed_data () from /usr/local/lib/libvpx.so.5.0
#14 0x0c42f045 in encoder_encode () from /usr/local/lib/libvpx.so.5.0
#15 0x0c39df16 in vpx_codec_encode () from /usr/local/lib/libvpx.so.5.0
#16 0x030ab002 in vp8_encode () from /usr/local/lib/libavcodec.so.21.1

Original comment by mikolaj....@gmail.com on 10 Jan 2015 at 6:08

GoogleCodeExporter commented 9 years ago
SIGILL:

(gdb) bt
#0  0x07f2feea in vp9_get32x32var_avx2 (
    src_ptr=0x872c5cc0 "\026\033#&$\037!%\033\030\031\036**\036\030\034  \036\034\031\026\030\037\"\"\037\037!\037\037##\035\030\032\030\024\037)\034\026\037\021\016\016\025 \037#$#\031\024\027\035\035\034\035 $))!\033\027\030\033\034\033 !\030\027\031\033%\036\030$%\034\031\032\034\035 &#\035\037!\037\031\035\"\034\031\034\037\"$(( \" \032\027\031\032\036\037\036\031\032\035\036\034\032\026\016\021\030!#\034\026\026\025\030\027\026\025\021\024\032 \037\026\021\022\026\026\032\034\026\026\024\024\032\031\032\034\035\036\026\021\021\022\021\024\033\031\024\023\025\017\021\030\027\031\030\030\027\025\027\035!\027\022\026\026\024\026\033\031\030\027\026\026\032\030\024\025\032\022\017\025"..., source_stride=1760, ref_ptr=0x27e38a9c '\200' <repeats 64 times>, recon_stride=0, SSE=0xcfbe7390)
    at vp9/encoder/x86/vp9_variance_impl_intrin_avx2.c:129
#1  0x07f5f47e in vp9_variance64x64_avx2 (src=Variable "src" is not available.
) at vp9/encoder/x86/vp9_variance_avx2.c:57
#2  0x07ece00c in rd_pick_sb_modes (cpi=0xcfbe7390, tile_data=Variable 
"tile_data" is not available.
) at vp9/encoder/vp9_encodeframe.c:112
#3  0x07ecc6ba in rd_pick_partition (cpi=Variable "cpi" is not available.
) at vp9/encoder/vp9_encodeframe.c:2285
#4  0x07ec676b in vp9_encode_tile (cpi=0x27e310e0, td=0x76665020, 
tile_row=Variable "tile_row" is not available.
) at vp9/encoder/vp9_encodeframe.c:2625
#5  0x07ec89be in encode_frame_internal (cpi=Variable "cpi" is not available.
) at vp9/encoder/vp9_encodeframe.c:3580
#6  0x07ec7b0e in vp9_encode_frame (cpi=0x76659020) at 
vp9/encoder/vp9_encodeframe.c:3790
#7  0x07f02ab1 in encode_frame_to_data_rate (cpi=0xcfbe7390, size=0x0) at 
vp9/encoder/vp9_encoder.c:2778
#8  0x07f00bd9 in vp9_get_compressed_data (cpi=0x76659020, size=0xcfbe7c54, 
time_stamp=Variable "time_stamp" is not available.
) at vp9/encoder/vp9_encoder.c:3358
#9  0x07eb7015 in encoder_encode (ctx=0x7ce13008, img=Variable "img" is not 
available.
) at vp9/vp9_cx_iface.c:981
#10 0x07e25f26 in vpx_codec_encode (ctx=0x7fea9204, img=0x7fea9220, pts=24, 
flags=-809602160, deadline=Variable "deadline" is not available.
) at vpx/src/vpx_encoder.c:223
#11 0x0fcc1002 in vp8_encode (avctx=0x77a40400, pkt=0xcfbe80e0, 
frame=0x7ba48c00, got_packet=0xcfbe8128) at libavcodec/libvpxenc.c:724
#12 0x0fe6447f in avcodec_encode_video2 (avctx=0x77a40400, avpkt=0xcfbe80e0, 
frame=0xcfbe8128, got_packet_ptr=0x23941b6c)
    at libavcodec/utils.c:2034
#13 0x149d190b in reap_filters () at ffmpeg.c:1058
#14 0x149d2c7e in transcode_step () at ffmpeg.c:3637
#15 0x149d76cb in transcode () at ffmpeg.c:3680
#16 0x149d7d80 in main (argc=345728969, argv=0xcfbecf94) at ffmpeg.c:3856

Original comment by mikolaj....@gmail.com on 10 Jan 2015 at 6:26

GoogleCodeExporter commented 9 years ago
> SIGILL:
> 
> (gdb) bt
> #0  0x07f2feea in vp9_get32x32var_avx2 (

Where was this? If it isn't recent, I'd guess the cpu-detection is broken.

Original comment by jz...@google.com on 13 Jan 2015 at 12:50

GoogleCodeExporter commented 9 years ago
> Ok, sorry. SIGILL is only on my i386 KVM-based virtual machine. When I copy
> ffmpeg and libvpx to Atom based i386 machine I still get SIGABRT.
> 
> This is with libvpx from commit ccffe318ffc90ae584c33e254b3a35e9142ecc20
> 

Thanks. So we're still talking about OpenBSD here, right? Can you give the
model number of the processor for reference?
I'm wondering if any 32-bit install (outside of a vm) can reproduce this
regardless of the processor.

> (gdb) bt
> #0  0x07224481 in kill () at <stdin>:2

Original comment by jz...@google.com on 13 Jan 2015 at 12:55

GoogleCodeExporter commented 9 years ago
Whether he is using a VM or not is irrelevant. The kernel doesn't support AVX 
so libvpx shouldn't be running AVX code. Either way it looks like a bug in the 
CPU / AVX detection code.

Original comment by brad.ope...@gmail.com on 14 Jan 2015 at 12:46

GoogleCodeExporter commented 9 years ago
SIGABRT is from OpenBSD/i386 running on:

cpu0: Intel(R) Atom(TM) CPU N270 @ 1.60GHz ("GenuineIntel" 686-class) 1.61 GHz
cpu0: FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,NXE,SSE3,DTES64,MWAIT,DS-CPL,EST,TM2,SSSE3,xTPR,PDCM,MOVBE,LAHF,PERF
cpu0: apic clock running at 133MHz
cpu0: mwait min=64, max=64, C-substates=0.2.2.0.2, IBE
cpu0: Enhanced SpeedStep 1600 MHz: speeds: 1600, 1333, 1067, 800 MHz

SIGILL is from OpenBSD/i386 running on:

cpu0: QEMU Virtual CPU version 1.5.3, 2600.65 MHz
cpu0: FPU,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SSE3,CX16,HV,NXE,LONG,LAHF,ABM,SSE4A
cpu0: 64KB 64b/line 2-way I-cache, 64KB 64b/line 2-way D-cache, 512KB 64b/line 16-way L2 cache
cpu0: ITLB 255 4KB entries direct-mapped, 255 4MB entries direct-mapped
cpu0: DTLB 255 4KB entries direct-mapped, 255 4MB entries direct-mapped
cpu0: apic clock running at 1000MHz

Original comment by mikolaj....@gmail.com on 14 Jan 2015 at 12:57

GoogleCodeExporter commented 9 years ago
> Whether he is using a VM or not is irrelevant. The kernel doesn't support AVX
> so libvpx shouldn't be running AVX code. Either way it looks like a bug in
> the CPU / AVX detection code.

Correct. It seems like there were 2 issues here. I was referring to the SIGABRT 
on i386, unless I misread.

Original comment by jz...@google.com on 14 Jan 2015 at 1:00

GoogleCodeExporter commented 9 years ago
And for the SIGABRT I thought it might have been related to the earlier sse2 
problems, but again I might have been scanning through too quickly.

Original comment by jz...@google.com on 14 Jan 2015 at 1:01

GoogleCodeExporter commented 9 years ago
Sorry, I copied the output from the wrong VM.

SIGILL is from OpenBSD/i386 running on:

cpu0: QEMU Virtual CPU version 1.5.3 ("AuthenticAMD" 686-class, 512KB L2 cache) 2.61 GHz
cpu0: FPU,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,NXE,LONG,SSE3,CX16,LAHF,ABM,SSE4A,PERF

Original comment by mikolaj....@gmail.com on 14 Jan 2015 at 1:03

GoogleCodeExporter commented 9 years ago
That is even worse since that CPU does not even support AVX.

Yes, I had the feeling from looking at what was going on that there was 
potentially more than one bug in play here.

Original comment by brad.ope...@gmail.com on 14 Jan 2015 at 1:32

GoogleCodeExporter commented 9 years ago
With the SIGILL via AVX, how was the library configured? If you cross-compiled 
and disabled runtime cpu-detect this could occur.

Original comment by jz...@google.com on 14 Jan 2015 at 2:28
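
To illustrate the point about runtime CPU detection: a library that ships several builds of one routine typically chooses among them at startup from a capability bitmask. If that mask wrongly reports an extension, or the choice is fixed at build time, the selected routine can execute instructions the machine rejects, which surfaces as SIGILL. Below is a minimal sketch of the pattern in C. `CAP_SSE2`, `CAP_AVX2`, `sse_c`, `sse_simd_standin` and `select_sse` are made-up names for this example and are not libvpx's actual RTCD symbols.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capability bits for this sketch (not libvpx's values). */
#define CAP_SSE2 (1 << 0)
#define CAP_AVX2 (1 << 1)

typedef unsigned int (*sse_fn)(const uint8_t *a, const uint8_t *b, size_t n);

/* Portable reference: sum of squared differences over n bytes. */
static unsigned int sse_c(const uint8_t *a, const uint8_t *b, size_t n) {
  unsigned int sse = 0;
  for (size_t i = 0; i < n; ++i) {
    const int d = (int)a[i] - (int)b[i];
    sse += (unsigned int)(d * d);
  }
  return sse;
}

/* Stand-in for a SIMD build of the same routine. A real one would contain
 * AVX2 instructions and would trap where they are not enabled. */
static unsigned int sse_simd_standin(const uint8_t *a, const uint8_t *b,
                                     size_t n) {
  return sse_c(a, b, n);
}

/* Run-time dispatch: choose an implementation from the detected caps. */
static sse_fn select_sse(int caps) {
  return (caps & CAP_AVX2) ? sse_simd_standin : sse_c;
}
```

The dispatch is only as reliable as the capability mask passed to it, which is why the detection code matters more than the selection logic.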

GoogleCodeExporter commented 9 years ago
libvpx was not cross-compiled. It was built from the ports tree via:

http://cvsweb.openbsd.org/cgi-bin/cvsweb/ports/multimedia/libvpx/Makefile?rev=1.17

If you need any specific logs I can provide them.

Original comment by mikolaj....@gmail.com on 14 Jan 2015 at 2:51

GoogleCodeExporter commented 9 years ago
Re: the AVX runtime detection issue.

This may be happening because libvpx does not check that the OS has enabled
support for a given ISA. For most OSes this just isn't needed as long as you
check the CPU feature flags. However, this seems to be a special case.

I can't easily reproduce your experimental setup for the AVX exception, so
please try the patch below. If it resolves the issue I'll submit a CL.

--

diff --git a/vpx_ports/x86.h b/vpx_ports/x86.h
index 81c2b8b..4e1b9e2 100644
--- a/vpx_ports/x86.h
+++ b/vpx_ports/x86.h
@@ -116,6 +116,18 @@
 #define BIT(n) (1<<n)
 #endif

+static int check_xcr0_ymm()
+  {
+  int xcr0;
+  #if defined(_MSC_VER)
+      xcr0 = (int)_xgetbv(0); /* min VS2010 SP1 compiler is required */
+  #else
+      __asm__ ("xgetbv" : "=a" (xcr0) : "c" (0) : "%edx" );
+  #endif
+
+  return ((xcr0 & 6) == 6); /* checking if xmm and ymm state are enabled in XCR0 */
+  }
+
 static INLINE int
 x86_simd_caps(void) {
   unsigned int flags = 0;
@@ -156,14 +168,14 @@

   if (reg_ecx & BIT(19)) flags |= HAS_SSE4_1;

-  if (reg_ecx & BIT(28)) flags |= HAS_AVX;
+  if (reg_ecx & BIT(28) & check_xcr0_ymm()) flags |= HAS_AVX;

   /* Get the leaf 7 feature flags. Needed to check for AVX2 support */
   reg_eax = 7;
   reg_ecx = 0;
   cpuid(7, 0, reg_eax, reg_ebx, reg_ecx, reg_edx);

-  if (reg_ebx & BIT(5)) flags |= HAS_AVX2;
+  if (reg_ebx & BIT(5) & check_xcr0_ymm()) flags |= HAS_AVX2;

   return flags & mask;
 }

Original comment by erik.a.n...@gmail.com on 14 Jan 2015 at 11:05
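
As a hedged illustration of the OS-support check discussed in the comment above (not the change that was eventually committed to libvpx): AVX is only safe to use when CPUID leaf 1 reports both AVX (ECX bit 28) and OSXSAVE (ECX bit 27), and XCR0 bits 1 and 2 show that the OS saves XMM and YMM state. The conditions have to be combined logically. In the patch as posted, `reg_ecx & BIT(28) & check_xcr0_ymm()` bitwise-ANDs a bit-28 mask with a 0-or-1 result, which is always 0. The helper names below (`avx_usable`, `read_xcr0`, `host_has_usable_avx`) are assumptions for this sketch, and the inline assembly assumes GCC or Clang on x86.

```c
#include <stdint.h>

/* CPUID leaf 1, ECX feature bits. */
#define CPUID1_ECX_OSXSAVE (1u << 27) /* OS has enabled XSAVE and XGETBV */
#define CPUID1_ECX_AVX     (1u << 28) /* CPU implements AVX */

/* XCR0 bits 1 and 2: the OS saves and restores XMM and YMM state. */
#define XCR0_XMM_YMM 0x6u

/* Pure decision logic, kept separate so it can be checked with
 * synthetic register values. Hypothetical helper for this sketch. */
static int avx_usable(uint32_t cpuid1_ecx, uint64_t xcr0) {
  if (!(cpuid1_ecx & CPUID1_ECX_AVX)) return 0;
  if (!(cpuid1_ecx & CPUID1_ECX_OSXSAVE)) return 0;
  return (xcr0 & XCR0_XMM_YMM) == XCR0_XMM_YMM;
}

#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__i386__) || defined(__x86_64__))
#include <cpuid.h>

/* Reads XCR0. Must only be called once OSXSAVE is known to be set,
 * otherwise XGETBV itself is an illegal instruction. */
static uint64_t read_xcr0(void) {
  uint32_t eax, edx;
  __asm__ volatile("xgetbv" : "=a"(eax), "=d"(edx) : "c"(0));
  return ((uint64_t)edx << 32) | eax;
}

static int host_has_usable_avx(void) {
  unsigned int eax, ebx, ecx, edx;
  if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx)) return 0;
  if (!(ecx & CPUID1_ECX_OSXSAVE)) return 0; /* do not touch XGETBV */
  return avx_usable(ecx, read_xcr0());
}
#endif
```

Checking OSXSAVE before executing XGETBV matters on hosts like the QEMU guest in this thread, whose feature list shows neither XSAVE nor AVX.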

GoogleCodeExporter commented 9 years ago
I wondered if that was it, the related bug [1]. Your ifdefs could use some 
tightening, I think libwebp has this right.

[1] https://code.google.com/p/webm/issues/detail?id=790

Original comment by jz...@google.com on 14 Jan 2015 at 11:29

GoogleCodeExporter commented 9 years ago
As for the stack smashing, try configuring with --disable-use-x86inc for
32-bit + PIC targets. OS X already has something similar, but no one has looked
into the cause.

Original comment by jz...@google.com on 14 Jan 2015 at 11:32

GoogleCodeExporter commented 9 years ago
> This may be happening because libvpx does not check that the OS has enabled
> support for a given ISA.

That's a pretty poor assumption to make. Other major projects I've looked at,
such as FFmpeg, x264 and x265, do check. This should be common sense.

> For most OSes this just isn't needed as long as you check the CPU feature
> flags.

That is not true. It assumes that all kernels have AVX support, which is
definitely not the case.

Original comment by brad.ope...@gmail.com on 15 Jan 2015 at 12:55

GoogleCodeExporter commented 9 years ago
> [1] https://code.google.com/p/webm/issues/detail?id=790

That seems to be the issue with the SIGILL.

Original comment by brad.ope...@gmail.com on 15 Jan 2015 at 12:57

GoogleCodeExporter commented 9 years ago
With the patch applied, OpenBSD/i386 on the KVM virtual machine no longer fails
with SIGILL, but it now fails with SIGABRT:

(gdb) bt
#0  0x01393601 in kill () at <stdin>:2
#1  0x013a2cac in __stack_smash_handler (func=0x26b4c2a0 "vp9_sub_pixel_variance16x16_sse2", damaged=1760)
    at /usr/src/lib/libc/sys/stack_protector.c:61
#2  0x06c4297e in vp9_sub_pixel_variance16x16_sse2 (
    src=0x91e37cbf '!' <repeats 62 times>, "       ", '\037' <repeats 23 times>, "\036\036\036\036\036\035\035\035\035\035", '\034' <repeats 21 times>, "\033\033\032\032\031\031\030\030\027\027\026\026", '\025' <repeats 34 times>, "\026\026\026\026\026\026\026\026\027\027\027\027\027\027", '\030' <repeats 17 times>..., src_stride=1760, x_offset=8, y_offset=0,
    dst=0x8c608cc0 "\036\"\"\036!\035\032\035\"\036\032\034!\037\036\033\035\037\036 \033\027\033\037\037\034\"!\037\034\034\036\033\036&\"\037\037\036\032\022\033&#\035\035\033\031\031\032\034\036\036\036\036\036\036\037\036\035\037\037\033\034\033! \033\026\022\022\026\033  \037 \" \033\032\035!\"\035\035\037\035!\" !! \037\036\035\036\037\036\034\034\035\036\037\037\032\034!!\037\036\033\034\"'##\" \035\033\032\036\031\032\031\031\030\031\031\032\033\034\033\033\034\036\032\025\r\024\035\035\027\026\032\024\035\036\033\030\026\026\024\023\035\034\032\031\030\031\031\025\025\025\026\026\027\030\027\024\022\023\r\v\027\026\025\024\023\025\030\035\030\025\022\024\024\024\024\025\025\025\026\030\031\031\031\032"..., dst_stride=1760, sse_ptr=Variable "sse_ptr" is not available.
)
    at vp9/encoder/x86/vp9_variance_sse2.c:388
#3  0x06c05b80 in vp9_find_best_sub_pixel_tree (x=0x83412020, 
bestmv=0x3f4ff69c, ref_mv=0x9643f4f, allow_hp=0, forced_stop=Variable 
"forced_stop" is not available.
)
    at vp9/encoder/vp9_mcomp.c:688
#4  0x06c1e6b6 in vp9_rd_pick_inter_mode_sb (cpi=Variable "cpi" is not 
available.
) at vp9/encoder/vp9_rdopt.c:2145
#5  0x06bdd63c in rd_pick_sb_modes (cpi=0x0, tile_data=0x0, x=0x83434ff0, 
rd_cost=0xcfbf2e18, bsize=BLOCK_4X4, ctx=0x7d4863b8, best_rd=46017059)
    at vp9/encoder/vp9_encodeframe.c:976
#6  0x06bdb8aa in rd_pick_partition (cpi=Variable "cpi" is not available.
) at vp9/encoder/vp9_encodeframe.c:2285
#7  0x06bdbeaa in rd_pick_partition (cpi=Variable "cpi" is not available.
) at vp9/encoder/vp9_encodeframe.c:2401
#8  0x06bdbeaa in rd_pick_partition (cpi=Variable "cpi" is not available.
) at vp9/encoder/vp9_encodeframe.c:2401
#9  0x06bd596b in vp9_encode_tile (cpi=0x26b400f2, td=0x83412020, 
tile_row=Variable "tile_row" is not available.
) at vp9/encoder/vp9_encodeframe.c:2625
#10 0x06bd7bae in encode_frame_internal (cpi=Variable "cpi" is not available.
) at vp9/encoder/vp9_encodeframe.c:3580
#11 0x06bd6d10 in vp9_encode_frame (cpi=0x83406020) at 
vp9/encoder/vp9_encodeframe.c:3790
#12 0x06c11cc1 in encode_frame_to_data_rate (cpi=0x0, size=0x0) at 
vp9/encoder/vp9_encoder.c:2779
#13 0x06c0fde9 in vp9_get_compressed_data (cpi=0x83406020, size=0xcfbf3874, 
time_stamp=Variable "time_stamp" is not available.
) at vp9/encoder/vp9_encoder.c:3359
#14 0x06bc60f5 in encoder_encode (ctx=0x80caa008, img=Variable "img" is not 
available.
) at vp9/vp9_cx_iface.c:984
#15 0x06b34f36 in vpx_codec_encode (ctx=0x787db204, img=0x787db220, pts=24, 
flags=0, deadline=Variable "deadline" is not available.
) at vpx/src/vpx_encoder.c:223
#16 0x065c5002 in vp8_encode (avctx=0x874ad400, pkt=0xcfbf3d00, 
frame=0x7ae3da00, got_packet=0xcfbf3d48) at libavcodec/libvpxenc.c:724
#17 0x0676847f in avcodec_encode_video2 (avctx=0x874ad400, avpkt=0xcfbf3d00, 
frame=0xcfbf3d48, got_packet_ptr=0x2e65cb6c) at libavcodec/utils.c:2034
#18 0x178b790b in reap_filters () at ffmpeg.c:1058
#19 0x178b8c7e in transcode_step () at ffmpeg.c:3637
#20 0x178bd6cb in transcode () at ffmpeg.c:3680
#21 0x178bdd80 in main (argc=394905545, argv=0xcfbf8bb4) at ffmpeg.c:3856

Original comment by mikolaj....@gmail.com on 15 Jan 2015 at 3:23

GoogleCodeExporter commented 9 years ago
> With patch applied OpenBSD/i386 on KVM virtual machine no longer fails with 
SIGILL, but now it fails with SIGABRT:

That is good. At least the one bug is fixed with the AVX detection as a start.

Original comment by brad.ope...@gmail.com on 15 Jan 2015 at 3:37

GoogleCodeExporter commented 9 years ago
>> With patch applied OpenBSD/i386 on KVM virtual machine no longer fails with
>> SIGILL, but now it fails with SIGABRT:

> That is good. At least the one bug is fixed with the AVX detection as a start.

Good to hear, I'll put up a more complete patch for that soon.

For the SIGABRT, try --disable-use-x86inc. It is a bit heavy-handed, but it 
will disable the vp9_sub_pixel_* functions, where I think the problem lies.

Original comment by jz...@google.com on 15 Jan 2015 at 4:53

GoogleCodeExporter commented 9 years ago
Adding --disable-use-x86inc to ./configure in libvpx commit 6fc7267, together 
with the patch mentioned in comment 36, makes ffmpeg encode with the vp9 codec 
on OpenBSD/i386. I've tested on a virtual machine and on a physical box with an 
Atom CPU, and both are fine.

Original comment by mikolaj....@gmail.com on 15 Jan 2015 at 5:02
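
The workaround above can be scripted in a port or build wrapper. The following is a minimal sketch, not part of libvpx: only the --disable-use-x86inc flag comes from this thread, while the helper name `x86inc_flags` and the uname-based matching are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: pick extra libvpx configure flags for a given host.
# Only --disable-use-x86inc is taken from this thread; the function name and
# the matching on OS/architecture strings are illustrative assumptions.
x86inc_flags() {
    os="$1"     # e.g. the output of `uname -s`
    arch="$2"   # e.g. the output of `uname -m`
    case "$os:$arch" in
        # 32-bit OpenBSD is the only target reported as affected here.
        OpenBSD:i386) echo "--disable-use-x86inc" ;;
        # Everything else keeps the x86inc-based assembly enabled.
        *)            echo "" ;;
    esac
}

# Usage, from a libvpx source checkout:
#   ./configure $(x86inc_flags "$(uname -s)" "$(uname -m)")
```

Keeping the decision in one function makes it easy to drop the special case once the assembly itself is fixed.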

GoogleCodeExporter commented 9 years ago
Thanks for testing and being patient through all this!
I posted a fix for the detection [1] and a workaround for this issue [2]. I'll 
file a separate bug for the problematic assembly as this one has gotten a bit 
overloaded.
I was able to reproduce both under qemu (earlier I was using vmware with no 
luck); thanks for the link to the makefile.

[1] https://gerrit.chromium.org/gerrit/#/c/73502/
[2] https://gerrit.chromium.org/gerrit/#/c/73505/

Original comment by jz...@google.com on 15 Jan 2015 at 7:30
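
The detection patch itself is not reproduced in this part of the thread. For context, a SIGILL under a virtual machine is commonly caused by trusting the CPUID AVX bit alone, without checking that the OS has enabled saving the AVX register state. The sketch below shows that general check under stated assumptions; the function names are hypothetical and this is not the libvpx code. It relies on GCC/Clang's `<cpuid.h>` and inline assembly on x86 hosts.

```c
#include <assert.h>
#include <stdint.h>
#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>   /* GCC/Clang helper for the CPUID instruction */
#endif

#define CPUID1_ECX_OSXSAVE (1u << 27)  /* OS has enabled XSAVE/XGETBV */
#define CPUID1_ECX_AVX     (1u << 28)  /* CPU implements AVX */
#define XCR0_SSE_AND_AVX   0x6u        /* XCR0 bits 1 (SSE) and 2 (AVX) */

/* Pure decision logic: AVX is usable only if the CPU reports it, the OS
 * exposes XGETBV, and XCR0 shows both SSE and AVX state being saved. */
int avx_usable(uint32_t cpuid1_ecx, uint64_t xcr0) {
  if (!(cpuid1_ecx & CPUID1_ECX_AVX)) return 0;
  if (!(cpuid1_ecx & CPUID1_ECX_OSXSAVE)) return 0;
  return (xcr0 & XCR0_SSE_AND_AVX) == XCR0_SSE_AND_AVX;
}

/* Hypothetical wrapper that reads the real registers on x86 hosts.
 * On other architectures it simply reports "no AVX". */
int host_has_usable_avx(void) {
#if defined(__i386__) || defined(__x86_64__)
  unsigned int eax, ebx, ecx, edx;
  uint64_t xcr0 = 0;
  if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx)) return 0;
  /* XGETBV faults unless OSXSAVE is set, so only execute it when allowed. */
  if (ecx & CPUID1_ECX_OSXSAVE) {
    uint32_t lo, hi;
    __asm__ volatile("xgetbv" : "=a"(lo), "=d"(hi) : "c"(0));
    xcr0 = ((uint64_t)hi << 32) | lo;
  }
  return avx_usable(ecx, xcr0);
#else
  return 0;
#endif
}

/* Documents the expected outcomes for a few register combinations. */
void avx_usable_selfcheck(void) {
  assert(avx_usable(CPUID1_ECX_AVX | CPUID1_ECX_OSXSAVE, 0x6) == 1);
  assert(avx_usable(CPUID1_ECX_AVX, 0x6) == 0);                      /* no OSXSAVE */
  assert(avx_usable(CPUID1_ECX_AVX | CPUID1_ECX_OSXSAVE, 0x2) == 0); /* AVX state off */
}
```

Splitting the bit logic from the register reads keeps the decision testable on any machine, including ones without AVX.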

GoogleCodeExporter commented 9 years ago
What does --disable-use-x86inc do? I get the feeling that this disables more or 
less all assembly code.

Original comment by brad.ope...@gmail.com on 15 Jan 2015 at 7:56

GoogleCodeExporter commented 9 years ago
x86inc is a helper/framework for writing assembly that came from ffmpeg. At a 
quick glance, it's used for about half of the x86 assembly functions in vp9 but 
none of vp8. However, that estimate includes many high bitdepth functions 
(10/12 bit).

Original comment by johannkoenig@google.com on 15 Jan 2015 at 3:36

GoogleCodeExporter commented 9 years ago
Another alternative would be to remove the sse2_x86inc/ssse3_x86inc references 
from vp9/common/vp9_rtcd_defs.pl for this particular target. That would need to 
be a local patch, or some platform checking would need to be added to that 
file. I'm hoping this can be temporary, but this particular bit of assembly is 
a bit hard to work with as it's macro-heavy...

Original comment by jz...@google.com on 15 Jan 2015 at 11:07

GoogleCodeExporter commented 9 years ago
Thank you for the AVX detection fix.

I am using the --disable-use-x86inc option as a workaround for now.

It would be preferable, though, to fix the ASM code so that all of it can be 
used as is.

Original comment by brad.ope...@gmail.com on 16 Jan 2015 at 11:38

GoogleCodeExporter commented 9 years ago
I merged the CPU detection fix a while ago [1] and just pushed the heavy-handed 
configure workaround [2] to avoid problems for normal users. Issue 924 [3] is 
tracking a proper resolution; in the meantime, interested parties can use the 
--enable-use-x86inc configure option to enable the code.

[1] 1345836 Merge "fix AVX & AVX2 detection"
[2] 880ca0e Merge "workaround stack bashing by asm on 32-bit OpenBSD"
[3] https://code.google.com/p/webm/issues/detail?id=924

Original comment by jz...@google.com on 23 Jan 2015 at 4:08