Closed Draghmar closed 5 years ago
I tested encoding a clip with the same resolution and parameters and had no problem, so will need a sample to try to reproduce.
Could you test if it works if you try using 1 tile instead of 4? That is my first suspicion for what might be causing this.
@rzumer I was afraid it may have something to do with this specific file. And I can't unfortunately share it because that's some internal thing. :( That's why I asked about some way to debug this, to get an idea why it chooses to stop.
@shssoichiro Nope, it didn't work.
I've made some additional tests by using different video with similar parameters (fullHD, not long etc.) and put it through the same pipeline, which is:
ffmpeg -i "%%a" -preset slow -crf 0 -tune film -vf scale=-1:872,crop=550:in_h "in\%%~na_temp.mp4"
ffmpeg -i "%%a" "in\%%~na.y4m"
rav1e -v -s 1 --threads 4 --tile-rows 4 --quantizer 150 "%%a" -o "out\%%~na.ivf"
And the result are the same like with the previous file - encoding stops. Changing -crf doesn't make any difference. The weirdest part is that I tried ffmpeg to econde into AV1 and it stops always at 53 frame. And it work properly some time ago when I was playing with AV1 from ffmpeg, rav1e and SvtAv1Enc.
I will make another attempt when my friend that made this video show up. I will ask him to render it directly to 550x872 to remove ffmpeg from the variables.
I got 550x872 mp4 file rendered by Premiere Pro that I converted to y4m with ffmpeg and rav1e still won't encode it. The same procedure works for 1920x1080. So I have no idea what is going on. Especially that I took some stock fullHD clip and do the same thing with it getting the same effect. This one I can attach so here you go. Three files from the three steps: original, cropped and y4m.
I tried to encode the y4m from your zipfile:
rav1e -v -s 1 --threads 4 --tile-rows 4 --quantizer 150 Pexels_Videos_1572432_mobile.y4m -o test.ivf
It works for me.
INFO rav1e > Using y4m decoder: 550x872p @ 24000/1001 fps, 4:2:0, 8-bit
INFO rav1e > Encoding settings: keyint_min=12 keyint_max=240 quantizer=150 bitrate=0 min_quantizer=0 low_latency=false tune=Psychovisual rdo_lookahead_frames=40 min_block_size=4x4 multiref=true fast_deblock=false reduced_tx_set=false tx_domain_distortion=false tx_domain_rate=false encode_bottomup=true rdo_tx_decision=true prediction_modes=Complex-All include_near_mvs=true no_scene_detection=false diamond_me=true cdef=true quantizer_rdo=true use_satd_subpel=true non_square_partition=false
INFO rav1e::api::config > CPU Feature Level: AVX2
Only happens for SSSE3?
Hm...it may be because inspired by your suggestion I did test that stock clip on my tablet that has Intel(R) Core(TM) m3-6Y30 CPU @ 0.90GHz (it's not as slow as one may think XD) and rav1e uses AVX2 for CPU Feature Level here. Is there a way to change/enforce which CPU Feature Level will be used?
You can set the "RAV1E_CPU_TARGET" environment variable (set to "rust" to disable all ASM), although it won't consistently apply to all ASM until #1717 gets merged.
The issue seems to be with one of the SSE2 functions which we also have an AVX function for. I ran this against the #1717 branch:
➜ RAV1E_CPU_TARGET=avx2 cargo test --release --features decode_test_dav1d,check_asm
running 149 tests
...
test result: ok. 130 passed; 0 failed; 19 ignored; 0 measured; 0 filtered out
➜ RAV1E_CPU_TARGET=sse2 cargo test --release --features decode_test_dav1d,check_asm
running 149 tests
test api::test::large_width_assert ... ok
...
test test_encode_decode::low_bit_depth_dav1d ... ignored
test test_encode_decode::odd_size_frame_with_full_rdo_dav1d ... ignored
error: process didn't exit successfully: `/Users/jholmer/repos/rav1e/target/release/deps/rav1e-28480a92d3ad6253` (signal: 11, SIGSEGV: invalid memory reference)
➜ RAV1E_CPU_TARGET=ssse3 cargo test --release --features decode_test_dav1d,check_asm
running 149 tests
test api::test::large_width_assert ... ok
...
test test_encode_decode::low_bit_depth_dav1d ... ignored
test test_encode_decode::odd_size_frame_with_full_rdo_dav1d ... ignored
error: process didn't exit successfully: `/Users/jholmer/repos/rav1e/target/release/deps/rav1e-28480a92d3ad6253` (signal: 11, SIGSEGV: invalid memory reference)
➜ RAV1E_CPU_TARGET=rust cargo test --release --features decode_test_dav1d,check_asm
running 149 tests
...
test result: ok. 130 passed; 0 failed; 19 ignored; 0 measured; 0 filtered out
I've tracked the segfault as best I can down to rav1e_sad16x16_sse2
. It does not crash on every call to it, only with certain input data. This can be reproduced against the test test_encode_decode::quantizer_100_dav1d
.
I was able to produce a backtrace, below, off of my branch for #1717. I also determined that the condition causing this to happen is when the width of the video is not a multiple of 64. The height of the video does not have any effect. My ASM skills are very lacking, so I'm unclear as to how to fix it, but I get the feeling someone familiar with ASM could fix this easily.
rav1e-0afb1c11afed9f10`rav1e_sad16x16_sse2.loop
-> 0x555555d7a8db <+20>: psadbw (%rdi), %xmm1
frame #1: 0x00005555559457b4 rav1e-0afb1c11afed9f10`rav1e::me::full_search::h6a2ee690f2d8e627 at dist.rs:122:8
119 let dist = match T::type_enum() {
120 PixelType::U8 => match SAD_FNS[cpu.as_index()][to_index(bsize)] {
121 Some(func) => unsafe {
-> 122 (func)(
123 src.data_ptr() as *const _,
124 T::to_asm_stride(src.plane_cfg.stride),
125 dst.data_ptr() as *const _,
frame #2: 0x0000555555945226 rav1e-0afb1c11afed9f10`rav1e::me::full_search::h6a2ee690f2d8e627(fi=&0x7ffff014eb10, x_lo=-20, x_hi=22, y_lo=-16, y_hi=16, bsize=BLOCK_16X16, p_org=&0x7ffff7b0d5f0, p_ref=&0x7ffff00c67d8, best_mv=&0x7ffff7b0a248, lowest_cost=&0x7ffff7b0a240, po=PlaneOffset {
x: 2,
y: 0
}, step=1, lambda=0, pmv=[MotionVector {
row: 0,
col: 0
}, MotionVector {
row: 0,
col: 0
}], allow_high_precision_mv=false) at me.rs:976
973 // Select rectangular regions within search region with vert+horz windows
974 for vert_window in search_region.vert_windows(blk_h).step_by(step) {
975 for ref_window in vert_window.horz_windows(blk_w).step_by(step) {
-> 976 let sad = get_sad(
977 &plane_org,
978 &ref_window,
979 bsize,
frame #3: 0x0000555555949bb6 rav1e-0afb1c11afed9f10`rav1e::me::estimate_motion_ss4::h2c1ffaf9be5098c2(fi=&0x7ffff014eb10, ts=&0x7ffff7b0ad68, bsize=BLOCK_64X64, ref_idx=0, tile_bo=TileBlockOffset(BlockOffset {
x: 16,
y: 0
})) at me.rs:1065:4
1062 // Divide by 16 to account for subsampling, 0.125 is a fudge factor
1063 let lambda = (fi.me_lambda * 256.0 / 16.0 * 0.125) as u32;
1064
-> 1065 full_search(
1066 fi,
1067 x_lo,
1068 x_hi,
frame #4: 0x00005555558b789e rav1e-0afb1c11afed9f10`rav1e::api::internal::ContextInner$LT$T$GT$::compute_lookahead_motion_vectors::_$u7b$$u7b$closure$u7d$$u7d$::h719b05ade4c22c28 at encoder.rs:2750:14
2747 let r = fi.ref_frames[i] as usize;
2748 if pmvs[r].is_none() {
2749 pmvs[r] =
-> 2750 estimate_motion_ss4(fi, ts, BlockSize::BLOCK_64X64, r, bo);
2751 }
2752 }
2753 frame_pmvs.push(pmvs);
frame #5: 0x00005555558b72b4 rav1e-0afb1c11afed9f10`rav1e::api::internal::ContextInner$LT$T$GT$::compute_lookahead_motion_vectors::_$u7b$$u7b$closure$u7d$$u7d$::h719b05ade4c22c28(ctx=TileContextMut<u8> @ 0x00007ffff7b0ad68) at internal.rs:596
593 let ts = &mut ctx.ts;
594
595 // Compute the quarter-resolution motion vectors.
-> 596 let tile_pmvs = build_coarse_pmvs(fi, ts, &inter_cfg);
597
598 // Compute the half-resolution motion vectors.
599 let mut half_res_pmvs = Vec::with_capacity(ts.sb_height * ts.sb_width);
Isn't this now resolved by #1758?
The crash is resolved, but the ASM is still broken. I've removed this as a blocker on the release, but I'm leaving this open to represent fixing and reenabling the broken ASM function.
The crash is resolved, but the ASM is still broken. I've removed this as a blocker on the release, but I'm leaving this open to represent fixing and reenabling the broken ASM function.
Right, thanks for revising the title of issue.
Hi I was doing some tests with rav1e windows build 0.1.0 (20190926-19-ge11f06a) and found one video that I couldn't encode. The process started as usual with:
and stopped here without saying why. Output was not readable by media player and had only 32KB. Is there some debug option I could turn on to get what's happening? The parameters I used are:
-v -s 1 --threads 4 --tile-rows 4 --quantizer 150
I tried changing -s and tiles.Also worth mention is the fact I have the same video but in 1920x1080 and it works just fine. The one here is simply resized and cropped. Maybe it is somehow related to the resolution? Like SvtAv1Enc that works only with dimensions that can be divided by 8?