Open Fedyon opened 4 years ago
The crash reproduced with -vf expand=::::1 -vo md5sum
too.
Perhaps -vo directx
was not at fault, editing title.
I could reproduce your observations too. Until r38142 everything is fine - from r38143 to the current release it does not work anymore. Maybe an additional message to http://lists.mplayerhq.hu/pipermail/mplayer-users/2020-May/thread.html could help ?
Meanwhile, i've posted this bug to http://lists.mplayerhq.hu/pipermail/mplayer-users/2020-May/088907.html
I've got feeback from mplayer developers. From MPlayer SVN-r38188 this problem should be fixed i could confirm this and have no problems anymore.
Could anyone clarify this ?
please try r38188+g6e1903938b
Thanks everyone. Unfortunately, r38188 still crashes with the same reproducer.
In attempt to reduce factors, I found that the same crash can happen with -demuxer rawvideo -rawvideo w=640:h=480:format=yuy2 -vf expand=::::1 -vo null /some/lengthy/zero
, but it seems less reliable (to crash). It sometimes just run, it sometimes just crashes.
Given that adding a few chars to the command line occasionally makes difference, just a wild guess: some more alignment failure on local stack/heap...?
no unfotunately I need a debug build and gdb attached, did you managed to reproduce consistently with that command?
Well, consistency... perhaps yes. Yes, consistent, but not reliable. I mean...
Full command line:
T:\TEMP\MPlayer-corei7-r38188+g6e1903938b>mplayer -noconfig all -nofontconfig -font mplayer\subfont.ttf -demuxer rawvideo -rawvideo w=640:h=480:format=yuy2 -vf expand=::::1 -vo null -osdlevel 3 ..\zero
By changing only the last argument (note the number of last dots), crash/try:
..\zero
-> 10/10
..\zero.
-> 0/10
..\zero..
-> 0/10
..\zero...
-> 10/10
..\zero....
-> 0/10
..\zero.....
-> 10/10
..\zero......
-> 10/10
..\zero.......
-> 10/10
..\zero........
-> 10/10
..\zero.........
-> 0/10
..\zero..........
-> 0/10
..\zero...........
-> 0/10
..\zero............
-> 10/10
I bet it must also be dependent on cd/env/etc. and the crash pattern itself may not reproduce between machines.
I've compiled mplayer r38188 with some patches from sherpya and create a gdb log file with option used by Fedyon mplayer.zip Maybe it helps.
can someone ping reimar? I'm not subscribed anymore
I've informed him.
I'm not sure these issues are related. That backtrace points to ff_deblock_h_chroma422_8_avx Which means it's not directly OSD related, and it's due to AVX code. Could be that the expand filter provides a destination buffer that isn't sufficiently aligned or something like that. But it could be something completely different as well, as the backtrace seems not really correct: 0x00fe1045 in ff_deblock_h_chroma422_8_avx () at mplayer.c:2390 Which is inconsistent, as the function ff_deblock_h_chroma422_8_avx is clearly not in the file mplayer.c
Also this happens only for 32-bit builds? Could try below patch, to see if it's really the new SSE2 OSD code. If so disabling it seems the easiest solution...
--- a/sub/osd.c
+++ b/sub/osd.c
@@ -63,8 +63,11 @@ static const unsigned long long mask24hl __attribute__((aligned(8))) = 0x0000FF
#endif
#if HAVE_SSE2 || CONFIG_RUNTIME_CPUDETECT
+// crashes on win32 due to alignment issues of unclear cause
+#if !defined(_WIN32) || !ARCH_X86_32
#define COMPILE_SSE2
#endif
+#endif
#endif /* ARCH_X86 */
I aaply the patch. Now i've get compile errors at end of compiling. See attached file compile-errors.txt
Sorry, here's an actually tested patch. (txt extension because diff is not a supported file type for github!!) win32_sse2_osd.txt
No problem. After apply your last patch mplayer don't crash anymore. I tested with the following options: 1) mplayer -noconfig all -nofontconfig -font mplayer\subfont.ttf -osdlevel 3 sample.mp4 2) mplayer -noconfig all -nofontconfig -font mplayer\subfont.ttf -osdlevel 3 -vf expand=::::1 -vo md5sum sample.mp4 3) mplayer -demuxer rawvideo -rawvideo w=640:h=480:format=yuy2 -vf expand=::::1 -vo null sample.mp4 4) mplayer -v dvd:///E: -menu -osdlevel 3 Could this be confirmed ?
By the way: If i apply the last patch to r38186 mplayer also don't crash anymore.
That patch just disables the SSE2 optimization on 32-bit Windows MPlayer. If I could reproduce it and get a proper backtrace I might be able to come up with a better fix, but no luck so far. Possibly adding attribute((force_align_arg_pointer)) to the main function or similar would work even, but not sure. Specifically I mean the below instead of the last patch. But it might also not help at all.
--- a/mplayer.c
+++ b/mplayer.c
@@ -2789,7 +2789,7 @@ static int seek(MPContext *mpctx, double amount, int style)
/* This preprocessor directive is a hack to generate a mplayer-nomain.o object
* file for some tools to link against. */
#ifndef DISABLE_MAIN
-int main(int argc, char *argv[])
+int main(int argc, char *argv[]) __attribute__((force_align_arg_pointer))
{
int opt_exit = 0; // Flag indicating whether MPlayer should exit without playing anything.
int profile_config_loaded;
I believe the crash point is: https://github.com/sherpya/MPlayer/blob/fee36a534aee8fce8311c281ef049c3f51e9a789/sub/osd_template.c#L262 in particular the argument mmsrc, which is in turn https://github.com/sherpya/MPlayer/blob/fee36a534aee8fce8311c281ef049c3f51e9a789/sub/osd_template.c#L258
Disasm on MPlayer-corei7-r38188+g6e1903938b gave this:
66 0f d7 c8 pmovmskb ecx,xmm0
81 f9 ff ff 00 00 cmp ecx,0xffff
0f 84 a6 00 00 00 je 0xb6
f3 0f 6f eb movdqu xmm5,xmm3
f3 0f 6f 0c 42 movdqu xmm1,XMMWORD PTR [edx+eax*2]
66 0f 68 da punpckhbw xmm3,xmm2
66 0f 60 ea punpcklbw xmm5,xmm2
f3 0f 6f 44 42 10 movdqu xmm0,XMMWORD PTR [edx+eax*2+0x10]
f3 0f 6f 15 20 d0 9f 01 movdqu xmm2,XMMWORD PTR ds:0x19fd020
66 0f ef cf pxor xmm1,xmm7
f3 0f 6f 25 20 d0 9f 01 movdqu xmm4,XMMWORD PTR ds:0x19fd020
66 0f ef c7 pxor xmm0,xmm7
f3 0f 6f f0 movdqu xmm6,xmm0
66 0f db d1 pand xmm2,xmm1
66 0f e4 d5 pmulhuw xmm2,xmm5
66 0f db e0 pand xmm4,xmm0
66 0f e4 e3 pmulhuw xmm4,xmm3
66 0f 71 e6 08 psraw xmm6,0x8
66 0f e5 f3 pmulhw xmm6,xmm3
66 0f 67 d4 packuswb xmm2,xmm4
f3 0f 6f e1 movdqu xmm4,xmm1
66 0f fc 14 03 paddb xmm2,XMMWORD PTR [ebx+eax*1] ; <- FAULT
; eax=00000000 ebx=0B7ECD98 ecx=000083FF edx=0B943C68
; esi=00000170 edi=0B7F23F0 ebp=0022EC58 esp=0022EC20
66 0f 71 e4 08 psraw xmm4,0x8
66 0f e5 e5 pmulhw xmm4,xmm5
66 0f 63 e6 packsswb xmm4,xmm6
f3 0f 6f f2 movdqu xmm6,xmm2
66 0f 68 d4 punpckhbw xmm2,xmm4
66 0f 60 f4 punpcklbw xmm6,xmm4
But I currently don't have proper development environment on this machine, and as such I looked up just by guess without gdb/symtbl, I might got wrong in the way...
Result of guess-stack-rewinding.
vo_draw_alpha_yuy2_sse( 00000170, 0000003C, 0B7ECAB8, 0B7F2110, 00000170, 0B943268, 00000500 )
vo_draw_text_ext( 00000280, 000001E0, 00000000, 00000000, 00000000, 00000000, 00000280, 000001E0, 00484010 )
That problematic src
should have come from obj
there, so I tracked:
*vo_osd_list [020E0440] = 0B7BB440
mp_osd_obj_t {
+00000000 .next
+00001984 .stride
+00001988 .allocated
+0000198C .alpha_buffer
+00001990 .bitmap_buffer
}
(mp_osd_obj_t)[0B7BB440] = { .stride=00000000, .allocated=FFFFFFFF, .alpha_buffer=00000000, .bitmap_buffer=00000000 }
(mp_osd_obj_t)[0B7B9A90] = { .stride=00000000, .allocated=00000000, .alpha_buffer=0BB198B0, .bitmap_buffer=0BB19890 }
(mp_osd_obj_t)[0B7B80E0] = { .stride=00000000, .allocated=FFFFFFFF, .alpha_buffer=00000000, .bitmap_buffer=00000000 }
(mp_osd_obj_t)[0B7B6730] = { .stride=00000000, .allocated=FFFFFFFF, .alpha_buffer=00000000, .bitmap_buffer=00000000 }
(mp_osd_obj_t)[0B7B4D80] = { .stride=00000000, .allocated=FFFFFFFF, .alpha_buffer=00000000, .bitmap_buffer=00000000 }
(mp_osd_obj_t)[0B7B33D0] = { .stride=00000170, .allocated=00005640, .alpha_buffer=0B7F2110, .bitmap_buffer=0B7ECAB8 }
... and so now I'm very confused. If I read correctly those buffers are only allocated via
https://github.com/sherpya/MPlayer/blob/fee36a534aee8fce8311c281ef049c3f51e9a789/sub/sub.c#L161
and there memalign
is supposed to provide align, am I right?????? What's happening???????
Wait, something is wrong. I think I found the compiled alloc_buf
, but in there supposed-to-be-memalign is actually just plain MSVCRT.malloc
!
00522750 899388190000 mov [ebx+00001988],edx
00522756 893C24 mov [esp],edi
00522759 E88ABE2B01 call MSVCRT.free
0052275E 8B838C190000 mov eax,[ebx+0000198C]
00522764 890424 mov [esp],eax
00522767 E87CBE2B01 call MSVCRT.free
0052276C 893424 mov [esp],esi
0052276F E80CBE2B01 call MSVCRT.malloc
00522774 893424 mov [esp],esi
00522777 898390190000 mov [ebx+00001990],eax
0052277D 89C7 mov edi,eax
0052277F E8FCBD2B01 call MSVCRT.malloc
00522784 89838C190000 mov [ebx+0000198C],eax
0052278A E960FFFFFF jmp MPLAYER.005226EF
edit: what is this https://github.com/sherpya/MPlayer/blob/fee36a534aee8fce8311c281ef049c3f51e9a789/configure#L4011
@Fedyon
internal aligned malloc functions are implemented using malloc (with a trick) but it works
@sherpya I'm afraid it looks like the trick is not here, for whatever reason.
I believe this is the whole of alloc_buf
https://github.com/sherpya/MPlayer/blob/fee36a534aee8fce8311c281ef049c3f51e9a789/sub/sub.c#L150
005226B0 57 push edi
005226B1 56 push esi
005226B2 53 push ebx
005226B3 89C3 mov ebx,eax
005226B5 83EC10 sub esp,00000010
005226B8 8B4020 mov eax,[eax+20]
005226BB 8B5318 mov edx,[ebx+18]
005226BE 39D0 cmp eax,edx
005226C0 7C6E jl MPLAYER.00522730 (00522730)
005226C2 29D0 sub eax,edx
005226C4 8B4B1C mov ecx,[ebx+1C]
005226C7 8B5324 mov edx,[ebx+24]
005226CA 83C00F add eax,0000000F
005226CD 83E0F0 and eax,FFFFFFF0
005226D0 39CA cmp edx,ecx
005226D2 7C6B jl MPLAYER.0052273F (0052273F)
005226D4 29CA sub edx,ecx
005226D6 0FAFD0 imul edx,eax
005226D9 89D6 mov esi,edx
005226DB 8BBB90190000 mov edi,[ebx+00001990]
005226E1 898384190000 mov [ebx+00001984],eax
005226E7 399388190000 cmp [ebx+00001988],edx
005226ED 7C61 jl MPLAYER.00522750 (00522750)
005226EF A158040E02 mov eax,[020E0458]
005226F4 89742408 mov [esp+08],esi
005226F8 893C24 mov [esp],edi
005226FB 89442404 mov [esp+04],eax
005226FF E84CBE2B01 call MSVCRT.memset
00522704 A154040E02 mov eax,[020E0454]
00522709 89742408 mov [esp+08],esi
0052270D 89442404 mov [esp+04],eax
00522711 8B838C190000 mov eax,[ebx+0000198C]
00522717 890424 mov [esp],eax
0052271A E831BE2B01 call MSVCRT.memset
0052271F 83C410 add esp,00000010
00522722 5B pop ebx
00522723 5E pop esi
00522724 5F pop edi
00522725 C3 ret
00522726 8DB42600000000 lea esi,[esi]
0052272D 8D7600 lea esi,[esi]
00522730 895320 mov [ebx+20],edx
00522733 8B4B1C mov ecx,[ebx+1C]
00522736 31C0 xor eax,eax
00522738 8B5324 mov edx,[ebx+24]
0052273B 39CA cmp edx,ecx
0052273D 7D95 jnl MPLAYER.005226D4 (005226D4)
0052273F 894B24 mov [ebx+24],ecx
00522742 31D2 xor edx,edx
00522744 31F6 xor esi,esi
00522746 EB93 jmp MPLAYER.005226DB (005226DB)
00522748 8DB42600000000 lea esi,[esi]
0052274F 90 nop
00522750 899388190000 mov [ebx+00001988],edx
00522756 893C24 mov [esp],edi
00522759 E88ABE2B01 call MSVCRT.free
0052275E 8B838C190000 mov eax,[ebx+0000198C]
00522764 890424 mov [esp],eax
00522767 E87CBE2B01 call MSVCRT.free
0052276C 893424 mov [esp],esi
0052276F E80CBE2B01 call MSVCRT.malloc
00522774 893424 mov [esp],esi
00522777 898390190000 mov [ebx+00001990],eax
0052277D 89C7 mov edi,eax
0052277F E8FCBD2B01 call MSVCRT.malloc
00522784 89838C190000 mov [ebx+0000198C],eax
0052278A E960FFFFFF jmp MPLAYER.005226EF
I see the align size value (=16) only in esp
adjustment. No one is passing value 16 to anywhere, nothing is around calls to MSVCRT.malloc, those calls actually reaches to MSVCRT... or this MSVCRT is patched on memory or something like that?
Note that the only rounding operation at 005226CA~CD corresponds to https://github.com/sherpya/MPlayer/blob/fee36a534aee8fce8311c281ef049c3f51e9a789/sub/sub.c#L155 and I'm pretty sure this is not related to *_buffer
align.
edit: after some failures I managed to see breakpoint on 00522777 caught storing 0B7ECAB8
, this is definitly not aligned.
The fallback in case of MPlayer's memalign use is actually plain malloc, so it does not actually work if the libc does not provide properly aligned memory. I attached the patch I just sent to the MPlayer mailing list which hopefully will fix this. 0001-Remove-all-usage-of-memalign.txt If you can confirm, I can re-enabled the optimization on Win32 again.
I make a clean svn checkout from current release (r38191) and compile it. In this release mplayer didn't crashed anymore. The above patch does not make a difference in r38191 about crash problems - no mplayer crash.
If i compile r38190 (before the SSE2 optimization on 32-bit Windows MPlayer was disabled) and use the above patch in some cases mplayer crashes.
I just realized that I overlooked this, it certainly can break src
alignment:
https://github.com/sherpya/MPlayer/blob/fee36a534aee8fce8311c281ef049c3f51e9a789/libmpcodecs/vf_expand.c#L130
but I'm not sure what this path is doing at the first place...
(rewrote; dst didn't matter, we use unaligned stores)
While that expand code looks problematic, it should only trigger when you actually add borders with the expand filters, which none of the crashing example commands mentioned do... Also it would already break 8-byte alignment, so is problematic even with the MMX code I believe. So this doesn't really look like it should be the cause? As to what that path does: When the OSD did not change, there is no point in first clearing and then drawing it again onto the expanded black borders, instead the borders can simply be left as-is.
Ah I see, so adding side borders is breaking them all. Thanks for the explanation.
With MPlayer-corei7-r38188+g6e1903938b, I simulated the fix by setting breakpoints on 00522777 and 00522784 to make them retry malloc until it get an aligned memory (just by chance). I confirm the trick makes all the reproducers not to crash, and by disabling breakpoints they crash again. So I believe it is solving a large part of problems, at least.
(I'm severely running out of local disk space on the machine and can't have the build environment extracted. Yuck.)
I'm a bit unsure about the status, if there are still issues or not (when re-enabling the optimization). As to the vf_expand code that is just an optimization, so at the cost of slightly higher CPU usage when adding black borders and the OSD being drawn on those borders and not changing, the below should avoid any issues due to that specific code by disabling it:
--- a/libmpcodecs/vf_expand.c
+++ b/libmpcodecs/vf_expand.c
@@ -84,7 +84,7 @@ static void remove_func_2(int x0,int y0, int w,int h){
}
static void remove_func(int x0,int y0, int w,int h){
- if(!vo_osd_changed_flag) return;
+ if(0 && !vo_osd_changed_flag) return;
// split it to 4 parts:
if(y0<vf->priv->exp_y){
// it has parts above the image:
@@ -122,7 +122,7 @@ static void remove_func(int x0,int y0, int w,int h){
static void draw_func(int x0,int y0, int w,int h,unsigned char* src, unsigned char *srca, int stride){
unsigned char* dst;
- if(!vo_osd_changed_flag && vf->dmpi->planes[0]==vf->priv->fb_ptr){
+ if(0 && !vo_osd_changed_flag && vf->dmpi->planes[0]==vf->priv->fb_ptr){
// ok, enough to update the area inside the video, leave the black bands
// untouched!
if(x0<vf->priv->exp_x){
Needs to be refined a bit more before merging though
From my point of view i could not reproduce crashes with mplayer r38192 anymore. Also with the patch above. I tested with the following options:
mplayer -noconfig all -nofontconfig -font mplayer\subfont.ttf -osdlevel 3 test.mp4
mplayer -noconfig all -nofontconfig -font mplayer\subfont.ttf -osdlevel 3 -vf expand=::::1 -vo md5sum test.mp4
mplayer -demuxer rawvideo -rawvideo w=640:h=480:format=yuy2 -vf expand=::::1 -vo null test.mp4
mplayer -v dvd:///E: -menu -osdlevel 3
mplayer -border-pos-x 0.9 -border-pos-y 0.2 -vo direct3d test.mp4
After all, it's free to continue using MMX in 32bit builds. Given the performance gain between MMX and SSE2 being pretty marginal (especially in nowadays normal playback scenario), I feel okay-ish to mark this closed at the current status as-is. At the same time I'm not particularly opposing to re-enable SSE2; it should work fine either way now.
Reproducer:
mplayer -noconfig all -nofontconfig -font mplayer\subfont.ttf -osdlevel 3 sample.avi
-osdlevel 0
), it doesn't crash.-vo direct3d
seems much less likely to crash.Last-known-good:
Anything since r38152 (including) seems broken:
Systems:
Suspects: