chromiumembedded / cef

Chromium Embedded Framework (CEF). A simple framework for embedding Chromium-based browsers in other applications.
https://bitbucket.org/chromiumembedded/cef/

Windows: 2785+: Crash running x64 build on processors that do not support vmovaps #1999

Closed magreenblatt closed 7 years ago

magreenblatt commented 8 years ago

Original report by Dmitry Azaraev (Bitbucket: dmitry-azaraev, GitHub: dmitry-azaraev).


Environment to reproduce:

cefclient starts normally and creates a renderer process, then crashes immediately after the renderer process is created. No error or fatal entries appear in the log, and the failure cannot be caught in any way other than a crash dump.

I created a dump file via WER (it actually worked this time) and got the following results:


(2344.24c4): Illegal instruction - code c000001d (first/second chance not available)
ntdll!NtWaitForMultipleObjects+0xa:
00007ffc`04a00c6a c3              ret

0:000> .ecxr
*** WARNING: Unable to verify checksum for libcef.dll
rax=0000000000000003 rbx=0000003a1d48503c rcx=0000003a16e3c010
rdx=0000003a1d48503c rsi=0000003a16e3c130 rdi=0000003a16e3c140
rip=00007ffbcc99045c rsp=0000003a16e3bfb8 rbp=0000003a16e3c029
r8=0000003a16e3c130  r9=0000000000000002 r10=0000003a1d48503c
r11=0000003a16e3c130 r12=0000000000000000 r13=0000003a1b4110d0
r14=0000003a1d48503c r15=0000003a16e3c500
iopl=0         nv up ei pl zr na po nc
cs=0033  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00010246
libcef!SkNx<4,float>::SkNx<4,float>:
00007ffb`cc99045c c5f828c1        vmovaps xmm0,xmm1

Stack Trace:

.  0  Id: 2344.24c4 Suspend: 0 Teb: 00007ff6`eddee000 Unfrozen
 # Child-SP          RetAddr           Call Site
00 0000003a`16e3a8d8 00007ffc`01e513ed ntdll!NtWaitForMultipleObjects+0xa
01 0000003a`16e3a8e0 00007ffc`03d27d51 KERNELBASE!WaitForMultipleObjectsEx+0xe1
02 0000003a`16e3abc0 00007ffc`03d27773 kernel32!WerpReportFaultInternal+0x581
03 0000003a`16e3b130 00007ffc`01f31fdf kernel32!WerpReportFault+0x83
04 0000003a`16e3b160 00007ffc`04a0f133 KERNELBASE!UnhandledExceptionFilter+0x23f
05 0000003a`16e3b250 00007ffc`049f1d86 ntdll!RtlUserThreadStart$filt$0+0x3e
06 0000003a`16e3b290 00007ffc`04a033fd ntdll!_C_specific_handler+0x96
07 0000003a`16e3b300 00007ffc`049c4847 ntdll!RtlpExecuteHandlerForException+0xd
08 0000003a`16e3b330 00007ffc`04a0258a ntdll!RtlDispatchException+0x197
09 0000003a`16e3ba00 00007ffb`cc99045c ntdll!KiUserExceptionDispatch+0x3a
0a 0000003a`16e3bfb8 00007ffb`ceb6b495 libcef!SkNx<4,float>::SkNx<4,float>(float a = 3.621263742e-036, float b = 0, float c = 0, float d = -3.621951103) [h:\cef\build\chromium_git\chromium\src\third_party\skia\src\opts\sknx_sse.h @ 73]
0b 0000003a`16e3bfc0 00007ffb`ceb6bb0f libcef!SkMatrix::Scale_pts(class SkMatrix * m = <Value unavailable error>, struct SkPoint * dst = 0x00000000`00000040, struct SkPoint * src = 0x00000000`00001fa0, int count = 0n384024536)+0x71 [h:\cef\build\chromium_git\chromium\src\third_party\skia\src\core\skmatrix.cpp @ 960]
0c (Inline Function) --------`-------- libcef!SkMatrix::mapPoints+0x34 [h:\cef\build\chromium_git\chromium\src\third_party\skia\include\core\skmatrix.h @ 436]
0d 0000003a`16e3c090 00007ffb`ceb52d12 libcef!SkMatrix::mapRect(struct SkRect * dst = 0x0000003a`1d48503c, struct SkRect * src = 0x0000003a`16e3c130)+0x6f [h:\cef\build\chromium_git\chromium\src\third_party\skia\src\core\skmatrix.cpp @ 1105]
0e 0000003a`16e3c100 00007ffb`ceb52bab libcef!SkCanvas::getClipBounds(struct SkRect * bounds = 0x0000003a`1d48503c)+0xee [h:\cef\build\chromium_git\chromium\src\third_party\skia\src\core\skcanvas.cpp @ 1855]
0f (Inline Function) --------`-------- libcef!SkCanvas::getLocalClipBounds+0x1c [h:\cef\build\chromium_git\chromium\src\third_party\skia\include\core\skcanvas.h @ 1521]
10 0000003a`16e3c190 00007ffb`ceb53c42 libcef!SkCanvas::quickReject(struct SkRect * rect = 0x0000003a`16e3c2f0)+0x167 [h:\cef\build\chromium_git\chromium\src\third_party\skia\src\core\skcanvas.cpp @ 1813]
11 0000003a`16e3c200 00007ffb`cead2980 libcef!SkCanvas::onDrawRect(struct SkRect * r = 0x0000003a`16e3c500, class SkPaint * paint = 0x0000003a`1b412940)+0x156 [h:\cef\build\chromium_git\chromium\src\third_party\skia\src\core\skcanvas.cpp @ 2139]
12 (Inline Function) --------`-------- libcef!SkCanvas::drawRect+0x1f [h:\cef\build\chromium_git\chromium\src\third_party\skia\src\core\skcanvas.cpp @ 1919]
13 0000003a`16e3c4e0 00007ffb`cead1f4c libcef!cc::SoftwareRenderer::DrawSolidColorQuad(class cc::SolidColorDrawQuad * quad = 0x0000003a`1d483e38)+0x1f0 [h:\cef\build\chromium_git\chromium\src\cc\output\software_renderer.cc @ 405]
14 0000003a`16e3c5c0 00007ffb`ceb112a4 libcef!cc::SoftwareRenderer::DoDrawQuad(struct cc::DirectRenderer::DrawingFrame * frame = 0x0000003a`16e3cc00, class cc::DrawQuad * quad = 0x0000003a`1d483e38, class gfx::QuadF * draw_region = 0x00000000`00000000)+0x5dc [h:\cef\build\chromium_git\chromium\src\cc\output\software_renderer.cc @ 311]
15 0000003a`16e3c870 00007ffb`ceb10bc9 libcef!cc::DirectRenderer::DrawRenderPass(struct cc::DirectRenderer::DrawingFrame * frame = 0x0000003a`16e3cc00, class cc::RenderPass * render_pass = 0x0000003a`1d466540)+0x664 [h:\cef\build\chromium_git\chromium\src\cc\output\direct_renderer.cc @ 499]
16 0000003a`16e3ca10 00007ffb`ceb102a8 libcef!cc::DirectRenderer::DrawRenderPassAndExecuteCopyRequests(struct cc::DirectRenderer::DrawingFrame * frame = 0x0000003a`16e3cc00, class cc::RenderPass * render_pass = 0x0000003a`1d466540)+0xa9 [h:\cef\build\chromium_git\chromium\src\cc\output\direct_renderer.cc @ 430]
17 0000003a`16e3ca50 00007ffb`cf3038d2 libcef!cc::DirectRenderer::DrawFrame(class std::vector<std::unique_ptr<cc::RenderPass,std::default_delete<cc::RenderPass> >,std::allocator<std::unique_ptr<cc::RenderPass,std::default_delete<cc::RenderPass> > > > * render_passes_in_draw_order = 0x0000003a`1b591cd8 { size=1 }, float device_scale_factor = <Value unavailable error>, class gfx::ColorSpace * device_color_space = 0x0000003a`1b411118, class gfx::Rect * device_viewport_rect = 0x0000003a`16e3ce48, class gfx::Rect * device_clip_rect = 0x0000003a`16e3cee8, bool disable_picture_quad_image_filtering = false)+0x698 [h:\cef\build\chromium_git\chromium\src\cc\output\direct_renderer.cc @ 281]
18 0000003a`16e3cd90 00007ffb`cf305774 libcef!cc::Display::DrawAndSwap(void)+0x44a [h:\cef\build\chromium_git\chromium\src\cc\surfaces\display.cc @ 301]
19 0000003a`16e3d120 00007ffb`cf305fe7 libcef!cc::DisplayScheduler::DrawAndSwap(void)+0x84 [h:\cef\build\chromium_git\chromium\src\cc\surfaces\display_scheduler.cc @ 118]
1a 0000003a`16e3d1e0 00007ffb`cf3060c1 libcef!cc::DisplayScheduler::AttemptDrawAndSwap(void)+0x73 [h:\cef\build\chromium_git\chromium\src\cc\surfaces\display_scheduler.cc @ 275]
1b 0000003a`16e3d210 00007ffb`cc9799f2 libcef!cc::DisplayScheduler::OnBeginFrameDeadline(void)+0x79 [h:\cef\build\chromium_git\chromium\src\cc\surfaces\display_scheduler.cc @ 294]
1c (Inline Function) --------`-------- libcef!base::internal::RunnableAdapter<void +0x11 [h:\cef\build\chromium_git\chromium\src\base\bind_internal.h @ 171]
1d (Inline Function) --------`-------- libcef!base::internal::InvokeHelper<1,void>::MakeItSo+0x2d [h:\cef\build\chromium_git\chromium\src\base\bind_internal.h @ 309]
1e (Inline Function) --------`-------- libcef!base::internal::Invoker<base::internal::BindState<base::internal::RunnableAdapter<void +0x2d [h:\cef\build\chromium_git\chromium\src\base\bind_internal.h @ 363]
1f 0000003a`16e3d280 00007ffb`cc9799f2 libcef!base::internal::Invoker<base::internal::BindState<base::internal::RunnableAdapter<void (class base::internal::BindStateBase * base = 0x00007ffc`0499da0b)+0x3a [h:\cef\build\chromium_git\chromium\src\base\bind_internal.h @ 346]
20 (Inline Function) --------`-------- libcef!base::internal::RunnableAdapter<void +0x11 [h:\cef\build\chromium_git\chromium\src\base\bind_internal.h @ 171]
21 (Inline Function) --------`-------- libcef!base::internal::InvokeHelper<1,void>::MakeItSo+0x2d [h:\cef\build\chromium_git\chromium\src\base\bind_internal.h @ 309]
22 (Inline Function) --------`-------- libcef!base::internal::Invoker<base::internal::BindState<base::internal::RunnableAdapter<void +0x2d [h:\cef\build\chromium_git\chromium\src\base\bind_internal.h @ 363]
23 0000003a`16e3d2b0 00007ffb`ce9c5214 libcef!base::internal::Invoker<base::internal::BindState<base::internal::RunnableAdapter<void (class base::internal::BindStateBase * base = 0x00007ffc`0499da0b)+0x3a [h:\cef\build\chromium_git\chromium\src\base\bind_internal.h @ 346]
24 (Inline Function) --------`-------- libcef!base::Callback<void __cdecl+0x8 [h:\cef\build\chromium_git\chromium\src\base\callback.h @ 389]
25 0000003a`16e3d2e0 00007ffb`ce95e763 libcef!base::debug::TaskAnnotator::RunTask(char * queue_function = 0x00007ffb`cff71798 "MessageLoop::PostTask", struct base::PendingTask * pending_task = 0x0000003a`16e3e700)+0x184 [h:\cef\build\chromium_git\chromium\src\base\debug\task_annotator.cc @ 53]
26 0000003a`16e3d3d0 00007ffb`ce95f6e9 libcef!base::MessageLoop::RunTask(struct base::PendingTask * pending_task = 0x0000003a`16e3e700)+0x453 [h:\cef\build\chromium_git\chromium\src\base\message_loop\message_loop.cc @ 491]
27 (Inline Function) --------`-------- libcef!base::MessageLoop::DeferOrRunPendingTask+0x181 [h:\cef\build\chromium_git\chromium\src\base\message_loop\message_loop.cc @ 499]
28 0000003a`16e3e6e0 00007ffb`ce9b6a91 libcef!base::MessageLoop::DoWork(void)+0x4a9 [h:\cef\build\chromium_git\chromium\src\base\message_loop\message_loop.cc @ 622]
29 0000003a`16e3eab0 00007ffb`ce9b6764 libcef!base::MessagePumpForUI::DoRunLoop(void)+0x71 [h:\cef\build\chromium_git\chromium\src\base\message_loop\message_pump_win.cc @ 263]
2a 0000003a`16e3eb20 00007ffb`ce9a4f3d libcef!base::MessagePumpWin::Run(class base::MessagePump::Delegate * delegate = <Value unavailable error>)+0x54 [h:\cef\build\chromium_git\chromium\src\base\message_loop\message_pump_win.cc @ 142]
2b (Inline Function) --------`-------- libcef!base::MessageLoop::RunHandler+0x15 [h:\cef\build\chromium_git\chromium\src\base\message_loop\message_loop.cc @ 454]
2c 0000003a`16e3eb70 00007ffb`ce95dc31 libcef!base::RunLoop::Run(void)+0xed [h:\cef\build\chromium_git\chromium\src\base\run_loop.cc @ 36]
2d 0000003a`16e3ebc0 00007ffb`cc938a3d libcef!base::MessageLoop::Run(void)+0x41 [h:\cef\build\chromium_git\chromium\src\base\message_loop\message_loop.cc @ 290]
*** WARNING: Unable to verify checksum for cefclient.exe
2e (Inline Function) --------`-------- libcef!CefBrowserMessageLoop::RunMessageLoop+0x8 [h:\cef\build\chromium_git\chromium\src\cef\libcef\browser\browser_message_loop.cc @ 126]
2f (Inline Function) --------`-------- libcef!CefRunMessageLoop+0x3b [h:\cef\build\chromium_git\chromium\src\cef\libcef\browser\context.cc @ 206]
30 0000003a`16e3ec20 00007ff6`ee482546 libcef!cef_run_message_loop(void)+0x41 [h:\cef\build\chromium_git\chromium\src\cef\libcef_dll\libcef_dll.cc @ 351]
31 (Inline Function) --------`-------- cefclient!CefRunMessageLoop+0x6 [h:\cef\build\chromium_git\chromium\src\cef\libcef_dll\wrapper\libcef_dll_wrapper.cc @ 342]
32 0000003a`16e3ec50 00007ff6`ee4a965e cefclient!client::MainMessageLoopStd::Run(void)+0xa [h:\cef\build\chromium_git\chromium\src\cef\tests\cefclient\browser\main_message_loop_std.cc @ 16]
33 0000003a`16e3ec80 00007ff6`ee4fd5a3 cefclient!client::`anonymous namespace'::RunMain(struct HINSTANCE__ * hInstance = <Value unavailable error>)+0x6d2 [h:\cef\build\chromium_git\chromium\src\cef\tests\cefclient\cefclient_win.cc @ 106]
34 (Inline Function) --------`-------- cefclient!invoke_main+0x21 [f:\dd\vctools\crt\vcstartup\src\startup\exe_common.inl @ 113]
35 0000003a`16e3f740 00007ffc`03c213d2 cefclient!__scrt_common_main_seh(void)+0x117 [f:\dd\vctools\crt\vcstartup\src\startup\exe_common.inl @ 253]
36 0000003a`16e3f780 00007ffc`049854e4 kernel32!BaseThreadInitThunk+0x22
37 0000003a`16e3f7b0 00000000`00000000 ntdll!RtlUserThreadStart+0x34

Faulted code:

--- h:\cef\build\chromium_git\chromium\src\third_party\skia\src\opts\sknx_sse.h 
    71:     static SkNx Load(const void* ptr) { return _mm_loadu_ps((const float*)ptr); }
    72: 
    73:     SkNx(float a, float b, float c, float d) : fVec(_mm_setr_ps(a,b,c,d)) {}
00007FFBCC99045C C5 F8 28 C1          vmovaps     xmm0,xmm1  
00007FFBCC990460 C4 E3 79 21 C2 10    vinsertps   xmm0,xmm0,xmm2,10h  
00007FFBCC990466 C4 E3 79 21 C3 20    vinsertps   xmm0,xmm0,xmm3,20h  
00007FFBCC99046C C4 E3 79 21 44 24 28 30 vinsertps   xmm0,xmm0,dword ptr [d],30h  
00007FFBCC990474 C5 F8 11 01          vmovups     xmmword ptr [rcx],xmm0  
00007FFBCC990478 48 8B C1             mov         rax,rcx  
00007FFBCC99047B C3                   ret  

I'm not sure what is happening: does the CPU not support an SSE2 instruction, or is the instruction really invalid? CPUID reports support for everything up to SSE4.1. On the same host the x86 build works, and on an i7-4770 the x64 build also works. So could this really be a matter of which instructions the CPU supports?

PS: What are CEF's default requirements for the target CPU?
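
On the question of whether the faulting instruction is an SSE2 one: its bytes, C5 F8 28 C1, begin with 0xC5, which in 64-bit mode is the two-byte VEX prefix used by AVX encodings, while the legacy SSE encoding of movaps starts with 0F 28. A minimal sketch of that check follows. The function name is hypothetical, and it deliberately ignores the segment or address-size prefixes that may legally precede a VEX prefix.

```cpp
#include <cstddef>
#include <cstdint>

// Returns true if an x64 instruction starts with a VEX prefix.
// In 64-bit mode 0xC4 (three-byte form) and 0xC5 (two-byte form) are
// always VEX prefixes. Hypothetical helper for illustration only: it does
// not skip optional segment/address-size prefixes.
bool StartsWithVexPrefix(const std::uint8_t* bytes, std::size_t size) {
  return size > 0 && (bytes[0] == 0xC4 || bytes[0] == 0xC5);
}
```

Applied to the faulting bytes from the dump this returns true, which is consistent with an illegal-instruction exception (c000001d) on a CPU without AVX.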

magreenblatt commented 8 years ago

Original comment by Dmitry Azaraev (Bitbucket: dmitry-azaraev, GitHub: dmitry-azaraev).


It looks like CEF builds Skia with AVX2 support. The Intel Xeon X5560 does not support AVX or AVX2 (but does support the SSE extensions), so this looks like the root of the problem. We need to find a way to tweak the build options.
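
Whether a host can run AVX code can be read from CPUID leaf 1 and XCR0: ECX bit 28 reports AVX, ECX bit 27 reports OSXSAVE, and XCR0 bits 1 and 2 show that the OS saves XMM and YMM state. The sketch below only decodes those values. The function and constant names are hypothetical, and how the raw values are obtained is left to the caller.

```cpp
#include <cstdint>

// CPUID leaf 1, ECX feature bits.
constexpr std::uint32_t kOsxsaveBit = 1u << 27;
constexpr std::uint32_t kAvxBit = 1u << 28;
// XCR0 bit 1 (XMM state) and bit 2 (YMM state) must both be enabled.
constexpr std::uint64_t kXmmYmmState = 0x6;

// Hypothetical helper: true when AVX instructions can be executed safely.
bool AvxUsable(std::uint32_t cpuid1_ecx, std::uint64_t xcr0) {
  const bool cpu_has_avx = (cpuid1_ecx & kAvxBit) != 0;
  const bool os_uses_xsave = (cpuid1_ecx & kOsxsaveBit) != 0;
  const bool os_saves_ymm = (xcr0 & kXmmYmmState) == kXmmYmmState;
  return cpu_has_avx && os_uses_xsave && os_saves_ymm;
}
```

With MSVC the inputs would typically come from the `__cpuid` and `_xgetbv` intrinsics. XCR0 should only be read when the OSXSAVE bit is set.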

magreenblatt commented 8 years ago

Original comment by Dmitry Azaraev (Bitbucket: dmitry-azaraev, GitHub: dmitry-azaraev).


Google Chrome Version 53.0.2785.116 m (64-bit) works on the target host, so this looks specific to the CEF build. Official builds are now built with 2015U3; I tried building with MSVS 2015 Update 3.1 and still got the same result. Building with 2015 Update 2 might help.

magreenblatt commented 8 years ago

Original comment by Dmitry Azaraev (Bitbucket: dmitry-azaraev, GitHub: dmitry-azaraev).


This is a compiler issue (LTCG): 2015U3 produces a result that depends on object file ordering. If the first object file contains AVX instructions, then the following objects also get AVX instructions generated, even if they were compiled for a lower instruction set. I.e. cl /ltcg sse2.obj avx.obj produces a correct result, but cl /ltcg avx.obj sse2.obj now produces an incorrect result (the images appear to work fine, but require AVX).

magreenblatt commented 8 years ago

Original comment by Dmitry Azaraev (Bitbucket: dmitry-azaraev, GitHub: dmitry-azaraev).


VS2015U3.1 LTCG bug reproduction test case

magreenblatt commented 8 years ago

Original comment by Dmitry Azaraev (Bitbucket: dmitry-azaraev, GitHub: dmitry-azaraev).


Because I did not encounter any problems building the 2785 branch with 2015 Update 3 other than this one, it makes sense to track this problem down further. As I said before, it is tied to a possible bug in LTCG.

vs2015u3-ltcg-bug-1.zip includes a build.cmd script which builds avx-sse.exe and sse-avx.exe. These executables are built from the same obj modules; the only difference is the order in which the object files are passed to the linker.

*.disasm files are also generated, so the difference is easy to see without touching a debugger.

Tested with Microsoft (R) C/C++ Optimizing Compiler Version 19.00.24215.1 for x64.

SSE-AVX: Correct case:


?use_sse@@YAXXZ:
  000000014000110C: 48 83 EC 38        sub         rsp,38h
  0000000140001110: 0F 28 2D 19 DC 04  movaps      xmm5,xmmword ptr [__xmm@4080000040400000400000003f800000]
                    00
  0000000140001117: 48 8D 0D DA DB 04  lea         rcx,[??_C@_0BC@POEDNAAP@SSE?3?5?$CFf?5?$CFf?5?$CFf?5?$CFf?6?$AA@]
                    00
  000000014000111E: 0F 28 C5           movaps      xmm0,xmm5
  0000000140001121: 0F 57 E4           xorps       xmm4,xmm4
  0000000140001124: 0F C6 C5 FF        shufps      xmm0,xmm5,0FFh
  0000000140001128: 0F 28 CD           movaps      xmm1,xmm5
  000000014000112B: F3 0F 5A E0        cvtss2sd    xmm4,xmm0
  000000014000112F: 0F C6 CD AA        shufps      xmm1,xmm5,0AAh
  0000000140001133: 0F 57 DB           xorps       xmm3,xmm3
  0000000140001136: F3 0F 5A D9        cvtss2sd    xmm3,xmm1
  000000014000113A: 0F 28 C5           movaps      xmm0,xmm5
  000000014000113D: F2 0F 11 64 24 20  movsd       mmword ptr [rsp+20h],xmm4
  0000000140001143: 0F C6 C5 55        shufps      xmm0,xmm5,55h
  0000000140001147: 0F 57 D2           xorps       xmm2,xmm2
  000000014000114A: 0F 57 C9           xorps       xmm1,xmm1
  000000014000114D: 66 49 0F 7E D9     movd        r9,xmm3
  0000000140001152: F3 0F 5A D0        cvtss2sd    xmm2,xmm0
  0000000140001156: F3 0F 5A CD        cvtss2sd    xmm1,xmm5
  000000014000115A: 66 49 0F 7E D0     movd        r8,xmm2
  000000014000115F: 66 48 0F 7E CA     movd        rdx,xmm1
  0000000140001164: E8 FB FE FF FF     call        printf
  0000000140001169: 48 83 C4 38        add         rsp,38h
  000000014000116D: C3                 ret

AVX-SSE: incorrect case:

?use_sse@@YAXXZ:
  000000014000117C: 48 83 EC 48        sub         rsp,48h
  0000000140001180: F3 0F 10 05 A8 DB  movss       xmm0,dword ptr [__real@40800000]
                    04 00
  0000000140001188: 48 8D 4C 24 30     lea         rcx,[rsp+30h]
  000000014000118D: F3 0F 10 1D 97 DB  movss       xmm3,dword ptr [__real@40400000]
                    04 00
  0000000140001195: F3 0F 10 15 8B DB  movss       xmm2,dword ptr [__real@40000000]
                    04 00
  000000014000119D: F3 0F 10 0D 7F DB  movss       xmm1,dword ptr [__real@3f800000]
                    04 00
  00000001400011A5: F3 0F 11 44 24 20  movss       dword ptr [rsp+20h],xmm0
  00000001400011AB: E8 08 FF FF FF     call        ??0Sk4f@@QEAA@MMMM@Z
  00000001400011B0: BA 03 00 00 00     mov         edx,3
  00000001400011B5: 48 8D 4C 24 30     lea         rcx,[rsp+30h]
  00000001400011BA: E8 19 FF FF FF     call        ??ASk4f@@QEBAMH@Z
  00000001400011BF: 0F 57 E4           xorps       xmm4,xmm4
  00000001400011C2: 48 8D 4C 24 30     lea         rcx,[rsp+30h]
  00000001400011C7: BA 02 00 00 00     mov         edx,2
  00000001400011CC: F3 0F 5A E0        cvtss2sd    xmm4,xmm0
  00000001400011D0: E8 03 FF FF FF     call        ??ASk4f@@QEBAMH@Z
  00000001400011D5: 48 8D 4C 24 30     lea         rcx,[rsp+30h]
  00000001400011DA: BA 01 00 00 00     mov         edx,1
  00000001400011DF: 0F 57 DB           xorps       xmm3,xmm3
  00000001400011E2: F3 0F 5A D8        cvtss2sd    xmm3,xmm0
  00000001400011E6: E8 ED FE FF FF     call        ??ASk4f@@QEBAMH@Z
  00000001400011EB: 48 8D 4C 24 30     lea         rcx,[rsp+30h]
  00000001400011F0: 33 D2              xor         edx,edx
  00000001400011F2: 0F 57 D2           xorps       xmm2,xmm2
  00000001400011F5: F3 0F 5A D0        cvtss2sd    xmm2,xmm0
  00000001400011F9: E8 DA FE FF FF     call        ??ASk4f@@QEBAMH@Z
  00000001400011FE: 48 8D 0D 0B DB 04  lea         rcx,[??_C@_0BC@POEDNAAP@SSE?3?5?$CFf?5?$CFf?5?$CFf?5?$CFf?6?$AA@]
                    00
  0000000140001205: 0F 57 C9           xorps       xmm1,xmm1
  0000000140001208: F2 0F 11 64 24 20  movsd       mmword ptr [rsp+20h],xmm4
  000000014000120E: F3 0F 5A C8        cvtss2sd    xmm1,xmm0
  0000000140001212: 66 49 0F 7E D9     movd        r9,xmm3
  0000000140001217: 66 49 0F 7E D0     movd        r8,xmm2
  000000014000121C: 66 48 0F 7E CA     movd        rdx,xmm1
  0000000140001221: E8 3E FE FF FF     call        printf
  0000000140001226: 48 83 C4 48        add         rsp,48h
  000000014000122A: C3                 ret

??0Sk4f@@QEAA@MMMM@Z:
  00000001400010B8: C5 F8 28 C1        vmovaps     xmm0,xmm1
  00000001400010BC: C4 E3 79 21 C2 10  vinsertps   xmm0,xmm0,xmm2,10h
  00000001400010C2: C4 E3 79 21 C3 20  vinsertps   xmm0,xmm0,xmm3,20h
  00000001400010C8: C4 E3 79 21 44 24  vinsertps   xmm0,xmm0,dword ptr [rsp+28h],30h
                    28 30
  00000001400010D0: C5 F8 11 01        vmovups     xmmword ptr [rcx],xmm0
  00000001400010D4: 48 8B C1           mov         rax,rcx
  00000001400010D7: C3                 ret

So what is the difference? In the first case the Sk4f constructor is completely inlined and contains only SSE instructions. In the second case the method body looks fine, but the Sk4f constructor is not inlined. If we look at the constructor code (listed above), it is built with AVX instructions. So our SSE-only code no longer works on CPUs without the AVX instruction set, and this depends entirely on the order of the object files passed to the linker.

Update: In the CEF build I got the crash in exactly the Sk4f constructor, which looks very similar.

magreenblatt commented 8 years ago

Original comment by Dmitry Azaraev (Bitbucket: dmitry-azaraev, GitHub: dmitry-azaraev).


This script re-sorts the object files in the libcef.ninja file. I built libcef with 2015U3 using this order and it appears to work (cefclient runs on a non-AVX host).

The script moves the following files to the end of the list:

obj/third_party/libjpeg_turbo/simd_asm/jfdctflt-sse-64.o
obj/media/base/media_yasm/convert_yuv_to_rgb_sse.o
obj/media/base/media_yasm/linear_scale_yuv_to_rgb_sse.o
obj/media/base/media_yasm/scale_yuv_to_rgb_sse.o
obj/skia/skia_opts/SkBitmapFilter_opts_SSE2.obj
obj/skia/skia_opts/SkBitmapProcState_opts_SSE2.obj
obj/skia/skia_opts/SkBlitRow_opts_SSE2.obj
obj/third_party/libpng/libpng_sources/filter_sse2_intrinsics.obj
obj/third_party/libjpeg_turbo/simd_asm/jccolor-sse2-64.o
obj/third_party/libjpeg_turbo/simd_asm/jcgray-sse2-64.o
obj/third_party/libjpeg_turbo/simd_asm/jchuff-sse2-64.o
obj/third_party/libjpeg_turbo/simd_asm/jcsample-sse2-64.o
obj/third_party/libjpeg_turbo/simd_asm/jdcolor-sse2-64.o
obj/third_party/libjpeg_turbo/simd_asm/jdmerge-sse2-64.o
obj/third_party/libjpeg_turbo/simd_asm/jdsample-sse2-64.o
obj/third_party/libjpeg_turbo/simd_asm/jfdctfst-sse2-64.o
obj/third_party/libjpeg_turbo/simd_asm/jfdctint-sse2-64.o
obj/third_party/libjpeg_turbo/simd_asm/jidctflt-sse2-64.o
obj/third_party/libjpeg_turbo/simd_asm/jidctfst-sse2-64.o
obj/third_party/libjpeg_turbo/simd_asm/jidctint-sse2-64.o
obj/third_party/libjpeg_turbo/simd_asm/jidctred-sse2-64.o
obj/third_party/libjpeg_turbo/simd_asm/jquantf-sse2-64.o
obj/third_party/libjpeg_turbo/simd_asm/jquanti-sse2-64.o
obj/media/base/base/convert_rgb_to_yuv_sse2.obj
obj/media/base/base/filter_yuv_sse2.obj
obj/media/base/media_yasm/scale_yuv_to_rgb_sse2_x64.o
obj/third_party/libvpx/libvpx_yasm/copy_sse2.o
obj/third_party/libvpx/libvpx_yasm/idctllm_sse2.o
obj/third_party/libvpx/libvpx_yasm/iwalsh_sse2.o
obj/third_party/libvpx/libvpx_yasm/loopfilter_block_sse2_x86_64.o
obj/third_party/libvpx/libvpx_yasm/loopfilter_sse2.o
obj/third_party/libvpx/libvpx_yasm/mfqe_sse2.o
obj/third_party/libvpx/libvpx_yasm/postproc_sse2.o
obj/third_party/libvpx/libvpx_yasm/recon_sse2.o
obj/third_party/libvpx/libvpx_yasm/subpixel_sse2.o
obj/third_party/libvpx/libvpx_yasm/dct_sse2.o
obj/third_party/libvpx/libvpx_yasm/fwalsh_sse2.o
obj/third_party/libvpx/libvpx_yasm/vp9_mfqe_sse2.o
obj/third_party/libvpx/libvpx_yasm/vp9_postproc_sse2.o
obj/third_party/libvpx/libvpx_yasm/vp9_dct_sse2.o
obj/third_party/libvpx/libvpx_yasm/vp9_error_sse2.o
obj/third_party/libvpx/libvpx_yasm/vp9_temporal_filter_apply_sse2.o
obj/third_party/libvpx/libvpx_yasm/add_noise_sse2.o
obj/third_party/libvpx/libvpx_yasm/halfpix_variance_impl_sse2.o
obj/third_party/libvpx/libvpx_yasm/intrapred_sse2.o
obj/third_party/libvpx/libvpx_yasm/inv_wht_sse2.o
obj/third_party/libvpx/libvpx_yasm/sad4d_sse2.o
obj/third_party/libvpx/libvpx_yasm/sad_sse2.o
obj/third_party/libvpx/libvpx_yasm/subpel_variance_sse2.o
obj/third_party/libvpx/libvpx_yasm/subtract_sse2.o
obj/third_party/libvpx/libvpx_yasm/vpx_convolve_copy_sse2.o
obj/third_party/libvpx/libvpx_yasm/vpx_subpixel_8t_sse2.o
obj/third_party/libvpx/libvpx_yasm/vpx_subpixel_bilinear_sse2.o
obj/third_party/libwebp/libwebp_dsp_sse2/alpha_processing_sse2.obj
obj/third_party/libwebp/libwebp_dsp_sse2/argb_sse2.obj
obj/third_party/libwebp/libwebp_dsp_sse2/cost_sse2.obj
obj/third_party/libwebp/libwebp_dsp_sse2/dec_sse2.obj
obj/third_party/libwebp/libwebp_dsp_sse2/enc_sse2.obj
obj/third_party/libwebp/libwebp_dsp_sse2/filters_sse2.obj
obj/third_party/libwebp/libwebp_dsp_sse2/lossless_enc_sse2.obj
obj/third_party/libwebp/libwebp_dsp_sse2/lossless_sse2.obj
obj/third_party/libwebp/libwebp_dsp_sse2/rescaler_sse2.obj
obj/third_party/libwebp/libwebp_dsp_sse2/upsampling_sse2.obj
obj/third_party/libwebp/libwebp_dsp_sse2/yuv_sse2.obj
obj/third_party/qcms/qcms/transform-sse2.obj
obj/third_party/libvpx/libvpx_intrinsics_sse2.lib

obj/skia/skia_opts_sse3/SkBitmapProcState_opts_SSSE3.obj
obj/skia/skia_opts_sse3/SkOpts_ssse3.obj
obj/media/base/base/convert_rgb_to_yuv_ssse3.obj
obj/media/base/media_yasm/convert_rgb_to_yuv_ssse3.o
obj/third_party/libvpx/libvpx_yasm/copy_sse3.o
obj/third_party/libvpx/libvpx_yasm/subpixel_ssse3.o
obj/third_party/libvpx/libvpx_yasm/vp9_quantize_ssse3_x86_64.o
obj/third_party/libvpx/libvpx_yasm/avg_ssse3_x86_64.o
obj/third_party/libvpx/libvpx_yasm/fwd_txfm_ssse3_x86_64.o
obj/third_party/libvpx/libvpx_yasm/intrapred_ssse3.o
obj/third_party/libvpx/libvpx_yasm/inv_txfm_ssse3_x86_64.o
obj/third_party/libvpx/libvpx_yasm/quantize_ssse3_x86_64.o
obj/third_party/libvpx/libvpx_yasm/sad_sse3.o
obj/third_party/libvpx/libvpx_yasm/sad_ssse3.o
obj/third_party/libvpx/libvpx_yasm/vpx_subpixel_8t_ssse3.o
obj/third_party/libvpx/libvpx_yasm/vpx_subpixel_bilinear_ssse3.o
obj/third_party/libvpx/libvpx_intrinsics_ssse3.lib
obj/skia/skia_opts_sse41/SkOpts_sse41.obj
obj/skia/skia_opts_sse42/SkForceCPlusPlusLinking.obj
obj/third_party/libvpx/libvpx_yasm/sad_sse4.o
obj/third_party/libwebp/libwebp_dsp_sse41/alpha_processing_sse41.obj
obj/third_party/libwebp/libwebp_dsp_sse41/dec_sse41.obj
obj/third_party/libwebp/libwebp_dsp_sse41/enc_sse41.obj
obj/third_party/libwebp/libwebp_dsp_sse41/lossless_enc_sse41.obj
obj/third_party/libvpx/libvpx_intrinsics_sse4_1.lib
obj/skia/skia_opts_avx/SkOpts_avx.obj
obj/third_party/libvpx/libvpx_yasm/quantize_avx_x86_64.o
obj/third_party/libvpx/libvpx_intrinsics_avx.lib
obj/skia/skia_opts_avx2/SkForceCPlusPlusLinking.obj
obj/third_party/boringssl/boringssl_asm/rsaz-avx2.o
obj/third_party/libwebp/libwebp_dsp/enc_avx2.obj
obj/third_party/libvpx/libvpx_intrinsics_avx2.lib

magreenblatt commented 8 years ago

To clarify, this bug is not triggered if the SSE object files are included before the AVX object files.

As Dmitry describes above, he created a build after ordering the list of files in obj/cef/libcef.ninja. The bug was not triggered when the obj files were ordered as: generic, sse, sse2, sse3, sse4, avx, avx2.
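
The ordering described above (generic, sse, sse2, sse3, sse4, avx, avx2) can be sketched as a stable sort keyed on the instruction set named in each object file's path. This is not the script attached to this issue. The function names and the token-based classification are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

// Rank of the instruction set suggested by an object file's path:
// 0 generic, 1 SSE, 2 SSE2, 3 SSE3/SSSE3, 4 SSE4.x, 5 AVX, 6 AVX2.
// The path is split on non-alphanumeric characters and the highest-ranked
// token wins, so "assert.obj" is not mistaken for an SSE file.
int IsaRank(const std::string& path) {
  int best = 0;
  std::string token;
  auto classify = [&best](const std::string& t) {
    int rank = 0;
    if (t == "avx2") rank = 6;
    else if (t == "avx") rank = 5;
    else if (t.compare(0, 4, "sse4") == 0) rank = 4;  // sse4, sse41, sse42
    else if (t == "sse3" || t == "ssse3") rank = 3;
    else if (t == "sse2") rank = 2;
    else if (t == "sse") rank = 1;
    best = std::max(best, rank);
  };
  for (char c : path) {
    const unsigned char uc = static_cast<unsigned char>(c);
    if (std::isalnum(uc)) {
      token += static_cast<char>(std::tolower(uc));
    } else {
      classify(token);
      token.clear();
    }
  }
  classify(token);
  return best;
}

// Reorders the linker inputs so lower instruction sets come first.
// stable_sort keeps the original relative order within each rank.
void OrderForLinker(std::vector<std::string>& objects) {
  std::stable_sort(objects.begin(), objects.end(),
                   [](const std::string& a, const std::string& b) {
                     return IsaRank(a) < IsaRank(b);
                   });
}
```

A stable sort is used so that files with the same rank keep the order the build system generated, which limits the change to what the workaround needs.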

We think this bug is not triggered in Chrome either because Chrome uses PGO, or because the Chrome ninja files just happen to include the SSE object files first. The Chrome versions that currently build with Update 3 are canary and master.

magreenblatt commented 8 years ago

Original comment by Dmitry Azaraev (Bitbucket: dmitry-azaraev, GitHub: dmitry-azaraev).


I filed an issue on Microsoft Connect: MSVC 19.00.24215.1 generates wrong SSE/AVX mixed code with LTCG

magreenblatt commented 8 years ago

Issue #1998 was marked as a duplicate of this issue.

magreenblatt commented 8 years ago

Related Chromium issue: https://bugs.chromium.org/p/chromium/issues/detail?id=654213

magreenblatt commented 7 years ago

Workaround added in 2840 branch revision 175be9a, 2883 branch revision 3a77b24 and master branch revision f7a4102.

magreenblatt commented 7 years ago

Original comment by amaitland (Bitbucket: amaitland, GitHub: amaitland).


Testing with 3.2840.1493 (before this change had been applied), the computer I was previously having issues with worked perfectly. It looks like the issue was resolved in Chromium; just mentioning this as an FYI.

magreenblatt commented 7 years ago

Original comment by Dmitry Azaraev (Bitbucket: dmitry-azaraev, GitHub: dmitry-azaraev).


In Chromium they only worked around this by applying some forced inlines in Skia, whereas this workaround produces a stable and still efficient result with the current compiler, without changing sources across the whole codebase and without having to inspect it. Once Chromium (mainly its third-party libraries) provides C++-compliant (program-wide ODR-violation-free) implementations, it will be safe to disable it.
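
The forced-inline approach mentioned above can be sketched as follows. `Vec4f` is a hypothetical stand-in for Skia's Sk4f (the real class wraps an `__m128`), and the macro name is an assumption. The idea is that if the constructor is always expanded at the call site, the caller does not depend on an out-of-line copy that the linker might take from the AVX object file.

```cpp
// Portable spelling of "always inline"; the macro name is hypothetical.
#if defined(_MSC_VER)
#define ALWAYS_INLINE __forceinline
#else
#define ALWAYS_INLINE inline __attribute__((always_inline))
#endif

// Hypothetical stand-in for Sk4f, using plain floats instead of __m128.
struct Vec4f {
  float lanes[4];

  // Expanded in each caller, so it is compiled with that caller's
  // instruction-set settings instead of being called out of line.
  ALWAYS_INLINE Vec4f(float a, float b, float c, float d)
      : lanes{a, b, c, d} {}

  ALWAYS_INLINE float operator[](int i) const { return lanes[i]; }
};

// Example caller: builds a vector and sums its lanes.
float SumLanes() {
  Vec4f v(1.0f, 2.0f, 3.0f, 4.0f);
  return v[0] + v[1] + v[2] + v[3];
}
```

`__forceinline` is a strong request and not a guarantee, so this reduces the exposure without removing the underlying ODR issue.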
