jdryg / vg-renderer

A vector graphics renderer for bgfx, based on ideas from NanoVG and ImDrawList (Dear ImGUI)
BSD 2-Clause "Simplified" License
503 stars 55 forks source link

Segmentation fault in stroker.cpp vg::addPosColor #10

Closed hugoam closed 6 years ago

hugoam commented 6 years ago

A few people have been reporting a segmentation faults when using toy/mud, on some specific platforms, happening inside stroker.cpp. https://github.com/hugoam/mud-sample/issues/1 This error dates from before you made some changes to that code recently, but I also have a more recent stacktrace from after your recent changes to stroker.cpp (see after). The first one seems like a Ubuntu 16.04 64 bits build, and the second is a macOS 64 bits build.

It's hard for me to guess what's going wrong (there's a looot of potentially unsafe manual pointer operations in there), but my first feeling is that for something like that to happen, while mud otherwise works fine on other platforms/architecture, it's probably a problem internal to vg-renderer and not an issue in the calling code. But it's hard to know for sure though, so I'd be happy to get your input on this :)

Do you know how we could investigate why this error is happening ?

2018-08-10 01:13:02.878388-0400 00_cube_d[38160:1599409] ../../../mud/3rdparty/bgfx/src/renderer_gl.cpp (4439): BGFX attr a_texcoord0: 1 Process 38160 stopped * thread #1, name = 'bgfx - renderer backend thread', queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=EXC_I386_GPFLT) frame #0: 0x000000010183cc0f 00_cube_d`void vg::addPosColor<3u>(stroker=0x0000000103347e80, srcPos=0x00007ffeefbfe070, srcColor=0x00007ffeefbfe0f0) at stroker.cpp:2560
   2557                 *dstPos128 = *srcPos128; // x0, y0, x1, y1
   2558                 *dstColor64 = *srcColor64; // c0, c1
   2559         } else if (N == 3) {
-> 2560                 *dstPos128 = *srcPos128; // x0, y0, x1, y1
   2561                 dstPos64[2] = srcPos64[2]; // x2, y2
   2562                 *dstColor64 = *srcColor64; // c0, c1
   2563                 dstColor[2] = srcColor[2]; // c2
Target 0: (00_cube_d) stopped.
(lldb) bt
* thread #1, name = 'bgfx - renderer backend thread', queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=EXC_I386_GPFLT)
  * frame #0: 0x000000010183cc0f 00_cube_d`void vg::addPosColor<3u>(stroker=0x0000000103347e80, srcPos=0x00007ffeefbfe070, srcColor=0x00007ffeefbfe0f0) at stroker.cpp:2560 frame #1: 0x000000010183546e 00_cube_d`void vg::polylineStrokeAAThin<(vg::LineCap::Enum)0, (vg::LineJoin::Enum)0>(stroker=0x0000000103347e80, mesh=0x00007ffeefbfe1c8, vtx=0x0000000103618500, numPathVertices=4, color=4281611316, closed=true) at stroker.cpp:2332
    frame #2: 0x0000000101834b8e 00_cube_d`vg::strokerPolylineStrokeAAThin(stroker=0x0000000103347e80, mesh=0x00007ffeefbfe1c8, vertexList=0x0000000103618500, numPathVertices=4, isClosed=true, color=4281611316, lineCap=Butt, lineJoin=Miter) at stroker.cpp:319 frame #3: 0x000000010184cf12 00_cube_d`vg::ctxStrokePathColor(ctx=0x0000000109b80000, color=4281611316, width=1, flags=16) at vg.cpp:3213
    frame #4: 0x0000000101850bdb 00_cube_d`vg::ctxSubmitCommandList(ctx=0x0000000109b80000, handle=(idx = 1)) at vg.cpp:4191 frame #5: 0x000000010184539e 00_cube_d`vg::submitCommandList(ctx=0x0000000109b80000, handle=(idx = 1)) at vg.cpp:1716
    frame #6: 0x0000000100b9ce05 00_cube_d`mud::VgVg::draw_layer(this=0x00000001033144a0, layer=0x000000011e620cc0, position=0x00000001019bca98, scale=1)0> const&, float) at VgVg.cpp:374 frame #7: 0x0000000100c6d648 00_cube_d`mud::Typed<std::__1::vector<mud::Index*, std::__1::allocator<mud::Index*> > >::type() - 18446744069401553335
    frame #8: 0x0000000100c6d5ed 00_cube_d`mud::Typed<std::__1::vector<mud::Index*, std::__1::allocator<mud::Index*> > >::type() - 18446744069401553426 frame #9: 0x0000000100c6d579 00_cube_d`mud::Typed<std::__1::vector<mud::Index*, std::__1::allocator<mud::Index*> > >::type() - 18446744069401553542
    frame #10: 0x0000000100c41158 00_cube_d`std::__1::function<void (mud::Layer&)>::operator(this=0x00007ffeefbfea40, __arg=0x000000011e620cc0)(mud::Layer&) const at functional:1916 frame #11: 0x0000000100c40f63 00_cube_d`mud::Layer::visit(this=0x000000011e620cc0, visitor=0x00007ffeefbfea40)> const&) at Layer.cpp:72
    frame #12: 0x0000000100c410e6 00_cube_d`mud::Layer::visit(this=0x000000010581a300, visitor=0x00007ffeefbfea40)> const&) at Layer.cpp:75 frame #13: 0x0000000100c65a0a 00_cube_d`mud::Typed<std::__1::vector<mud::Index*, std::__1::allocator<mud::Index*> > >::type() [inlined] std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >::basic_string(this="\fb\x1e\x01\0\0\0e\x01, __s="p7a\x1e\x01") - 18446744069401585141
    frame #14: 0x0000000100cb10e2 00_cube_d`mud::UiWindow::render_frame(this=0x00000001033467a0) at UiWindow.cpp:157 frame #15: 0x00000001000d41b4 00_cube_d`mud::Shell::pump(this=0x00007ffeefbfec40) at Shell.cpp:85
    frame #16: 0x00000001000d4044 00_cube_d`mud::Shell::run(this=0x00007ffeefbfec40, func=0x00007ffeefbfec10, iterations=0)> const&, unsigned long) at Shell.cpp:77 frame #17: 0x000000010008e667 00_cube_d`main(argc=1, argv=0x00007ffeefbff6e0) at 00_cube.cpp:57
hugoam commented 6 years ago

Some update on the investigation from the user that had the second stacktrace : setting VG_CONFIG_ENABLE_SIMD=0 fixes the problem. So it's most probably an issue in vg SIMD implementation ?

jdryg commented 6 years ago

Thanks for the report.

Short answer: Yes, this code seems to be problematic on GCC/Clang (tested on godbolt) because assigning an m128 to an m128 produces an aligned store and the memory isn't guaranteed to be aligned at that point. MSVC seems to always produce unaligned loads/stores that's why I haven't seen the bug.

Long answer: This code is my (apparently failed) attempt to convice the compiler to avoid emmiting memcpy's for small number of vertices. I think the main problem is bx::memCopy() which cannot be turned into an inlined memcpy. I either have to fix the __m128 loads/stores to explicetely use _mm_loadu_ps/_mm_storeu_ps or replace bx::memCopy with memcpy hoping it will be inlined by the compiler in the small cases of N.

Will try to fix asap and report back for further testing on your side.

jdryg commented 6 years ago

Decided to replace bx::memCopy with the compiler's memcpy and removed the special cases from all the related functions (addPos(), addPosColor() and addIndices()).

Please check the latest code and report back as soon as you find the time.

For the record, I tested it with my game with no noticable performance difference, so I don't know what I saw that made me write that code in the first place :) Most probably I saw calls to memcpy/bx::memMove/MoveSmall in the profiler and I tried to eliminate them.

hugoam commented 6 years ago

Thanks for the quick fix :) I'll report back, but since the memcpy path (with VG_CONFIG_ENABLE_SIMD=0) apparently wasn't failing, the fix should work !

hugoam commented 6 years ago

I think we can close this issue now