dthuerck / mapmap_cpu

A high-performance general-purpose MRF MAP solver, heavily exploiting SIMD instructions.
BSD 3-Clause "New" or "Revised" License

crash when used as face-view labeling #22

Open cdcseacave opened 4 years ago

cdcseacave commented 4 years ago

I tried to use it as demonstrated in mvs_texturing, but for some meshes it crashes:

[image]

I can send the mesh, but I do not see an easy way to dump the mapmap problem to a file to send you the problem.

xubury commented 4 years ago

Hey, I ran into the exact same problem as yours. Did you work it out eventually?

xubury commented 4 years ago

> but for some meshes it crashes:

For me, I haven't found a single working mesh...

dthuerck commented 3 years ago

Hi there,

I'd be happy to investigate that. If you send me the mesh and instructions to replicate, I'll take a shot at it (at least under Linux).

xubury commented 3 years ago

> Hi there,
>
> I'd be happy to investigate that. If you send me the mesh and instructions to replicate, I'll take a shot at it (at least under Linux).

Hi, this is one of my meshes that keeps getting the segmentation fault. I have tried it on Linux (Ubuntu) and it works fine there.
https://drive.google.com/file/d/1nJ9ZDvfvBihSqb-oXHONyZC1fqiuvODk/view?usp=sharing

I have been trying to debug it for several days now. I suspect it's related to an array alignment problem like the one in the link below.
https://stackoverflow.com/questions/62324009/mm256-load-ps-cause-segmentation-fault-with-google-benchmark-in-debug-mode

BTW, I am using MinGW-w64 GCC (10.2.0) as the compiler on Windows 10.

xubury commented 3 years ago

Actually, you may not need the mesh. I ran into the same issue in mapmap_demo or mapmap_test. It always crashed here:

[screenshot]

I can't figure out why this would trigger a segmentation fault. After some tries, I changed the line a little bit:

diff --git a/mapmap/source/optimizer_instances/dp_node_solver.impl.h b/mapmap/source/optimizer_instances/dp_node_solver.impl.h
index 63498f8..5f82c87 100644
--- a/mapmap/source/optimizer_instances/dp_node_solver.impl.h
+++ b/mapmap/source/optimizer_instances/dp_node_solver.impl.h
@@ -246,7 +246,8 @@ get_independent_of_parent_costs(
         label_from_offset(node_id, 0);

     /* vector holding costs */
-    _v_t<COSTTYPE, SIMDWIDTH> cost = v_init<COSTTYPE, SIMDWIDTH>();
+    _s_t<COSTTYPE, SIMDWIDTH> test = 0;
+    _v_t<COSTTYPE, SIMDWIDTH> cost = v_init<COSTTYPE, SIMDWIDTH>(test);

     /* retrieve label vector for this offset */
     _iv_t<COSTTYPE, SIMDWIDTH> l = m_node->c_labels->

Then I no longer got the segmentation fault on that line. Instead, it crashed a few lines below:

    cost = c_unary->supports_enumerable_costs() ?
        v_add<COSTTYPE, SIMDWIDTH>(cost,
        c_unary->get_unary_costs_enum_offset(l_i)) :
        v_add<COSTTYPE, SIMDWIDTH>(cost, c_unary->get_unary_costs(l));

[screenshots]

dthuerck commented 3 years ago

Interesting - so far, I seem to be unable to reproduce it. Can you name me the CPU you're using so I can narrow this down to the right instruction set (AVX/2, SSE, ...)?

dthuerck commented 3 years ago

One further thing to try would be to edit all v_load variants in source/vector_math.impl.h so they always use the unaligned intrinsics (loadu). I'll see if I can get my hands on a Windows machine to test that.
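To make the aligned/unaligned distinction concrete: aligned AVX loads such as `_mm256_load_ps` expect a 32-byte-aligned address, while `_mm256_loadu_ps` accepts any address. The following is a minimal, portable sketch of an alignment check that could be dropped in front of a suspicious load while debugging. The helper names `is_aligned` and `safe_for_aligned_avx_load` are hypothetical and not part of mapmap.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Returns true if `ptr` is a multiple of `alignment` bytes.
// `alignment` must be a non-zero power of two.
inline bool is_aligned(const void* ptr, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (reinterpret_cast<std::uintptr_t>(ptr) & (alignment - 1)) == 0;
}

// Hypothetical helper: true if the address `p` satisfies the 32-byte
// alignment that an aligned 256-bit (8 x float) load expects.
inline bool safe_for_aligned_avx_load(const float* p) {
    return is_aligned(p, 32);
}
```

If such a check fails for a stack variable, the load itself is not at fault; the storage was simply not placed on a 32-byte boundary.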

xubury commented 3 years ago

> Interesting - so far, I seem to be unable to reproduce it. Can you name me the CPU you're using so I can narrow this down to the right instruction set (AVX/2, SSE, ...)?

My CPU is Intel(R) Core(TM) i3-8100 CPU @ 3.60GHz (4 CPUs), ~3.6GHz

xubury commented 3 years ago

> One further thing to try would be to edit all v_load variants in source/vector_math.impl.h so they always use the unaligned intrinsics (loadu). I'll see if I can get my hands on a Windows machine to test that.

Just tested it, no luck though.

xubury commented 3 years ago

Surprisingly, I tested mapmap_demo compiled with MSVC and got NO segfault! Though, as https://github.com/dthuerck/mapmap_cpu/issues/21 mentions, I had to change the line in prev_level(). I was using MinGW GCC 10.2.0 as the compiler; maybe the TBB provided by MSYS2's package manager has some issues. I will try to rebuild it to see if the problem still exists.

Update: I just rebuilt TBB with MinGW GCC and still get the segmentation fault...

xubury commented 3 years ago

I can confirm that my bug is related to the compiler; there have been several discussions about it:
https://stackoverflow.com/questions/30928265/mingw64-is-incapable-of-32-byte-stack-alignment-required-for-avx-on-windows-x64
https://sourceforge.net/p/mingw-w64/mailman/message/34485783/

dthuerck commented 3 years ago

Ah, too bad, it's a compiler bug. Are you able to switch to VS? I'll push the change in #21 upstream.

Regarding this problem, I only see two possibilities going forward:

  1. Avoid MinGW ;)
  2. Move variables from stack to heap.
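
Option 2 could look roughly like the sketch below: instead of relying on the compiler to align a stack variable to 32 bytes, the storage is requested from the heap with an explicit alignment via C++17 aligned `operator new`. The names `kAvxAlignment`, `AlignedDelete`, `AlignedFloats` and `make_aligned_floats` are hypothetical, and whether this actually sidesteps the MinGW-w64 problem inside mapmap is untested here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <new>

// Assumption: 32 bytes, the alignment an aligned 256-bit AVX load/store expects.
constexpr std::size_t kAvxAlignment = 32;

// Deleter matching the aligned operator new used below.
struct AlignedDelete {
    void operator()(float* p) const noexcept {
        ::operator delete(p, std::align_val_t(kAvxAlignment));
    }
};

using AlignedFloats = std::unique_ptr<float[], AlignedDelete>;

// Hypothetical helper: allocate `count` zero-initialized floats on the heap
// with 32-byte alignment (requires C++17 aligned new).
inline AlignedFloats make_aligned_floats(std::size_t count) {
    void* raw = ::operator new(count * sizeof(float),
                               std::align_val_t(kAvxAlignment));
    float* data = static_cast<float*>(raw);
    std::uninitialized_fill_n(data, count, 0.0f);
    assert(reinterpret_cast<std::uintptr_t>(data) % kAvxAlignment == 0);
    return AlignedFloats(data);
}
```

The trade-off is an allocation per buffer, so in a hot loop the buffer would have to be allocated once and reused.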

xubury commented 3 years ago

> Ah, too bad, it's a compiler bug. Are you able to switch to VS? I'll push the change in #21 upstream.
>
> Regarding this problem, I only see two possibilities going forward:
>
> 1. Avoid MinGW ;)
> 2. Move variables from stack to heap.

Yeah, but I was able to avoid the problem with MinGW by setting the SIMD width to 4. I don't know how that affects the final result, though.
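A plausible reason why width 4 helps: the Windows x64 calling convention keeps the stack 16-byte aligned, and a vector of 4 floats is exactly 16 bytes, whereas 8 floats need 32. The sketch below only does that arithmetic. It assumes the cost type is `float` and that width 4 maps to 128-bit vectors; the names `kWin64StackAlignment`, `vector_bytes` and `fits_default_stack_alignment` are hypothetical.

```cpp
#include <cstddef>

// Stack alignment guaranteed by the Windows x64 calling convention.
constexpr std::size_t kWin64StackAlignment = 16;

// Size in bytes of one SIMD vector holding `simd_width` values of type T.
template <typename T>
constexpr std::size_t vector_bytes(std::size_t simd_width) {
    return simd_width * sizeof(T);
}

// True if a vector of that size needs no more than the guaranteed
// 16-byte stack alignment.
template <typename T>
constexpr bool fits_default_stack_alignment(std::size_t simd_width) {
    return vector_bytes<T>(simd_width) <= kWin64StackAlignment;
}

static_assert(fits_default_stack_alignment<float>(4),
              "4 floats = 16 bytes: the default stack alignment suffices");
static_assert(!fits_default_stack_alignment<float>(8),
              "8 floats = 32 bytes: needs extra stack alignment");
```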

melhashash commented 1 year ago

@xubury @dthuerck I am facing the same issue reported here by @cdcseacave. I packed the problem to a bin file in this issue https://github.com/dthuerck/mapmap_cpu/issues/37#issuecomment-1218231886

melhashash commented 1 year ago

I think I might have found the reason: it looks like the "labels" within the "label_set" are assumed to be sorted. If not, bad things happen. Correct?

melhashash commented 1 year ago

Solved by making sure labels are sorted.
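
For anyone hitting the same crash, the workaround amounts to something like the sketch below: sort each node's labels before handing them to the label set. This is only an illustration on a plain `std::vector`; the `Label` type and the `sorted_labels` helper are hypothetical, and no mapmap API is used.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Assumption: label ids are unsigned 32-bit integers.
using Label = std::uint32_t;

// Hypothetical helper: returns the node's labels in ascending order,
// to be called before the labels are passed on to the solver.
inline std::vector<Label> sorted_labels(std::vector<Label> labels) {
    std::sort(labels.begin(), labels.end());
    assert(std::is_sorted(labels.begin(), labels.end()));
    return labels;
}
```

In a face-view labeling setup, where labels for a face are typically collected in whatever order the views are visited, this would be applied per face just before the label set is filled.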

cc: @cdcseacave in case you are still interested in trying it again.

dthuerck commented 1 year ago

Thanks for investigating!