Closed stephaneburel-cea closed 3 years ago
Hello, I cannot reproduce this issue with the latest version of N2D2. Could you check the latest commit that you used?
Cheers, Olivier
Hello. Sadly, the problem persists even with the latest commit. Regards, Stéphane Burel
Could you send me (or give me the location of) the full generated export_CPP_int8 folder that causes the segfault?
I have the same problem: the -O3 or -O2 build crashes when run from the terminal. strace gives:
clone(child_stack=0x7f9f3345df30, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x7f9f3345e9d0, tls=0x7f9f3345e700, child_tidptr=0x7f9f3345e9d0) = 2492
mmap(NULL, 8392704, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0) = 0x7f9f3245d000
mprotect(0x7f9f3245d000, 4096, PROT_NONE) = 0
clone(child_stack=0x7f9f32c5cf30, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x7f9f32c5d9d0, tls=0x7f9f32c5d700, child_tidptr=0x7f9f32c5d9d0) = 2493
futex(0x26cb5a4, FUTEX_WAKE_PRIVATE, 2147483647) = 0
futex(0x26cbfc4, FUTEX_WAKE_PRIVATE, 2147483647) = 0
futex(0x26cb5a4, FUTEX_WAKE_PRIVATE, 2147483647) = 0
futex(0x26cb5a4, FUTEX_WAKE_PRIVATE, 2147483647) = 0
futex(0x26cb5a4, FUTEX_WAKE_PRIVATE, 2147483647) = 0
futex(0x26cb5a4, FUTEX_WAKE_PRIVATE, 2147483647) = ? <unavailable>
There are 7 clone syscalls (on an 8-core machine), so I suspect the crash happens immediately after the OpenMP threads start.
It is very tricky: when running the release version under gdb there is no crash, only 0% correct classification. When compiled without optimization and in debug mode (-O0 -g), it works fine.
It seems like the total memory reported by the export's memory manager is wrong in some cases.
In the generated export_CPP_int8/src/NetworkPropagate.cpp, can you check the value of the #define MEMORY_SIZE?
It should be 1017856 for this network. If it is less, you are encountering the same issue as @stephaneburel-cea.
Sorry, I forgot to mention: I have this problem on another network (1D sound) with the float32 export.
I compiled without OpenMP and without -march=native, with -O2 and -g. I dumped the core and loaded it with gdb. The error seems to be in:
#0 N2D2::Network::poolcellPropagate<32, 1, 1990, 32, 1, 995, 0, 0, 1, 2, 1, 2, (Pooling_T)0, (ActivationFunction_T)7, 71768, 56072, 0, 7608, 32, 7608, 31840, 0, 0, 32, float, float> (
outputs=0x25158e0 <mem+30432>, inputs=0x2554360 <mem+287072>, this=<optimized out>)
at ./include/Network.hpp:988
988 if (inputs[iOffset + output + sx * INPUT_MEM_STRIDE]
(gdb) bt
#0 N2D2::Network::poolcellPropagate<32, 1, 1990, 32, 1, 995, 0, 0, 1, 2, 1, 2, (Pooling_T)0, (ActivationFunction_T)7, 71768, 56072, 0, 7608, 32, 7608, 31840, 0, 0, 32, float, float> (
outputs=0x25158e0 <mem+30432>, inputs=0x2554360 <mem+287072>, this=<optimized out>)
at ./include/Network.hpp:988
#1 N2D2::Network::propagate<float> (this=<optimized out>, inputs=<optimized out>,
outputs=0x3769950) at src/NetworkPropagate.cpp:275
Hi, could you send me the exported project? There is probably an issue with the memory manager, which is the trickiest part of the export...
As you mention, it seems that the memory size is computed incorrectly. It appears to be around 100 bytes less than needed, which would explain the random crashes: without optimization, the memory allocations are most likely padded in the heap, or the area around the allocated buffer is not used. I'll send you the exported project.
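To illustrate the kind of check that would catch an undersized memory pool, here is a minimal sketch. It is not N2D2's memory manager: `BufferSpan`, `requiredPoolSize` and `fitsInPool` are hypothetical names, and the sketch assumes each layer buffer is a plain contiguous range inside a single shared pool (no wrapping).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of one layer buffer inside a shared memory pool.
struct BufferSpan {
    std::size_t offset; // start of the buffer, in elements from the pool base
    std::size_t size;   // number of contiguous elements the layer touches
};

// Smallest pool size (in elements) that contains every span.
inline std::size_t requiredPoolSize(const std::vector<BufferSpan>& spans) {
    std::size_t required = 0;
    for (const BufferSpan& s : spans) {
        const std::size_t end = s.offset + s.size;
        if (end > required) {
            required = end;
        }
    }
    return required;
}

// True when every span lies entirely inside a pool of poolSize elements.
inline bool fitsInPool(const std::vector<BufferSpan>& spans,
                       std::size_t poolSize) {
    assert(poolSize > 0);
    return requiredPoolSize(spans) <= poolSize;
}
```

If such a check fails by even a few elements, reads and writes past the end of the pool are undefined behavior, which is consistent with a crash that appears or disappears depending on the optimization level.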
I think I have a fix for your issue, which is due to a bad handling of memory wrapping. Here is a patch that you can try. N2D2 needs to be patched, recompiled and the export must be regenerated.
This issue is not related to the initial issue of this post.
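For readers unfamiliar with memory wrapping, the idea can be sketched as follows. This is only an illustration under assumptions, not the code touched by the patch: `wrappedIndex` is a hypothetical helper, and the pool is assumed to be circular, so a buffer that runs past the end continues at the beginning.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: map the i-th element of a buffer that starts at
// `offset` in a circular pool of `poolSize` elements to a physical index.
// Elements that would fall past the end of the pool wrap back to index 0.
inline std::size_t wrappedIndex(std::size_t offset, std::size_t i,
                                std::size_t poolSize) {
    assert(poolSize > 0 && offset < poolSize && i < poolSize);
    const std::size_t linear = offset + i;
    return (linear >= poolSize) ? linear - poolSize : linear;
}
```

If the wrap is skipped or the wrapped part is not counted when sizing the pool, accesses land outside the allocated memory.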
Actually, I finally pushed the patch in the latest commit. @andreistoian please use the new issue I created for your specific issue: #84
No activity, so I am closing this issue, which may be solved by the latest commits. Please re-open it if the issue still arises in the latest version.
Hello. I am reporting a bug with the exported CPP model. Using ResNet_ONNX.ini, available in the repository, and the ONNX model available at https://s3.amazonaws.com/onnx-model-zoo/resnet/resnet18v1/resnet18v1.onnx, the following command exports the model to CPP:
n2d2.sh "$N2D2_MODELS/ResNet_ONNX.ini" -seed 1 -w /dev/null -export CPP -calib -1 -nbbits 8 -act-rescaling-mode Floating-point -no-unsigned
The export is successful, and so is the compilation of the exported model.
But a segfault occurs on execution: ./run_export
gives: Segmentation fault (core dumped)
Best regards, Stéphane Burel