uav4geo / OpenPointClass

Fast and memory efficient semantic segmentation of 3D point clouds. Runs on Windows, Mac and Linux.
GNU Affero General Public License v3.0

Segmentation fault #2

Closed · HeDo88TH closed this 1 year ago

HeDo88TH commented 1 year ago

Input files: https://hub.dronedb.app/r/hedo88/err-opc

Command: ./pcclassify ../point_cloud.ply out.ply ../model.bin

Loading ../model.bin
Reading 1060056 points
Starting resolution: 0.2
Init scale 0 at 0.2 ...
Init scale 1 at 0.2 ...
Init scale 3 at 0.8 ...
Init scale 5 at 3.2 ...
Init scale 4 at 1.6 ...
Init scale 2 at 0.4 ...
Building scale 1 (160105 points) ...
Building scale 2 (37145 points) ...
Building scale 3 (10084 points) ...
Building scale 4 (2694 points) ...
Building scale 5 (731 points) ...
Features: 105
Classifying...
Local smoothing...
Segmentation fault

With valgrind:

Classifying...
Local smoothing...
==656== Thread 3:
==656== Invalid write of size 1
==656==    at 0x17F432: rf::classify(PointSet&, liblearning::RandomForest::RandomForest<liblearning::RandomForest::NodeGini<liblearning::RandomForest::AxisAlignedSplitter> >*, std::vector<Feature*, std::allocator<Feature*> > const&, std::vector<Label, std::allocator<Label> > const&, rf::Regularization, bool, bool) [clone ._omp_fn.3] (in /build/pcclassify)
==656==    by 0x487CB9D: ??? (in /usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0)
==656==    by 0x5396B42: start_thread (pthread_create.c:442)
==656==    by 0x5427BB3: clone (clone.S:100)
==656==  Address 0x15912 is not stack'd, malloc'd or (recently) free'd
==656== 
==656== 
==656== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==656==  Access not within mapped region at address 0x15912
==656==    at 0x17F432: rf::classify(PointSet&, liblearning::RandomForest::RandomForest<liblearning::RandomForest::NodeGini<liblearning::RandomForest::AxisAlignedSplitter> >*, std::vector<Feature*, std::allocator<Feature*> > const&, std::vector<Label, std::allocator<Label> > const&, rf::Regularization, bool, bool) [clone ._omp_fn.3] (in /build/pcclassify)
==656==    by 0x487CB9D: ??? (in /usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0)
==656==    by 0x5396B42: start_thread (pthread_create.c:442)
==656==    by 0x5427BB3: clone (clone.S:100)
==656==  If you believe this happened as a result of a stack
==656==  overflow in your program's main thread (unlikely but
==656==  possible), you can try to increase the size of the
==656==  main thread stack using the --main-stacksize= flag.
==656==  The main thread stack size used in this run was 8388608.
==656== 
==656== HEAP SUMMARY:
==656==     in use at exit: 205,653,345 bytes in 1,223,477 blocks
==656==   total heap usage: 16,633,178 allocs, 15,409,701 frees, 1,342,197,731 bytes allocated
==656== 
==656== Searching for pointers to 1,223,477 not-freed blocks
==656== Checked 399,635,328 bytes
==656== 
==656== LEAK SUMMARY:
==656==    definitely lost: 415 bytes in 3 blocks
==656==    indirectly lost: 0 bytes in 0 blocks
==656==      possibly lost: 11,408 bytes in 23 blocks
==656==    still reachable: 205,641,522 bytes in 1,223,451 blocks
==656==         suppressed: 0 bytes in 0 blocks
==656== Rerun with --leak-check=full to see details of leaked memory
==656== 
==656== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
==656== 
==656== 1 errors in context 1 of 1:
==656== Invalid write of size 1
==656==    at 0x17F432: rf::classify(PointSet&, liblearning::RandomForest::RandomForest<liblearning::RandomForest::NodeGini<liblearning::RandomForest::AxisAlignedSplitter> >*, std::vector<Feature*, std::allocator<Feature*> > const&, std::vector<Label, std::allocator<Label> > const&, rf::Regularization, bool, bool) [clone ._omp_fn.3] (in /build/pcclassify)
==656==    by 0x487CB9D: ??? (in /usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0)
==656==    by 0x5396B42: start_thread (pthread_create.c:442)
==656==    by 0x5427BB3: clone (clone.S:100)
==656==  Address 0x15912 is not stack'd, malloc'd or (recently) free'd
==656== 
==656== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
Segmentation fault

It crashes both on my Linux system (KDE neon 5.27) and in a Linux Docker image (Ubuntu 22.04).

pierotofy commented 1 year ago

Maybe a race condition?

(odmdev)[piero:/datasets … ssify/build] main* 9s ± ./pcclassify hedotest/point_cloud.ply hedotest/output.ply hedotest/model.bin --color
Loading hedotest/model.bin
Reading 1060056 points
Starting resolution: 0.2
Init scale 0 at 0.2 ...
Init scale 1 at 0.2 ...
Init scale 3 at 0.8 ...
Init scale 5 at 3.2 ...
Init scale 4 at 1.6 ...
Init scale 2 at 0.4 ...
Building scale 1 (160105 points) ...
Building scale 2 (37145 points) ...
Building scale 3 (10084 points) ...
Building scale 4 (2694 points) ...
Building scale 5 (731 points) ...
Features: 105
Classifying...
Local smoothing...
Wrote hedotest/output.ply

https://hub.dronedb.app/r/hedo88/err-opc/view/b3V0cHV0LnBseQ==/pointcloud

Does it work with the following?

OMP_NUM_THREADS=1 ./pcclassify ../point_cloud.ply out.ply ../model.bin
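
For reference, the valgrind signature above ("Invalid write of size 1" at an address that is not stack'd, malloc'd or free'd) matches a single-byte write through an unallocated buffer. A minimal standalone reproducer (hypothetical code, not from OpenPointClass) produces the same signature, and it crashes regardless of the thread count, so OMP_NUM_THREADS=1 mainly serves to rule out a race between threads:

```cpp
// Hypothetical standalone reproducer (not OpenPointClass code): writing a
// single byte through an empty std::vector yields the same valgrind
// signature, i.e. "Invalid write of size 1" at an unmapped address.
#include <cstdint>
#include <vector>

int main() {
    std::vector<std::uint8_t> labels;   // never resized
    labels[100000] = 2;                 // 1-byte out-of-bounds write -> SIGSEGV
    return 0;
}
```

Compiling this with g++ and running it under valgrind reports an invalid write of size 1 followed by SIGSEGV, much like the trace above.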

pierotofy commented 1 year ago

OK, this should be fixed by https://github.com/uav4geo/OpenPointClass/commit/2ccbc764e02cbb79da5908b9311a93f0a34f6dac

A vector was not being allocated unless the --color flag was passed, which is why the run above (with --color) succeeded.
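
For illustration, a minimal sketch of that bug class (the names and structure are invented, not the actual OpenPointClass code): an output vector that is only resized when --color is requested, while the classification loop writes into it unconditionally.

```cpp
// Minimal sketch of the bug class described above (hypothetical names, not the
// actual OpenPointClass code): a label buffer resized only when --color is set,
// while the OpenMP classification loop writes into it unconditionally.
#include <cstddef>
#include <cstdint>
#include <vector>

static void classifyAll(std::vector<std::uint8_t>& labels, std::size_t numPoints) {
    #pragma omp parallel for
    for (long long i = 0; i < static_cast<long long>(numPoints); i++) {
        // Placeholder "prediction"; out of bounds if labels was never resized.
        labels[i] = static_cast<std::uint8_t>(i % 5);
    }
}

int main(int argc, char** argv) {
    const bool useColor = (argc > 1);       // stand-in for parsing --color
    const std::size_t numPoints = 1060056;  // point count from the report

    std::vector<std::uint8_t> labels;
    if (useColor) labels.resize(numPoints); // BUG: allocation gated on the flag
    // Fix: resize unconditionally before classifying:
    // labels.resize(numPoints);

    classifyAll(labels, numPoints);
    return 0;
}
```

Running this without an argument crashes; passing any argument (standing in for --color) makes it succeed, mirroring the behavior reported in this issue.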