connormanning / entwine

Entwine - point cloud organization for massive datasets
https://entwine.io

Segmentation fault during scanning #220

Closed spolloni closed 1 year ago

spolloni commented 4 years ago

While attempting to build a large number (~40K) of LAZ files, I am encountering segmentation faults at varying points during the scanning step. The exact command I'm using is:

```
entwine build -i $(cat tiles_3443.txt) -o s3://bucket-name/build_name --srs EPSG:3443 --threads 16
```

where tiles_3443.txt is a list of S3 file keys. The build is run from an Ubuntu 18.04.3 LTS (Bionic Beaver) machine on AWS EC2. After looking at #218, I split tiles_3443.txt into successively smaller subsets in the hope of finding an offending corrupt file.
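The subset-splitting approach described above can be sketched as a recursive halving search. This is only an illustration: `find_failing_inputs` and the `fails` predicate are hypothetical names, and `fails` is assumed to wrap whatever command is being tested (for example an `entwine build -i ...` call on that subset) and return `True` when it crashes. The sketch also assumes a given file fails consistently, which may not hold if the crash is intermittent.

```python
from typing import Callable, List, Sequence


def find_failing_inputs(
    files: Sequence[str],
    fails: Callable[[Sequence[str]], bool],
) -> List[str]:
    """Return the inputs that still fail when run on their own.

    `fails` is a hypothetical caller-supplied predicate: it runs the
    build or scan on the given subset and reports whether it crashed.
    """
    # A subset that passes as a whole is assumed to hold no bad file.
    if not files or not fails(files):
        return []
    # A failing subset of size one is an offending input.
    if len(files) == 1:
        return [files[0]]
    # Otherwise split in half and search each half independently.
    mid = len(files) // 2
    return (
        find_failing_inputs(files[:mid], fails)
        + find_failing_inputs(files[mid:], fails)
    )
```

With roughly 40K inputs and a handful of bad files, this needs far fewer runs than testing every file individually, since every passing subset is discarded in one step.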

Strangely, the segfaults do not seem to be deterministic (I wonder if this has to do with reading from S3?). Out of the ~40K files I am trying to build, I identified a handful that will occasionally trigger the segfault even when built individually. For example:

```
ubuntu@hostname:~$ entwine build -i s3://source-bucket/file.laz   -o s3://destination-bucket/build_name   --srs EPSG:3343 --threads 16
Scanning input
1/1: s3://source-bucket/file.laz
Segmentation fault (core dumped)
```
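Because the crash only happens on occasion, one way to quantify it is to rerun the same single-file command several times and count the runs that were killed by SIGSEGV. The following is a minimal sketch, not part of Entwine: `count_segfaults` and its `runner` parameter are hypothetical names, and it relies on Python's `subprocess` reporting a child killed by a signal as a negative return code on POSIX systems.

```python
import signal
import subprocess
from typing import Callable, Sequence


def count_segfaults(
    cmd: Sequence[str],
    runs: int = 10,
    runner: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> int:
    """Run `cmd` `runs` times and count how many runs died from SIGSEGV."""
    crashes = 0
    for _ in range(runs):
        result = runner(
            list(cmd),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        # On POSIX, a child terminated by signal N has returncode == -N.
        if result.returncode == -signal.SIGSEGV:
            crashes += 1
    return crashes


# Example call (assumes `entwine` is on PATH; bucket names are the
# placeholders used in this thread):
# count_segfaults(
#     ["entwine", "build", "-i", "s3://source-bucket/file.laz",
#      "-o", "s3://destination-bucket/build_name",
#      "--srs", "EPSG:3343", "--threads", "16"],
#     runs=20,
# )
```

A crash rate that is neither zero nor the full run count for the same input would support the observation that the failure is not tied to the file contents alone.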

Here is a backtrace obtained with gdb against the core dump:

```
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `entwine build -i s3://source-bucket/file.laz'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  std::char_traits::copy (__s1=0x7f56a002cc78 "\340\006", __s2=0x4 , __n=2064) at /home/conda/feedstock_root/build_artifacts/ctng-compilers_1578638331887/work/.build/x86_64-conda_cos6-linux-gnu/build/build-cc-gcc-final/x86_64-conda_cos6-linux-gnu/libstdc++-v3/include/bits/char_traits.h:365
365 /home/conda/feedstock_root/build_artifacts/ctng-compilers_1578638331887/work/.build/x86_64-conda_cos6-linux-gnu/build/build-cc-gcc-final/x86_64-conda_cos6-linux-gnu/libstdc++-v3/include/bits/char_traits.h: No such file or directory.
[Current thread is 1 (Thread 0x7f56b7533700 (LWP 30639))]
(gdb)
(gdb)
(gdb) trace
Tracepoint 1 at 0x7f56bed65440: file /home/conda/feedstock_root/build_artifacts/ctng-compilers_1578638331887/work/.build/x86_64-conda_cos6-linux-gnu/build/build-cc-gcc-final/x86_64-conda_cos6-linux-gnu/libstdc++-v3/include/bits/char_traits.h, line 365.
(gdb)
(gdb)
(gdb) bt
#0  std::char_traits::copy (__s1=0x7f56a002cc78 "\340\006", __s2=0x4 , __n=2064) at /home/conda/feedstock_root/build_artifacts/ctng-compilers_1578638331887/work/.build/x86_64-conda_cos6-linux-gnu/build/build-cc-gcc-final/x86_64-conda_cos6-linux-gnu/libstdc++-v3/include/bits/char_traits.h:365
#1  0x00007f56bed75638 in std::__cxx11::basic_string, std::allocator >::_S_copy (__d=__d@entry=0x7f56a002cc78 "\340\006", __s=__s@entry=0x4 , __n=__n@entry=2064) at /home/conda/feedstock_root/build_artifacts/ctng-compilers_1578638331887/work/.build/x86_64-conda_cos6-linux-gnu/build/build-cc-gcc-final/x86_64-conda_cos6-linux-gnu/libstdc++-v3/include/bits/char_traits.h:300
#2  0x00007f56bed75813 in std::__cxx11::basic_string, std::allocator >::_M_mutate (this=this@entry=0x7f56b7531410, __pos=__pos@entry=8, __len1=__len1@entry=1, __s=__s@entry=0x4 , __len2=__len2@entry=2064) at /home/conda/feedstock_root/build_artifacts/ctng-compilers_1578638331887/work/.build/x86_64-conda_cos6-linux-gnu/build/build-cc-gcc-final/x86_64-conda_cos6-linux-gnu/libstdc++-v3/include/bits/basic_string.h:186
#3  0x00007f56bed7610b in std::__cxx11::basic_string, std::allocator >::_M_replace (this=0x7f56b7531410, __pos=8, __len1=1, __s=0x4 , __len2=2064) at /home/conda/feedstock_root/build_artifacts/ctng-compilers_1578638331887/work/.build/x86_64-conda_cos6-linux-gnu/build/build-cc-gcc-final/x86_64-conda_cos6-linux-gnu/libstdc++-v3/include/bits/basic_string.h:993
#4  0x00007f56bf93c777 in pdal::Utils::escapeJSON(std::__cxx11::basic_string, std::allocator > const&) () from /home/ubuntu/miniconda3/envs/entwine/bin/../lib/./libpdal_util.so.9
#5  0x00007f56bf6c56b1 in pdal::MetadataNode::jsonValue[abi:cxx11]() const () from /home/ubuntu/miniconda3/envs/entwine/bin/../lib/libpdal_base.so.9
#6  0x00007f56bf6c7cf9 in pdal::(anonymous namespace)::toJSON(pdal::MetadataNode const&, std::ostream&, int) () from /home/ubuntu/miniconda3/envs/entwine/bin/../lib/libpdal_base.so.9
#7  0x00007f56bf6c63eb in pdal::(anonymous namespace)::subnodesToJSON(pdal::MetadataNode const&, std::ostream&, int) () from /home/ubuntu/miniconda3/envs/entwine/bin/../lib/libpdal_base.so.9
#8  0x00007f56bf6c7dbf in pdal::(anonymous namespace)::toJSON(pdal::MetadataNode const&, std::ostream&, int) () from /home/ubuntu/miniconda3/envs/entwine/bin/../lib/libpdal_base.so.9
#9  0x00007f56bf6c63eb in pdal::(anonymous namespace)::subnodesToJSON(pdal::MetadataNode const&, std::ostream&, int) () from /home/ubuntu/miniconda3/envs/entwine/bin/../lib/libpdal_base.so.9
#10 0x00007f56bf6c834d in pdal::Utils::toJSON(pdal::MetadataNode const&, std::ostream&) () from /home/ubuntu/miniconda3/envs/entwine/bin/../lib/libpdal_base.so.9
#11 0x00007f56bf6c88f5 in pdal::Utils::toJSON[abi:cxx11](pdal::MetadataNode const&) () from /home/ubuntu/miniconda3/envs/entwine/bin/../lib/libpdal_base.so.9
#12 0x00007f56bfa82a31 in entwine::ScanInfo::ScanInfo(pdal::Stage&, pdal::QuickInfo const&) () from /home/ubuntu/miniconda3/envs/entwine/bin/../lib/libentwine.so.2
#13 0x00007f56bfa8304d in entwine::ScanInfo::create(pdal::Stage&) () from /home/ubuntu/miniconda3/envs/entwine/bin/../lib/libentwine.so.2
#14 0x00007f56bfa877db in entwine::Executor::preview(nlohmann::basic_json, std::allocator >, bool, long, unsigned long, double, std::allocator, nlohmann::adl_serializer>, bool) const () from /home/ubuntu/miniconda3/envs/entwine/bin/../lib/libentwine.so.2
#15 0x00007f56bfa5a230 in entwine::Scan::add(entwine::FileInfo&, std::__cxx11::basic_string, std::allocator >) () from /home/ubuntu/miniconda3/envs/entwine/bin/../lib/libentwine.so.2
#16 0x00007f56bfa5be0b in entwine::Scan::addRanged(entwine::FileInfo&) () from /home/ubuntu/miniconda3/envs/entwine/bin/../lib/libentwine.so.2
#17 0x00007f56bfa5c24c in entwine::Scan::add(entwine::FileInfo&)::{lambda()#1}::operator()() const () from /home/ubuntu/miniconda3/envs/entwine/bin/../lib/libentwine.so.2
#18 0x00007f56bf9a7e4c in std::thread::_State_impl > >::_M_run() () from /home/ubuntu/miniconda3/envs/entwine/bin/../lib/libentwine.so.2
#19 0x00007f56bed42163 in std::execute_native_thread_routine (__p=0x5629c4e97410) at /home/conda/feedstock_root/build_artifacts/ctng-compilers_1578638331887/work/.build/x86_64-conda_cos6-linux-gnu/src/gcc/libstdc++-v3/src/c++11/thread.cc:80
#20 0x00007f56bedf56db in start_thread (arg=0x7f56b7533700) at pthread_create.c:463
#21 0x00007f56be9aa88f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
```

Any clue what might be causing this?

connormanning commented 4 years ago

I'm not sure. I haven't seen this behavior, and I do use S3 heavily with Entwine. Could you send me a key pair with read access to this data via email? Otherwise I'm not sure how to reproduce this.

spolloni commented 4 years ago

Just sent those over. By the way, I managed to avoid this problem and successfully scan all the files by using the Entwine Docker container.

hobu commented 4 years ago

Was your failing build using Conda, or was it self-built?

spolloni commented 4 years ago

Using Conda. I probably should have mentioned this earlier.

connormanning commented 1 year ago

I don't believe this is an issue anymore. If it is, a new ticket can be opened with a reproduction case.