Open themightyoarfish opened 1 year ago
Hm, no idea. Do you do anything with PointXYZL
that could lead to the mixup? Could this be a problem with clang's ASAN or with clang itself? Or the Apple versions, respectively? I checked the relevant places in the PCL code, and I can't find anything suspicious.
Not once in my life have i used PointXYZL
… hence my confusion. Maybe it's asan-related (seems most plausible explanation to me), but the crash exists also without it, it's just unclear then what exactly fails.
However is it not strange then that defining PCL_NO_PRECOMPILE makes it go away?
Maybe you can try using valgrind instead of compiling with ASAN, to see if you still see PointXYZL
in the report.
Regarding the crash itself: Which flags do you use when building PCL and which flags when building your own project? (SSE and AVX flags, and everything else that could be alignment-related). Usually these types of errors happen if these flags don't match
good idea, although so far I think I've only seen it on macos where valgrind is unavailable.
PCL is compiled from source (partly for this reason) and uses -march=native
(or -mtune=native
on arm) just like my client code.
It's also weird that from what I can see, the PCL_INSTANTIATE_PointCloud()
macro is never used in the codebase, so it's not like there's precompiled instantiations with different point types are lying around and somehow can get mixed up.
I wonder if it is coincidental that PointXYZL
and PointXYZRGB
are neighbours in this list
#define PCL_XYZ_POINT_TYPES \
(pcl::PointXYZ) \
(pcl::PointXYZI) \
(pcl::PointXYZL) \
(pcl::PointXYZRGBA) \
…
Once I have this reappear, I will try again on Linux, but I don't think I've ever seen this issue outside of macOS ARM. I suppose a compiler / linker bug is not completely out of the questions, as it is a new-ish platform.
good idea, although so far I think I've only seen it on macos where valgrind is unavailable.
Hm, that's annoying
PCL is compiled from source (partly for this reason) and uses
-march=native
(or-mtune=native
on arm) just like my client code.
Just to be sure: everything happens on the same computer? No cross-compilation?
It's also weird that from what I can see, the
PCL_INSTANTIATE_PointCloud()
macro is never used in the codebase, so it's not like there's precompiled instantiations with different point types are lying around and somehow can get mixed up.
True, although if e.g. VoxelGrid
gets instantiated for some type, it will likely also instantiate PointCloud
for the same type.
I wonder if it is coincidental that
PointXYZL
andPointXYZRGB
are neighbours in this list#define PCL_XYZ_POINT_TYPES \ (pcl::PointXYZ) \ (pcl::PointXYZI) \ (pcl::PointXYZL) \ (pcl::PointXYZRGBA) \ …
That's PointXYZRGBA
, while PointXYZRGB
is one below in the list. In your logs I only see PointXYZRGB
. I guess you can try to shuffle this list and see whether ASAN still reports PointXYZL
or another type
Oops yeah I misread. Can confirm, everything is compiled locally on the same machine.
This same problem (at least I assume its the same, as the crash is at the exact same location) produces an lldb backtrace like this
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGABRT
* frame #0: 0x00000001a400ad98 libsystem_kernel.dylib`__pthread_kill + 8
frame #1: 0x00000001a403fee0 libsystem_pthread.dylib`pthread_kill + 288
frame #2: 0x00000001a3f7a340 libsystem_c.dylib`abort + 168
frame #3: 0x00000001a3e5c8c0 libsystem_malloc.dylib`malloc_vreport + 552
frame #4: 0x00000001a3e5ff34 libsystem_malloc.dylib`malloc_report + 64
frame #5: 0x00000001a3e4ecf4 libsystem_malloc.dylib`free + 300
frame #6: 0x00000001000dbab0 autocrane-core_main`Eigen::internal::handmade_aligned_free(ptr=0x0000600000087850) at Memory.h:118:5
frame #7: 0x00000001000dba7c autocrane-core_main`Eigen::internal::aligned_free(ptr=0x0000600000087850) at Memory.h:206:5
frame #8: 0x00000001009be030 autocrane-core_main`Eigen::aligned_allocator<pcl::PointXYZ>::deallocate(this=0x000060000369c7f8, p=0x0000600000087850, (null)=1) at Memory.h:897:5
frame #9: 0x00000001009bdee0 autocrane-core_main`std::__1::allocator_traits<Eigen::aligned_allocator<pcl::PointXYZ> >::deallocate(__a=0x000060000369c7f8, __p=0x0000600000087850, __n=1) at allocator_traits.h:282:13
frame #10: 0x00000001009bdd28 autocrane-core_main`std::__1::__vector_base<pcl::PointXYZ, Eigen::aligned_allocator<pcl::PointXYZ> >::~__vector_base(this=0x000060000369c7e8) at vector:488:9
frame #11: 0x00000001009bdbfc autocrane-core_main`std::__1::vector<pcl::PointXYZ, Eigen::aligned_allocator<pcl::PointXYZ> >::~vector(this=0x000060000369c7e8 size=0) at vector:579:5
frame #12: 0x00000001009bd9f0 autocrane-core_main`std::__1::vector<pcl::PointXYZ, Eigen::aligned_allocator<pcl::PointXYZ> >::~vector(this=0x000060000369c7e8 size=0) at vector:574:5
frame #13: 0x00000001009beaa8 autocrane-core_main`pcl::PointCloud<pcl::PointXYZ>::~PointCloud(this=0x000060000369c7c0) at point_cloud.h:172:21
frame #14: 0x00000001009bea78 autocrane-core_main`pcl::PointCloud<pcl::PointXYZ>::~PointCloud(this=0x000060000369c7c0) at point_cloud.h:172:21
frame #15: 0x00000001013593f8 autocrane-core_main`std::__1::__shared_ptr_emplace<pcl::PointCloud<pcl::PointXYZ>, Eigen::aligned_allocator<pcl::PointCloud<pcl::PointXYZ> > >::__on_zero_shared(this=0x000060000369c7a0) at shared_ptr.h:315:24
…
<mycode>
…
frame #24: 0x000000010aee908c dyld`start + 520
But it does not seem to be alignment-related, as the pointer to deallocate is divisible by 16, which is the default alignment (and the one used here) on macos ARM.
With a debug build of pcl, the code fails at a different point, which is after the one reported above:
* thread #1, queue = 'com.apple.main-thread', stop reason = hit program assert
frame #0: 0x00000001a400ad98 libsystem_kernel.dylib`__pthread_kill + 8
frame #1: 0x00000001a403fee0 libsystem_pthread.dylib`pthread_kill + 288
frame #2: 0x00000001a3f7a340 libsystem_c.dylib`abort + 168
frame #3: 0x00000001a3f79754 libsystem_c.dylib`__assert_rtn + 272
* frame #4: 0x00000001013ea3e8 autocrane-core_main`void Eigen::MapBase<Eigen::Ref<Eigen::Matrix<float, -1, 1, 0, 4096, 1> const, 32, Eigen::InnerStride<1> >, 0>::checkSanity<Eigen::Ref<Eigen::Matrix<float, -1, 1, 0, 4096, 1> const, 32, Eigen::InnerStride<1> > >(this=0x000000016fdf1110, (null)=0x0000000000000000) > (0)), void*>::type) const at MapBase.h:196:7
frame #5: 0x00000001013ea2f4 autocrane-core_main`Eigen::MapBase<Eigen::Ref<Eigen::Matrix<float, -1, 1, 0, 4096, 1> const, 32, Eigen::InnerStride<1> >, 0>::MapBase(this=0x000000016fdf1110, dataPtr=0x000000016fdf1130, rows=42, cols=1) at MapBase.h:177:7
frame #6: 0x00000001013eb074 autocrane-core_main`Eigen::MapBase<Eigen::Ref<Eigen::Matrix<float, -1, 1, 0, 4096, 1> const, 32, Eigen::InnerStride<1> >, 0>::MapBase(this=0x000000016fdf1110, dataPtr=0x000000016fdf1130, rows=42, cols=1) at MapBase.h:173:5
frame #7: 0x00000001013ea6a0 autocrane-core_main`void Eigen::RefBase<Eigen::Ref<Eigen::Matrix<float, -1, 1, 0, 4096, 1> const, 32, Eigen::InnerStride<1> > >::construct<Eigen::Matrix<float, -1, 1, 0, 4096, 1> >(this=0x000000016fdf1110, expr=0x000000016fdf1130) at Ref.h:109:40
frame #8: 0x00000001013ea1f4 autocrane-core_main`void Eigen::Ref<Eigen::Matrix<float, -1, 1, 0, 4096, 1> const, 32, Eigen::InnerStride<1> >::construct<Eigen::Block<Eigen::Matrix<float, -1, 1, 0, -1, 1> const, -1, 1, false> >(this=0x000000016fdf1110, expr=0x000000016fdf1050, (null)=false_type @ 0x000000016fdf0f9f) at Ref.h:277:13
frame #9: 0x00000001013ea110 autocrane-core_main`Eigen::Ref<Eigen::Matrix<float, -1, 1, 0, 4096, 1> const, 32, Eigen::InnerStride<1> >::Ref<Eigen::Block<Eigen::Matrix<float, -1, 1, 0, -1, 1> const, -1, 1, false> >(this=0x000000016fdf1110, expr=0x000000016fdf1050, (null)=0x0000000000000000)(Traits::match<Eigen::Block<Eigen::Matrix<float, -1, 1, 0, -1, 1> const, -1, 1, false> >::ScalarTypeMatch), Eigen::Block<Eigen::Matrix<float, -1, 1, 0, -1, 1> const, -1, 1, false> >::type*) at Ref.h:253:7
frame #10: 0x00000001013e4b64 autocrane-core_main`Eigen::Ref<Eigen::Matrix<float, -1, 1, 0, 4096, 1> const, 32, Eigen::InnerStride<1> >::Ref<Eigen::Block<Eigen::Matrix<float, -1, 1, 0, -1, 1> const, -1, 1, false> >(this=0x000000016fdf1110, expr=0x000000016fdf1050, (null)=0x0000000000000000)(Traits::match<Eigen::Block<Eigen::Matrix<float, -1, 1, 0, -1, 1> const, -1, 1, false> >::ScalarTypeMatch), Eigen::Block<Eigen::Matrix<float, -1, 1, 0, -1, 1> const, -1, 1, false> >::type*) at Ref.h:249:5
frame #11: 0x00000001013e4674 autocrane-core_main`void Eigen::internal::stable_norm_impl_inner_step<Eigen::Matrix<float, -1, 1, 0, -1, 1>, float>(vec=0x000000016fdf52d8, ssq=0x000000016fdf5174, scale=0x000000016fdf517c, invScale=0x000000016fdf5178) at StableNorm.h:81:34
frame #12: 0x00000001013e44e4 autocrane-core_main`Eigen::Matrix<float, -1, 1, 0, -1, 1>::RealScalar Eigen::internal::stable_norm_impl<Eigen::Matrix<float, -1, 1, 0, -1, 1> >(vec=0x000000016fdf52d8, (null)=0x0000000000000000) at StableNorm.h:101:3
frame #13: 0x00000001013e41d8 autocrane-core_main`Eigen::MatrixBase<Eigen::Matrix<float, -1, 1, 0, -1, 1> >::stableNorm(this=0x000000016fdf52d8) const at StableNorm.h:219:10
frame #14: 0x000000010145c6b0 autocrane-core_main`Eigen::LevenbergMarquardt<Eigen::NumericalDiff<pcl::registration::TransformationEstimationLM<pcl::PointXYZ, pcl::PointXYZ, float>::OptimizationFunctorWithIndices, (Eigen::NumericalDiffMode)0>, float>::minimizeInit(this=0x000000016fdf52b8, x=0x000000016fdf5438) at LevenbergMarquardt.h:203:18
frame #15: 0x000000010145bebc autocrane-core_main`Eigen::LevenbergMarquardt<Eigen::NumericalDiff<pcl::registration::TransformationEstimationLM<pcl::PointXYZ, pcl::PointXYZ, float>::OptimizationFunctorWithIndices, (Eigen::NumericalDiffMode)0>, float>::minimize(this=0x000000016fdf52b8, x=0x000000016fdf5438) at LevenbergMarquardt.h:160:46
frame #16: 0x00000001013e0570 autocrane-core_main`pcl::registration::TransformationEstimationLM<pcl::PointXYZ, pcl::PointXYZ, float>::estimateRigidTransformation(this=0x0000600002924118, cloud_src=0x000060000302cc80, indices_src=size=42, cloud_tgt=0x0000600003017980, indices_tgt=size=42, transformation_matrix=0x000000011d204cf0) const at transformation_estimation_lm.hpp:196:17
frame #17: 0x00000001013e088c autocrane-core_main`pcl::registration::TransformationEstimationLM<pcl::PointXYZ, pcl::PointXYZ, float>::estimateRigidTransformation(this=0x0000600002924118, cloud_src=0x000060000302cc80, cloud_tgt=0x0000600003017980, correspondences=size=42, transformation_matrix=0x000000011d204cf0) const at transformation_estimation_lm.hpp:235:3
frame #18: 0x000000010136b40c autocrane-core_main`pcl::IterativeClosestPoint<pcl::PointXYZ, pcl::PointXYZ, float>::computeTransformation(this=0x000000011d204c20, output=0x000060000302cda0, guess=0x000000016fdf5e70) at icp.hpp:218:33
frame #19: 0x0000000101466110 autocrane-core_main`pcl::Registration<pcl::PointXYZ, pcl::PointXYZ, float>::align(this=0x000000011d204c20, output=0x000060000302cda0, guess=0x000000016fdf5e70) at registration.hpp:212:3
Here it complains about actual alignment issues.
However this may be unrelated. The code is executed inside an OpenMP parallel for
, and removing that makes this error go away.
I often have this issue – not reproducible minimally – that my PCL client code crashes with some sort of
message, that usually goes away when I
#define PCL_NO_PRECOMPILE
at the top of my offending source file. With clang's address sanitizer, I get interesting reports that seem to indicate that a point cloud created as one point type gets deallocated as a different one. For instance check out this backtrace:So why is the deallocator called on the correct
PointXYZRGB
points, but the filter class somehow creates data of typePointXYZL
? The presence or absence of-DPCL_ONLY_CORE_POINT_TYPES=ON
and-DNO_EXPLICIT_INSTANTIATIONS
makes no difference by the way.What can I do to debug this?