MegviiRobot / MegBA

MegBA: A GPU-Based Distributed Library for Large-Scale Bundle Adjustment
Apache License 2.0

Misaligned address error when testing a very small SfM problem #40

Closed ChenMa2017 closed 1 year ago

ChenMa2017 commented 1 year ago

I tested MegBA on a very small SfM problem with two cameras, 600 3D points, and 1200 observations. The problem file is in the Bundle Adjustment in the Large (BAL) format. BAL_Double aborts with the error "cudaErrorMisalignedAddress: misaligned address".

The command is:

    ./BAL_Double --path ~/Data/problem-2-599-pre.txt --world_size 1 --max_iter 10 --solver_tol 1e-1 --solver_refuse_ratio 1 --solver_max_iter 10 --tau 1e4 --epsilon1 1 --epsilon2 1e-10

The full output on the screen is:

    solving /Data/problem-2-599-pre.txt, world_size: 1, max iter: 10, solver_tol: 0.1, solver_refuse_ratio: 1, solver_max_iter: 10, tau: 10000, epsilon1: 1, epsilon2: 1e-10
    Start with error: 45932, log error: 4.66212, elapsed 8 ms
    terminate called after throwing an instance of 'thrust::system::system_error'
    what(): after reduction step 1: cudaErrorMisalignedAddress: misaligned address
    Aborted

However, after I remove one 3D point and its two observations, MegBA runs to completion. The edited problem has two cameras, 599 3D points, and 1198 observations.

The output on the screen is:

    solving /home/machen/Documents/Data/ColMap_MegBA/problem-2-599-pre.txt, world_size: 1, max iter: 10, solver_tol: 0.1, solver_refuse_ratio: 1, solver_max_iter: 10, tau: 10000, epsilon1: 1, epsilon2: 1e-10
    Start with error: 45867, log error: 4.6615, elapsed 8 ms
    Iter 1 error: 150.088, log error: 2.17635, elapsed 14 ms
    Iter 2 error: 146.737, log error: 2.16654, elapsed 19 ms
    Iter 3 error: 145.811, log error: 2.16379, elapsed 24 ms
    Iter 4 error: 144.645, log error: 2.1603, elapsed 28 ms
    Iter 5 error: 143.325, log error: 2.15632, elapsed 33 ms
    Iter 6 error: 142.183, log error: 2.15285, elapsed 38 ms
    Iter 7 error: 141.412, log error: 2.15049, elapsed 42 ms
    Iter 8 error: 140.979, log error: 2.14915, elapsed 47 ms
    Iter 9 error: 140.773, log error: 2.14852, elapsed 51 ms
    Iter 10 error: 140.686, log error: 2.14825, elapsed 56 ms
    Finished
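For anyone trying to reproduce the edit described above (dropping one 3D point together with its observations), here is a minimal sketch of how it could be done on a BAL text file. It is not the script used in this report. The function name `drop_last_point` is made up for illustration, and it assumes the standard BAL layout: a header with the camera, point, and observation counts, then one `camera_index point_index x y` record per observation, then 9 parameters per camera, then 3 coordinates per point. Removing the highest-indexed point avoids having to renumber the remaining ones.

```python
def drop_last_point(bal_text: str) -> str:
    """Return BAL text with the highest-indexed 3D point and its observations removed.

    Hypothetical helper for illustration; assumes the standard BAL layout.
    """
    tokens = bal_text.split()
    num_cams, num_pts, num_obs = (int(t) for t in tokens[:3])

    # Observation block: num_obs records of (camera_idx, point_idx, x, y).
    obs_end = 3 + 4 * num_obs
    observations = [tokens[i:i + 4] for i in range(3, obs_end, 4)]

    # Camera block: 9 parameters per camera.
    cam_end = obs_end + 9 * num_cams
    cameras = tokens[obs_end:cam_end]

    # Point block: 3 coordinates per point.
    points = tokens[cam_end:cam_end + 3 * num_pts]
    if len(points) != 3 * num_pts:
        raise ValueError("file is shorter than its header claims")

    last = num_pts - 1
    # Keep every observation that does not refer to the removed point.
    kept = [o for o in observations if int(o[1]) != last]

    lines = [f"{num_cams} {last} {len(kept)}"]
    lines += [" ".join(o) for o in kept]
    lines += cameras          # one parameter per line, unchanged
    lines += points[:-3]      # drop the coordinates of the last point
    return "\n".join(lines) + "\n"
```

The function works on strings, so it can be applied to a file by reading its contents, calling `drop_last_point`, and writing the result to a new file that is then passed to `--path`.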

Could you give any suggestions about the error?

ChenMa2017 commented 1 year ago

This happens on my laptop's RTX 2080. On the RTX 3090, everything goes well.

aggestsfw commented 6 months ago

I tested this. It seems the error occurs whenever the number of camera frames is odd; with an even number there is no problem.
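One way to probe how the crash depends on problem size, including the odd/even camera-count observation above, would be to generate small synthetic BAL files with chosen counts and run each through BAL_Double. The sketch below is only an illustration and makes several assumptions: `make_synthetic_bal` is a hypothetical helper, every camera observes every point, rotations are zero, distortion is zero, and the data is random rather than taken from any real scene. Whether such files trigger the same error has not been verified here.

```python
import random


def make_synthetic_bal(num_cameras: int, num_points: int, seed: int = 0) -> str:
    """Build a small synthetic BAL problem as text (hypothetical helper)."""
    rng = random.Random(seed)
    focal = 500.0

    # BAL camera: Rodrigues rotation (3), translation (3), focal, k1, k2.
    cameras = [
        [0.0, 0.0, 0.0, rng.uniform(-0.5, 0.5), rng.uniform(-0.5, 0.5), 0.0,
         focal, 0.0, 0.0]
        for _ in range(num_cameras)
    ]
    # BAL cameras look down the negative z axis, so points get negative z.
    points = [
        [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0), rng.uniform(-10.0, -5.0)]
        for _ in range(num_points)
    ]

    lines = [f"{num_cameras} {num_points} {num_cameras * num_points}"]
    for p_idx, (x, y, z) in enumerate(points):
        for c_idx, cam in enumerate(cameras):
            # Zero rotation: camera-frame point is just the translated point.
            px, py, pz = x + cam[3], y + cam[4], z + cam[5]
            # BAL projection with k1 = k2 = 0, plus about one pixel of noise.
            u = -focal * px / pz + rng.gauss(0.0, 1.0)
            v = -focal * py / pz + rng.gauss(0.0, 1.0)
            lines.append(f"{c_idx} {p_idx} {u:.6e} {v:.6e}")
    for cam in cameras:
        lines += [f"{value:.6e}" for value in cam]
    for pt in points:
        lines += [f"{value:.6e}" for value in pt]
    return "\n".join(lines) + "\n"
```

Writing the output for several camera and point counts (for example 2, 3, 4, and 5 cameras with 599 and 600 points) and passing each file to `--path` would show which sizes abort on a given GPU.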