HKUST-Aerial-Robotics / GVINS

Tightly coupled GNSS-Visual-Inertial system for locally smooth and globally consistent state estimation in complex environment.
GNU General Public License v3.0
893 stars 236 forks source link

Running GVINS on ARM device #24

Closed Mil1ium closed 2 years ago

Mil1ium commented 2 years ago

@shaozu Hi,thanks for your nice work.

I just started working on GVINS and I'm trying to use this project for real-time drone navigation. That means, I need to run the program on ARM embedded computer.

Firstly I run GVINS on my PC and laptop and get nice result(sports_field.bag). IMG_7646(20211215-171818)

Then I run GVINS on NVIDIA Jetson TX2 and NVIDIA Jetson Xavier NX, but I got a problem: 1) The _gnss_enupath seems to converge slowly and the results don't look as good as those on my PC/laptop; 2) The path seems to start at a wrong start point and yaw angle.

As these pictures show (Jetson TX2 with CUDA in feature_tracker): IMG_7635 IMG_7637 IMG_7638 IMG_7641

And then I use CUDA in _featuretracker to speed up calculations to reduce CPU load, but it didn't seem to help.

So is this result due to the lack of computing power on embedded computer?? But the CPU load is not too heavy...... Or is it processor architecture (ARM and x86)?

shaozu commented 2 years ago

Hi, @Mil1ium. Thanks for your feedbacks. Unfortunately, I haven't tested the GVINS on ARM-based platforms and currently I cannot not get one on hand to reproduce the issue. But here are my suggestions:

  1. Run the system in docker.
  2. Play the bag in a low rate, like -r 0.5.
  3. Disable the GNSS flag in config and run visual-inertial only. Check if the results still diverge.
Abner0907 commented 2 years ago

@shaozu Hello, can this issues open again? I have the similar issues. Run all the results ok in X86 platform. The same code and same ros bag file, get a bad result in NX platform.

@Mil1ium Hello, have you fix this issue?

Abner0907 commented 2 years ago

@shaozu I have disable the gnss flag. It can get a very nice result. But if enable gnss fusion. it will get a bad result as above.

Mil1ium commented 2 years ago

@Abner0907 Hi,maybe I have figured out this problem.

Try to change to a smaller sliding window size. I changed WINDOW_SIZE to 6 and "solved" this problem, but the cost is a loss of accuracy. This is the result I got(WINDOW_SIZE = 6 with CUDA in feature tracker in Jetson TX2): 1f86cf1d6e5cab39e7781017e5f3250 bc16d4243a237aa9b966dc73412fcc5

If you need better results in real time application, my suggestion is to buy a NUC 10/11, because the performance of Jetson ARM device is really insufficient for optimization.

In our past work, we applied VINS-Mono in TX2 with mapping and path planning and got nice performance in UAV platform. I think, compared to VINS-Mono, the GVINS has more states so the optimization problem requires better performance to ensure real-time performance of this algorithm.

Mil1ium commented 2 years ago

As for the CPU load(Jetson TX2, 4+2 cores), I played the sport_field.bag normally and tested it as follows: 1)disable GNSS and I got nice result, the CPU load is about 50%(4 cores) + 70%(2 cores); 2)enable GNSS and WINDOW_SIZE = 10(default), the CPU load is about 55%(4 cores) + 75%(2 cores); 3)enable GNSS and WINDOW_SIZE = 6, the CPU load is about 55%(4 cores) + 75%(2 cores). These are rough observations. The CPU always seems to be underutilized.

So...maybe it's due to the scheduling efficiency of ROS or OS?

Abner0907 commented 2 years ago

@Mil1ium Hello, Thank you for your reply. I have follow your suggestion, modify the WINDOW_SIZE = 6 and WINDOW_SIZE = 5. It will diverge when GNSS_CONFIG ENABLE.

I have also monitor the cpu usage and memory/swap consumption. it is similar as yours. I think it is maybe not the cpu cause the reason.

I have try to dig out the reason, but nothings found. I will continue to work on this project. I will let you know If anythings find. If you have any ideas, pls let me know, i will also try it.

Abner0907 commented 2 years ago

@Mil1ium Hello, I think I have get the difference for when diverge, if diverged, it will get an uncorrect
rough_xyzt in below line which cause the issues. But I need further debug for the reason. https://github.com/HKUST-Aerial-Robotics/GVINS/blob/d2cf40b49c6eb0e6ad3caa4f613713983be3fd74/estimator/src/estimator.cpp#L594-L600

Abner0907 commented 2 years ago

@Mil1ium Hello, I think I have found the cause reason. The weight is influenced by the obs_sys() function. The uncorrect weight will influenced the intialize state rough_xyzt show as above.

https://github.com/HKUST-Aerial-Robotics/gnss_comm/blob/a2035fec1cc427bba184d25ff1fde8dcf23c9ce1/src/gnss_spp.cpp#L214-L221 Now my result is correct.

Kranthi18reddy commented 2 years ago

@Mil1ium Hello, I think I have found the cause reason. The weight is influenced by the obs_sys() function. The uncorrect weight will influenced the intialize state rough_xyzt show as above.

https://github.com/HKUST-Aerial-Robotics/gnss_comm/blob/a2035fec1cc427bba184d25ff1fde8dcf23c9ce1/src/gnss_spp.cpp#L214-L221 Now my result is correct.

@Abner0907 Hi, please let us know how you solved and get correct result.

trayveet commented 2 years ago

@Mil1ium Hi, I want to get GVINS to work on a Jetson TX2. Can you share info on how yo got it to work. Thanks

Mil1ium commented 2 years ago

@trayveet Hi! I suggest you not to use Jetson TX2, because its CPU performance is too too too weak , and the optimization problem scale is larger compared to VINS-Mono due to the newly added states in state vector. I have tried GVINS on TX2/NX, when I change the window_size to 6, it works, but the results seem to be bad(it takes long to converge and got lower precision).

So, I suggest you to use Morefine M6S, Intel NUC, ASUS PN series or other mini computer. I tried Morefine M6S (Intel Celeron N5105 4-cores) and it works very well(window_size = 10), and the CPU load is about 60%. This device is really cheap so I highly recommend it.

As for CUDA, I tried it in feature tracker, the time consumption seems similar to NOT use CUDA. I think the most time-consuming part is the optimization part which is computed by CPU.

So, choose a device with better CPU.

trayveet commented 2 years ago

@Mil1ium Thanks for the response. I was looking at other platforms, so I will try your suggestion. I will report if I have updates. Very much appreciated.

Camilochiang commented 1 year ago

@Mil1ium Hello, I think I have found the cause reason. The weight is influenced by the obs_sys() function. The uncorrect weight will influenced the intialize state rough_xyzt show as above.

https://github.com/HKUST-Aerial-Robotics/gnss_comm/blob/a2035fec1cc427bba184d25ff1fde8dcf23c9ce1/src/gnss_spp.cpp#L214-L221 Now my result is correct.

@Abner0907 : How did you solve it? Did you force obs_sys to a specific value? in that case, to which? Thanks!

Abner0907 commented 1 year ago

@Camilochiang I have done some changes as below which suits my F9P. this is also refer the rtklib drivers. Screenshot from 2023-02-02 18-07-28