swri-robotics / novatel_gps_driver

ROS driver for NovAtel GPS / GNSS receivers
BSD 3-Clause "New" or "Revised" License
172 stars 132 forks

Queue overflow problem #98

Open MaxandreOgeret opened 4 years ago

MaxandreOgeret commented 4 years ago

Hello everyone, I have a problem with this NovAtel driver. It seems that it randomly fails to start properly and then doesn't publish IMU data. Here are the errors I get:

Node: /gps/novatel
Time: 14:19:50.038780807 (2020-10-02)
Severity: Warn
Published Topics: /gps/bestpos, /gps/bestvel, /gps/corrimudata, /gps/dual_antenna_heading, /gps/fix, /gps/gpgga, /gps/gprmc, /gps/gps, /gps/heading2, /gps/imu, /gps/inscov, /gps/inspva, /gps/inspvax, /gps/insstdev, /rosout, /unprocessed_diagnostics

INSPVA queue overflow.

Location:
/tmp/binarydeb/ros-melodic-novatel-gps-driver-3.9.0/src/novatel_gps.cpp:NovatelGps::ReadResult novatel_gps_driver::NovatelGps::ParseBinaryMessage:1122
Node: /gps/novatel
Time: 14:19:50.026237548 (2020-10-02)
Severity: Warn
Published Topics: /gps/bestpos, /gps/bestvel, /gps/corrimudata, /gps/dual_antenna_heading, /gps/fix, /gps/gpgga, /gps/gprmc, /gps/gps, /gps/heading2, /gps/imu, /gps/inscov, /gps/inspva, /gps/inspvax, /gps/insstdev, /rosout, /unprocessed_diagnostics

CORRIMUDATA queue overflow.

Location:
/tmp/binarydeb/ros-melodic-novatel-gps-driver-3.9.0/src/novatel_gps.cpp:NovatelGps::ReadResult novatel_gps_driver::NovatelGps::ParseBinaryMessage:1100

I don't understand what would cause this problem; when I just restart the driver, it works fine. Also, what is the purpose of MAX_BUFFER_SIZE, and why does reaching it prevent data from being sent to the IMU topic?

Thanks a lot for your help!

pjreed commented 4 years ago

In order to publish data to the IMU topic, the driver has to combine data from the INSPVA and CORRIMUDATA logs. It buffers up those logs in a queue as they arrive, and when it is capable of combining them (see the code starting at novatel_gps.cpp:934), it does so, removes them from their queues, and publishes the IMU topic. The purpose of MAX_BUFFER_SIZE is to prevent these queues from becoming too large; if the device is producing one of these messages but not the other, that could potentially cause the queue to keep growing until it ran out of memory and crashed.
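To illustrate the buffering described above, here is a minimal sketch of two bounded queues that are paired by timestamp. It is not the driver's actual code: the types Inspva and CorrImuData, the class ImuPairer, the drop-oldest overflow policy, and the timestamp tolerance are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <optional>
#include <utility>

// Hypothetical stand-ins for the two logs; only a timestamp is kept.
struct Inspva { double gps_seconds; };
struct CorrImuData { double gps_seconds; };

// Hypothetical helper that buffers both logs and pairs them by timestamp.
class ImuPairer {
public:
  explicit ImuPairer(std::size_t max_buffer_size) : max_size_(max_buffer_size) {
    assert(max_size_ > 0);
  }

  // Each push returns true if the oldest entry had to be dropped,
  // which corresponds to a "queue overflow" warning.
  bool push(const Inspva& msg) { return pushBounded(inspva_, msg); }
  bool push(const CorrImuData& msg) { return pushBounded(corrimu_, msg); }

  // Returns one matched pair if the fronts of both queues agree in time.
  // A front entry that is older than the other queue's front is discarded.
  std::optional<std::pair<Inspva, CorrImuData>> popPair(double tolerance = 1e-3) {
    while (!inspva_.empty() && !corrimu_.empty()) {
      const double dt = inspva_.front().gps_seconds - corrimu_.front().gps_seconds;
      if (dt > tolerance) {
        corrimu_.pop_front();  // CORRIMUDATA entry is stale
      } else if (dt < -tolerance) {
        inspva_.pop_front();   // INSPVA entry is stale
      } else {
        auto matched = std::make_pair(inspva_.front(), corrimu_.front());
        inspva_.pop_front();
        corrimu_.pop_front();
        return matched;
      }
    }
    return std::nullopt;  // one queue is empty, so nothing can be combined yet
  }

private:
  template <typename T>
  bool pushBounded(std::deque<T>& queue, const T& msg) {
    queue.push_back(msg);
    if (queue.size() > max_size_) {
      queue.pop_front();  // bound memory use by dropping the oldest entry
      return true;
    }
    return false;
  }

  std::size_t max_size_;
  std::deque<Inspva> inspva_;
  std::deque<CorrImuData> corrimu_;
};
```

In this sketch, if only one of the two logs arrives, popPair never returns a pair and every push past the limit reports an overflow, which matches the symptom of overflow warnings with nothing published on the IMU topic.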

If the driver is unable to determine your IMU's internal sampling rate, that could cause this issue, although if that happens, earlier in the log it should print an error message containing "Unknown IMU Type". If that is the problem, though, I would expect it to happen every time.

Something that would be useful for debugging purposes would be to verify that your device is actually producing both of those messages; when you see that warning happen, if you stop the driver and use a tool like cutecom (for serial connections) or nc (for IP connections) to get messages from it, can you verify that it's still producing both CORRIMUDATA and INSPVA?

MaxandreOgeret commented 4 years ago

Thanks a lot for your help. It's indeed strange that it only happens sporadically. I will try to record a better bag of what is happening and also capture more complete logs.

We use the driver by giving it the NovAtel's IP address, so I guess I should use nc, but I have never used this program before. Do you have a tutorial I could follow to check whether it's still producing both CORRIMUDATA and INSPVA?

I connect to the NovAtel with nc:

nc 192.168.19.1 3001

Then I log CORRIMUDATA and INSPVA:

log CORRIMUDATA
or 
log INSPVA

And I should see output for both, is that right?

pjreed commented 4 years ago

I don't think the device resets its logging settings if the TCP connection is dropped, so you should be able to start the driver, stop it, and then use nc 192.168.19.1 3001 and immediately see what it is logging. If that works, it would be interesting to see if there's a difference in what is being received when the driver is working properly vs. when it's printing those warning messages.

If you need to manually start the logging, try:

log corrimudataa ontime 0.01
log inspvaa ontime 0.01

That should cause both of them to be printed at 100 Hz.
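If the nc output is saved to a file, a small helper can check whether both logs are present. The sketch below assumes that each ASCII log line begins with "#", then the log name, then a comma; the function countLogs and the sample capture string in the usage note are hypothetical and contain placeholder fields, not real receiver output.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: counts the lines in `capture` that start with
// "#<NAME>," for each requested log name.
std::map<std::string, std::size_t> countLogs(const std::string& capture,
                                             const std::vector<std::string>& names) {
  std::map<std::string, std::size_t> counts;
  for (const auto& name : names) {
    assert(!name.empty());
    counts[name] = 0;  // report names that never appear as zero
  }

  std::istringstream input(capture);
  std::string line;
  while (std::getline(input, line)) {
    for (const auto& name : names) {
      const std::string prefix = "#" + name + ",";
      if (line.compare(0, prefix.size(), prefix) == 0) {
        ++counts[name];
      }
    }
  }
  return counts;
}
```

For example, calling countLogs(text, {"INSPVAA", "CORRIMUDATAA"}) on the captured text and finding a zero count for one of the names would suggest the receiver has stopped producing that log.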

MaxandreOgeret commented 4 years ago

@pjreed Thanks again for your help. I finally had time today to investigate this error further, and we get this error message as soon as we start the driver: IMU rate has not been configured; cannot produce sensor_msgs/Imu messages.

It's strange because we always start the driver with the same launch file, and imu_rate is set to 100. However, imu_sample_rate is set to -1.

Any ideas?

pjreed commented 4 years ago

Hmm, that seems strange. If imu_sample_rate is set to -1, then it should try to automatically determine your IMU's sample rate based on what it sees in the RAWIMUXA log; see this piece of code here: https://github.com/swri-robotics/novatel_gps_driver/blob/master/novatel_gps_driver/src/novatel_gps.cpp#L1338

However, I would expect it to either work or not work every time rather than intermittently. You might try manually setting the imu_sample_rate to the appropriate value for your IMU and see if that helps. My guess is that it's possible that sometimes your device is not printing the RAWIMUXA log for some reason, or there could also be a race condition somewhere in the driver if logs arrive in the wrong order; it's hard for me to test it because I don't have access to a NovAtel with an IMU at the moment.
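The fallback described above can be sketched as follows. This is an illustration, not the driver's implementation: resolveImuRate is a hypothetical helper, and the idea that a detected rate is simply absent until a RAWIMUXA log has been seen is an assumption.

```cpp
#include <optional>

// Hypothetical helper: chooses the IMU sample rate to use.
// `configured_rate` plays the role of imu_sample_rate, where a value
// that is not positive (such as -1) means "detect it automatically".
// `detected_rate` is empty until a rate has been inferred from the device.
std::optional<double> resolveImuRate(double configured_rate,
                                     std::optional<double> detected_rate) {
  if (configured_rate > 0.0) {
    return configured_rate;  // an explicit setting always wins
  }
  return detected_rate;      // may still be empty, so no IMU messages yet
}
```

Under these assumptions, an empty result corresponds to the "IMU rate has not been configured" case, and setting imu_sample_rate explicitly removes any dependence on which logs arrive and in what order.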