Closed m-h-williams closed 21 hours ago
I looked at '2024-09-10 13-24-05.tlog' from the Dropbox share, and I can see the problem, but I don't know how it happened.
Typical operation looks like this:
The stream groups are described here for reference.
You can use QGC to set the rates for each stream group. I think the default is 3Hz for EXTRA3.
Using MAVExplorer.py we can look at the what happened. QGC clearly sent the request, several times (EXTRA3 is req_stream_id 12):
MAV> dump REQUEST_DATA_STREAM
...
2024-09-10 11:31:51.745 REQUEST_DATA_STREAM {target_system : 1, target_component : 1, req_stream_id : 12, req_message_rate : 3, start_stop : 1}
...
Then I looked at the parameter to see if ArduSub stored the request:
MAV> param show *EXTRA3*
MAV> SR0_EXTRA3 0.000000
...
Nope, it was not saved. Why?
(I looked at other tlog files just to make sure that SR0_EXTRA3 is the correct parameter, and I can clearly see that it is 3.0 for a normal tlog file.)
This log shows very strange behavior at 2024-09-10 11:52:57: ArduSub gets into a weird jag where it spits out a zillion "intro" messages, which are typically caused by 2 things: ArduSub boots up (or reboots), or when QGC starts and says "hello, who are you?" and ArduSub replies. Perhaps ArduSub did not boot correctly? I also see a 3 minute gap in the messages, so it looks like communications went down for a bit. A puzzle.
To avoid this in the future, you could look for SYSTEM_TIME messages in QGC to make sure they are coming and look good. If not, restart the entire system.
I do think you can recover from this: QGC records the laptop time when each message was written to the tlog file in the tlog file. This is read by mavutil, so you can access this in Python: timestamp = getattr(msg, '_timestamp', 0.0)
. Units are seconds. This isn't quite the same as the Pi time when the message was sent, but it should be pretty close.
Here is a graph comparing SYSTEM_TIME.time_unix_usec / 1e6 vs _timestamp. (From a good log.) They are pretty close, but they do drift apart to 0.4s over 25 minutes:
I hope this helps.
Thank you so much for digging into this and providing such a detailed response, Clyde! I agree that ArduSub's lag in this case is a bit of a puzzle, especially considering the ROV appeared to be functioning normally during these surveys.
Checking the SYSTEM_TIME
messages in QGC while surveying is a straightforward and practical way to prevent this from happening again—I’ll make sure to incorporate that into our workflow.
I’m also relieved to hear you think the logs can still be salvaged! I’ll test the recovery suggestion using _timestamp
in Python to align the events in the tlog with the laptop time. Given that the drift (0.4s over 25 minutes) is within acceptable limits for our purposes, this should work well as a fallback for analyzing problematic logs.
Thanks again for your help—I really appreciate it!
Using _timestamp
indeed worked for these files. Thanks again for the suggestion, Clyde! I uploaded a separate script (tlog_csv_timestamp.py) for these instances.
@clydemcqueen I'm hoping you might have some insight on this issue.
It appears that the
SYSTEM_TIME
message, which I’ve been using to collect date and time for surveys, is missing from the.tlog
files for two survey days (09/10/2024 and 09/11/2024). Without these timestamps, I wasn’t able to convert the.tlog
files to CSV using the (tlog_transects_to_csv.py).To work around this, I attempted to gather timestamps from the GPS data, as the UAV log viewer shows the GPS coordinates were populated correctly. However, the system time appears to be missing from the GPS data (
GPS_RAW_INT
) as well.I used the following script to inspect the timestamp information:
Output:
It seems like the
GPS_RAW_INT
timestamps are relative to the system boot time rather than a real-world time. Unfortunately we don't have the start time for these surveys as they were completed prior to us standardizing the method of recording the start and end time in our field notebook.Do you have any idea as to why we would be missing time data from these logs so that we can avoid this in the future?