Analyze missing messages (if any) in the recordings

sbrodeur commented 7 years ago

Create a script that parses the rosbag files, estimates the number of missing messages based on the mean sampling rate, and outputs some statistics.

See the script below for an example of reading a rosbag file: https://github.com/sbrodeur/ros-icreate-bbb/blob/master/src/action/scripts/record/data/convert_rosbag.py
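A minimal sketch of that idea, assuming the standard `rosbag` Python API (`rosbag.Bag` and `read_messages()`, which yields `(topic, msg, t)` tuples). The function names `collect_timestamps` and `mean_rate` are hypothetical and are not taken from the linked script.

```python
from collections import defaultdict


def collect_timestamps(bag_path):
    """Return {topic: [receive time in seconds, ...]} for one rosbag file."""
    import rosbag  # imported lazily so mean_rate() is usable without ROS

    stamps = defaultdict(list)
    bag = rosbag.Bag(bag_path)
    try:
        # read_messages() yields (topic, message, rosbag receive time).
        for topic, _msg, t in bag.read_messages():
            stamps[topic].append(t.to_sec())
    finally:
        bag.close()
    return dict(stamps)


def mean_rate(timestamps):
    """Mean sampling rate in Hz from a sorted list of timestamps in seconds."""
    if len(timestamps) < 2:
        raise ValueError("need at least two timestamps")
    duration = timestamps[-1] - timestamps[0]
    return (len(timestamps) - 1) / duration
```

Calling `mean_rate` on each list returned by `collect_timestamps` would give the per-topic rate that the missing-message estimate is based on.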

sgcarrier commented 7 years ago

Did some analysis with the new Python script. Tested on S4, S3 and S2 of the recordings in C1-4016. Note that S4 was controlled and recorded from a laptop.

In all topics, fewer than 2% of packets were considered 'dropped' (here, a message counts as dropped when the time between two consecutive messages exceeds the mean plus 3 sigma).

Here is the histogram for the right video (which was being streamed while it was being recorded) of S4 in room C1-4016. The black line is the mean and the red lines are the standard deviation. The Y axis is the number of messages and the X axis is the time between messages: histogram_video_right_s4_c14016
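The mean-plus-3-sigma rule could be sketched as follows. This is an illustration only, not the actual script: the function name `count_dropped_3sigma` is hypothetical, and the use of the population standard deviation is an assumption.

```python
import statistics


def count_dropped_3sigma(timestamps):
    """Count intervals longer than mean + 3 standard deviations.

    Returns (number of flagged intervals, total number of intervals).
    """
    intervals = [b - a for a, b in zip(timestamps, timestamps[1:])]
    mean = statistics.mean(intervals)
    sigma = statistics.pstdev(intervals)  # population std: an assumption
    limit = mean + 3.0 * sigma
    dropped = sum(1 for dt in intervals if dt > limit)
    return dropped, len(intervals)
```

Note that long gaps inflate both the mean and sigma, so with few samples or many gaps this rule may flag fewer intervals than expected.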

sbrodeur commented 7 years ago

The bins are quite large, which makes the visual analysis difficult. Can you try with 50 times more bins?

sgcarrier commented 7 years ago

The previous one used 10 bins; here it is with 300 bins: histogram_video_right_s4_c14016

sbrodeur commented 7 years ago

It would be nice to create a text report for each recorded session, to store in a file alongside the rosbag (with extension ".stats"). I was able to generate a nice table with the Python package tabulate: https://pypi.python.org/pypi/tabulate

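from tabulate import tabulate

# One row per topic: name, sample count, mean rate, std, dropped count.
rows = [["/audio/left/raw", "10232", "10.23", "2.31", "43"],
        ["/imu/data", "754", "5.73", "1.04", "13"]]
headers = ["Data", "Total number of samples", "Average frequency rate in Hz", "std", "Number of dropped samples"]
# print() with a single argument works under both Python 2 and Python 3.
print(tabulate(rows, headers=headers, tablefmt="grid", numalign="right", stralign="left", floatfmt=".2f"))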

This gives me this output:

+-----------------+---------------------------+--------------------------------+-------+-----------------------------+
| Data            |   Total number of samples |   Average frequency rate in Hz |   std |   Number of dropped samples |
+=================+===========================+================================+=======+=============================+
| /audio/left/raw |                     10232 |                          10.23 |  2.31 |                          43 |
+-----------------+---------------------------+--------------------------------+-------+-----------------------------+
| /imu/data       |                       754 |                           5.73 |  1.04 |                          13 |
+-----------------+---------------------------+--------------------------------+-------+-----------------------------+

sgcarrier commented 7 years ago

Here is an example file for S4:

E1_C14016_S4_20161004_180848.stat.txt

It's normal that /irobot_create/cmd_raw is considered to have a high drop rate, because it is asynchronous.

sbrodeur commented 7 years ago

I think we could remove the column 'mean' since it is redundant with the average rate, and the standard deviation should be in Hz (i.e. the same unit as the average rate). We can also remove non-data topics such as: /rosout, /rosout_agg, /tf, /irobot_create/cmd_raw, /video/left/camera_info, /video/right/camera_info. There could be a flag passed to the analysis script to ignore those. Those topics will be filtered out when exporting data into HDF5 format.
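One possible shape for such a flag, sketched with `argparse`. The flag name `--ignore`, the helper names, and the default list are hypothetical; the last default entry assumes the topic is spelled `/video/right/camera_info`, like its left counterpart.

```python
import argparse

# Topics named in this discussion as carrying no sensor data.
DEFAULT_IGNORED = [
    "/rosout", "/rosout_agg", "/tf", "/irobot_create/cmd_raw",
    "/video/left/camera_info", "/video/right/camera_info",
]


def parse_args(argv=None):
    parser = argparse.ArgumentParser(
        description="Estimate dropped messages in a rosbag")
    parser.add_argument("bag", help="path to the rosbag file")
    # Hypothetical flag: topics to leave out of the report.
    parser.add_argument("--ignore", nargs="*", default=DEFAULT_IGNORED,
                        metavar="TOPIC")
    return parser.parse_args(argv)


def keep_topics(topics, ignored):
    """Return the topics that are not in the ignore list, preserving order."""
    ignored = set(ignored)
    return [t for t in topics if t not in ignored]
```

For example, `parse_args(["rec.bag", "--ignore", "/tf"])` would ignore only `/tf`, while omitting the flag falls back to the default list.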

Is the average rate calculated from the timestamps of the rosbag (i.e. read_messages()), or the timestamps of the header (i.e. msg.header.stamp) for data messages?

How is the estimated number of dropped messages calculated?

sgcarrier commented 7 years ago

I'm using read_messages for the timestamps. Also, I count a message as dropped if the time since the previous message exceeds the mean time by more than 1.1 times the mean (i.e. if time > mean + mean*1.1). I added the possibility to remove topics through command-line arguments, but I'm having trouble with the standard deviation in frequency. Will commit after that's solved.
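One way to get a standard deviation in Hz, offered as a sketch rather than as what the script ended up doing: convert each interval to an instantaneous rate (1 / interval) and take the statistics of those rates. The function name `rate_stats` is hypothetical.

```python
import statistics


def rate_stats(timestamps):
    """Return (mean rate, std of rate) in Hz from timestamps in seconds."""
    intervals = [b - a for a, b in zip(timestamps, timestamps[1:])]
    # Instantaneous rate of each interval; zero-length intervals are skipped.
    rates = [1.0 / dt for dt in intervals if dt > 0]
    return statistics.mean(rates), statistics.pstdev(rates)
```

The mean of the instantaneous rates is generally not equal to 1 / (mean interval), so the report should state which of the two definitions it uses.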

sbrodeur commented 7 years ago

We should use the capture timestamps (msg.header.stamp) when they are available, which is the case for all data topics.
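A small helper along these lines could pick the timestamp per message. It is a sketch: the name `capture_time` is hypothetical, and treating a zero stamp as "not set" is an assumption.

```python
def capture_time(msg, bag_time):
    """Return the message time in seconds, preferring the header stamp.

    msg is a ROS message; bag_time is the time yielded by read_messages().
    """
    header = getattr(msg, "header", None)
    stamp = getattr(header, "stamp", None)
    # Assumption: a zero stamp means the publisher never filled it in.
    if stamp is not None and stamp.to_sec() > 0:
        return stamp.to_sec()
    # No usable header stamp: fall back to the rosbag receive time.
    return bag_time.to_sec()
```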

sgcarrier commented 7 years ago

Here is the graph for S4 using the messages' timestamps. It's almost too perfect: the error never goes past 0.3% in any topic, and a lot of topics have 0 errors.

histogram_video_right_s4_c14016_msgtime

sbrodeur commented 7 years ago

This is because the rosbag timestamps depend on the time at which messages were received during recording, which can include buffering and network delay. The message timestamps from the headers indicate the capture time at the source, and are thus not affected by that. If it seems perfect, then this means that the Beaglebone Black is not overloaded and is able to keep the data acquisition synchronous. This is good news.

sbrodeur commented 7 years ago

Here is an output example with the latest script (see commit a32b723):

stats

This is with a drop threshold of 1.0, meaning a sample is flagged as dropped if period > 2 * average period. Only the audio seems problematic, but since the recording depends on an internal buffer, the timestamps may not be accurate. We should check whether this results in degraded audio. The problem is to be addressed by issue #20.
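The threshold rule can be written as period > (1 + threshold) * average period. The sketch below illustrates it; the function names are hypothetical, and the estimate of how many messages fit inside a long gap is an assumption, not necessarily what commit a32b723 does.

```python
def count_dropped(periods, threshold=1.0):
    """Count periods longer than (1 + threshold) times the mean period."""
    mean = sum(periods) / len(periods)
    limit = (1.0 + threshold) * mean
    return sum(1 for p in periods if p > limit)


def estimate_missing(periods, threshold=1.0):
    """Estimate how many messages are missing inside the over-long periods."""
    mean = sum(periods) / len(periods)
    limit = (1.0 + threshold) * mean
    # Assumption: a gap of about k mean periods hides k - 1 messages.
    return sum(int(round(p / mean)) - 1 for p in periods if p > limit)
```

With `threshold=1.0` the limit is twice the average period, matching the setting described above.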

Note: this is using capture timestamps from message headers.

sbrodeur / ros-icreate-bbb

Analyze missing messages (if any) in the recordings #17