ouster-lidar / ouster-ros

Official ROS drivers for Ouster sensors (OS0, OS1, OS2, OSDome)
https://ouster.com
Other
116 stars 139 forks source link

Huge delay (about 2 years) in ros messages timestamps when using PTP #366

Open mcamurri opened 5 days ago

mcamurri commented 5 days ago

Describe the bug When the lidar is set in PTP mode, the timestamps produced have a fixed offset of 74986812 seconds (about 2 years) on a correctly configured machine with a PTP supported NIC.

The output of lspci -nnk | grep -A2 Ethernet is:

2d:00.0 Ethernet controller [0200]: Intel Corporation Device [8086:125b] (rev 04)
    Subsystem: ASRock Incorporation Device [1849:125b]
    Kernel driver in use: igc
--
2f:00.0 Ethernet controller [0200]: Intel Corporation Device [8086:125b] (rev 04)
    Subsystem: ASRock Incorporation Device [1849:125b]
    Kernel driver in use: igc

The output of ethtool -T enp45s0 is:

Time stamping parameters for enp45s0:
Capabilities:
    hardware-transmit     (SOF_TIMESTAMPING_TX_HARDWARE)
    software-transmit     (SOF_TIMESTAMPING_TX_SOFTWARE)
    hardware-receive      (SOF_TIMESTAMPING_RX_HARDWARE)
    software-receive      (SOF_TIMESTAMPING_RX_SOFTWARE)
    software-system-clock (SOF_TIMESTAMPING_SOFTWARE)
    hardware-raw-clock    (SOF_TIMESTAMPING_RAW_HARDWARE)
PTP Hardware Clock: 0
Hardware Transmit Timestamp Modes:
    off                   (HWTSTAMP_TX_OFF)
    on                    (HWTSTAMP_TX_ON)
Hardware Receive Filter Modes:
    none                  (HWTSTAMP_FILTER_NONE)
    all                   (HWTSTAMP_FILTER_ALL)

The Ouster diagnostic and ptp config files are attached here: diagnostic_and_config.zip

To Reproduce

  1. connect the lidar to the ethernet port (in my case enp45s0) and set it to link local mode

  2. On one terminal, run the ptp4l utility (config file is attached):

    sudo ptp4l -m -f /etc/linuxptp/ptp4l.conf

    the output is as follows:

    ptp4l[1817.658]: selected /dev/ptp0 as PTP clock
    ptp4l[1817.658]: port 1: INITIALIZING to LISTENING on INIT_COMPLETE
    ptp4l[1817.658]: port 0: INITIALIZING to LISTENING on INIT_COMPLETE
    ptp4l[1825.211]: port 1: LISTENING to MASTER on ANNOUNCE_RECEIPT_TIMEOUT_EXPIRES
    ptp4l[1825.211]: selected local clock 9c6b00.fffe.0118e9 as best master
    ptp4l[1825.211]: assuming the grand master role
  3. On another terminal, run the phc2sys to sync the NIC to the CLOCK_REALTIME:

    sudo phc2sys -w -m -s CLOCK_REALTIME -c enp45s0 -O 0

    Example output:

    phc2sys[1900.970]: enp45s0 sys offset        -8 s2 freq  -24787 delay   4992
    phc2sys[1901.970]: enp45s0 sys offset        -7 s2 freq  -24788 delay   5715
    phc2sys[1902.971]: enp45s0 sys offset        25 s2 freq  -24758 delay   5721
    phc2sys[1903.971]: enp45s0 sys offset        10 s2 freq  -24766 delay   5709
    phc2sys[1904.972]: enp45s0 sys offset         8 s2 freq  -24765 delay   5728
    phc2sys[1905.972]: enp45s0 sys offset       -12 s2 freq  -24782 delay   5728
    phc2sys[1906.973]: enp45s0 sys offset        54 s2 freq  -24720 delay   5722
    phc2sys[1907.973]: enp45s0 sys offset       -21 s2 freq  -24779 delay   5696
    phc2sys[1908.974]: enp45s0 sys offset       -12 s2 freq  -24776 delay   5696
    phc2sys[1909.974]: enp45s0 sys offset        71 s2 freq  -24697 delay   4115
  4. On a third terminal, run the compiled driver:

    roslaunch ouster_ros driver.launch sensor_hostname:=os-122415000093.local
  5. Monitor the delay:

    rostopic delay /ouster/imu

    example output is:

    subscribed to [/ouster/imu]
    average delay: 74986812.741
    min: 74986812.494s max: 74986813.430s std dev: 0.30612s window: 181
    average delay: 74986812.653
    min: 74986812.494s max: 74986813.430s std dev: 0.27229s window: 281
    average delay: 74986812.611
    min: 74986812.494s max: 74986813.430s std dev: 0.24388s window: 381
    average delay: 74986812.587
    min: 74986812.494s max: 74986813.430s std dev: 0.22207s window: 481
    average delay: 74986812.571

    which is consistent with the timestamps, which are in the order of 165xxxxxxx nanoseconds while the actual ones from all other sensors are 172xxxxxxx

The PTP syncing is correct as evidenced by the outputs above, but as double check the output of the CLOC_REALTIME and the NIC itself are also in sync:

phc_ctl CLOCK_REALTIME get
phc_ctl[3557.270]: clock time is 1726158140.991257534 or Thu Sep 12 18:22:20 2024

sudo phc_ctl enp45s0 get
phc_ctl[3557.275]: clock time is 1726158140.996777650 or Thu Sep 12 18:22:20 2024

Screenshots Configuration on the dashboard (everything was left untouched except for the PTP timestamp mode on the Ouster): image image

Platform (please complete the following information):

Samahu commented 4 days ago

Hi @mcamurri,

The ouster-ros driver currently adds -37 seconds (by default) to the timestamps values returned when PTP mode is used but you can easily prevent this by resetting the launch parameter ptp_utc_tai_offset to zero. Other than that all the time values reported are mainly coming from the sensor. So a difference of 2 years suggests that the sensor isn't running in Sync. This may happen when:

Can you examine the the output of GET /api/v1/time/ptp and check the status and reported timestamps.

Also since this doesn't appear to be a driver problem it might be a good question to post on Ouster https://community.ouster.com.

mcamurri commented 1 day ago

I actually thought of the opposite, that there is some problem with the driver (perhaps a mismatch with the SDK since it seems old).

I'm not using a ethernet-usb adapter, and I have used the same computer with many other ouster devices before with no problem. Further, as shown by the outputs, the PTP is working correctly on the machine.

I'll ask someone to the lab to post the output of the HTTP call you suggested.

mcamurri commented 1 day ago

As you suggested, the issue has been reposted here: https://community.ouster.com/t/huge-delay-about-2-years-in-ros-messages-timestamps-when-using-ptp/195

Samahu commented 17 hours ago

I actually thought of the opposite, that there is some problem with the driver (perhaps a mismatch with the SDK since it seems old).

No the SDK version shouldn't impact the PTP as the SDK C++ client wasn't updated for the last two releases. In any case, the SDK client will be updated to a more a recent version as part of #363

I rarely use PTP in my daily development routine so I can't tell for certain that the ouster-ros driver is at fault here, but this part of the code wasn't much altered since it was introduced. As I noted above the driver merely forward the timestamp it receives from the sensor. The only thing that the driver does in addition to the SDK is by default it subtracts 37 seconds when PTP mode is in use but this can be easily eliminated by passing ptp_utc_tai_offset:=0 to the launch command. This is done to align the reported time with the machine time which typically uses UTC.