zephyrproject-rtos / zephyr

Primary Git Repository for the Zephyr Project. Zephyr is a new generation, scalable, optimized, secure RTOS for multiple hardware architectures.
https://docs.zephyrproject.org
Apache License 2.0

Bluetooth: HCI Controller to Host ISO data timestamp synchronization #57571

Open cvinayak opened 1 year ago

cvinayak commented 1 year ago

Is your feature request related to a problem? Please describe.

Multiple headsets need to synchronize to the gateway clock... and play back relative to it using the presentation delay.

The two problems, one on each application side (gateway and headset), are:

  1. the gateway synchronizes its audio source sampling clock with the controller radio timing using the HCI Tx Sync command
  2. the headset synchronizes its playback sink audio PLL using the Rx timestamp from the controller

Describe the solution you'd like

Solution alternatives:

  1. On CIS establishment (because the APP core gets ISO data with timestamps thereafter), the NET core triggers a real-time signal (nRF PPI/DPPI) and the APP core captures the System Timer (NRF_RTC) counter value (by subscribing). The application reads the captured NRF_RTC counter value in the first ISO data receive callback; this is the coarse tick, with remainder 0, that is in sync with the Controller's NRF_RTC-based timestamp. The difference between the captured value (in microseconds) and the ISO data timestamp can be used as the presentation delta value to trigger audio playback (TASKS_START of NRF_I2S). The problem with RTC CAPTURE is that there will be up to +30 us jitter when a 1 us resolution instant is captured at RTC tick resolution. Instead, it is better to use an independent NRF_RTC that gets started/cleared by the PPI, which is likewise captured at NRF_RTC tick resolution (but then we still need a solution for the +30 us remainder!)
  2. TBD
  3. TBD
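The +30 us capture-jitter concern in alternative 1 can be made concrete with a little arithmetic. A minimal sketch, assuming a 32768 Hz LFCLK; the helper names are illustrative, not Zephyr or nrfx APIs:

```c
#include <stdint.h>

/* 32768 Hz LFCLK: one RTC tick is 1000000/32768 us (~30.5 us). */
#define LFCLK_HZ 32768U

/* Illustrative helpers (not Zephyr/nrfx APIs): convert between
 * microseconds and RTC ticks, truncating as the hardware counter does. */
static inline uint32_t us_to_rtc_ticks(uint32_t us)
{
	return (uint32_t)(((uint64_t)us * LFCLK_HZ) / 1000000U);
}

static inline uint32_t rtc_ticks_to_us(uint32_t ticks)
{
	return (uint32_t)(((uint64_t)ticks * 1000000U) / LFCLK_HZ);
}

/* Error introduced when a microsecond instant is captured at RTC tick
 * resolution: the instant is quantized down to the previous tick edge,
 * so up to a full tick (~30 us) is lost. */
static inline uint32_t capture_jitter_us(uint32_t us)
{
	return us - rtc_ticks_to_us(us_to_rtc_ticks(us));
}
```

An instant just before a tick edge loses almost a full tick, while an instant on a tick edge loses nothing, which is the +30 us range described above.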

Additional context

21st Feb 2023:

vich: I believe the Rx timestamp is used to drift-compensate the playback. There is no sync on the gateway either for now; here I believe the HCI Tx Sync command will be used by an external host to drift-compensate.

rubin: Controller is using the LFCLK for time management, and that clock drifts relative to the clock in the app core. So I am under the impression that this needs to be handled somehow - the timestamps in the packets are timestamps in the network core, and those cannot necessarily be used on the app core.

vich: The app core knows the expected timestamp delta (ISO interval), and the actual timestamp delta comes from the network core. The delta between those two deltas is what is used to calibrate the audio PLL. That is my understanding of the drift compensation code in the audio application.

rubin: That is true. But in order to play back audio at the same time on multiple devices, the app core needs to know the absolute time, and not only the delta, right? Maybe this is not really a problem. Could be that the first SDU received sets the reference correctly.

vich: The multiple headsets are synchronized to the gateway clock... and they play relative to it using the presentation delay. The two problems, both on the application side, are:

  1. the gateway synchronizes its sampling clock using the HCI Tx Sync command
  2. the headset synchronizes its playback audio PLL using the Rx timestamp

rubin: Looking at (2) only for now. Here is a scenario I believe illustrates the complexity when the host and controller do not share a common clock source.

  1. Controller sends up SDUs with timestamp T_C + N * SDU_Interval
  2. Host receives this SDU and captures a timestamp T_H.  T_H = T_C + delay_c_to_h
  3. First packet: Host needs to playback this SDU at T_Presentation = T_C + Presentation_Delay = T_H - delay_c_to_h + Presentation_Delay
  4. Following packets: Host needs to playback this SDU at T_Presentation_Prev + N * SDU_Interval

The problem here is step 3. There needs to be some knowledge of the difference between the clocks in the controller and the host. Once this is known, the rest is simple (step 4).
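The steps above reduce to two small formulas. A sketch only (all values in microseconds); the clock offset `delay_c_to_h` is assumed to have been estimated separately, and is exactly the unknown this thread is about:

```c
#include <stdint.h>

/* Step 3, first packet:
 * T_Presentation = T_H - delay_c_to_h + Presentation_Delay */
static inline uint64_t first_presentation(uint64_t t_h, int64_t delay_c_to_h,
					  uint64_t presentation_delay)
{
	return (uint64_t)((int64_t)t_h - delay_c_to_h +
			  (int64_t)presentation_delay);
}

/* Step 4, following packets:
 * T_Presentation = T_Presentation_Prev + N * SDU_Interval */
static inline uint64_t next_presentation(uint64_t t_presentation_prev,
					 uint64_t n, uint64_t sdu_interval)
{
	return t_presentation_prev + n * sdu_interval;
}
```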

vich: Yes, the delta between the Host Clock and Controller Clock is required to apply the presentation delay. As a gateway, LE Read ISO Tx Sync can be used to get, for a transmitted Packet Sequence Number, delta = T_C - T_H.  For a headset, maybe capture T_H on the first Rx timestamp?

vich: For precision, on nRF53 it will have to be a DPPI that captures the host timer on every LE CIS Established. On external hosts, the time can be captured on the HCI transport. Headsets from two different manufacturers will have L/R sync mismatch unless the host time is captured by some means by the Controller.

rubin: I believe this is a problem. "Two different manufacturers" may in this case also be "two different SW revisions from the same manufacturer".

vich: Yes, unless the time is captured in real time, i.e. on nRF53 using DPPI. If using soft real time, then every SW revision needs a calibration timing to compensate for changes in the delays over the Controller-to-Host HCI transport. Say, with HCI_UART, the UART baud rate will have to be considered by the host when capturing the time based on CIS Established or the first Rx ISO data.

rubin: Also, I'm wondering if it is a problem that the time between the app and network core drifts. Let's say we have an SDU interval of 100, and the controller puts the timestamps 101, 202, 303, 404, etc. in the SDUs to the host. The app core can then know that the network core clock drifts by 1 unit per SDU interval relative to the peer device. It does not know how much the network core drifts relative to the app core. The app core needs to be aware of this in order to play the audio correctly. If the app core clock runs too fast, it will sometimes drop a frame.
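The drift figure in this example (1 unit per SDU interval) falls directly out of the received timestamps. A hedged sketch with illustrative names, not application code:

```c
#include <stddef.h>
#include <stdint.h>

/* Estimate the net core's drift relative to the peer from received SDU
 * timestamps, by comparing the observed time span against the nominal
 * SDU interval. Returns drift in timestamp units per SDU interval. */
static int64_t drift_per_interval(const uint32_t *timestamps, size_t n,
				  uint32_t sdu_interval)
{
	if (n < 2) {
		return 0;
	}

	/* Observed span minus expected span, averaged per interval. */
	int64_t observed = (int64_t)timestamps[n - 1] - (int64_t)timestamps[0];
	int64_t expected = (int64_t)(n - 1) * sdu_interval;

	return (observed - expected) / (int64_t)(n - 1);
}
```

As rubin notes, this only yields the net-core-to-peer drift; the app-core-to-net-core drift is invisible to this computation.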

rubin: How would you know the drift between app and network core?

vich: This will have to be over a period of time, e.g. using the watermark level in the audio buffer: if the buffers hit the high watermark level, speed up the host clock, and vice versa.  (I am thinking aloud here, not verified.)

rubin: Yeah. It is likely possible to do something along those lines. We probably need to think about something that works nicely. Maybe something as simple as triggering a DPPI every Nth RTC tick can do the trick.

vich: Remember, it's the audio clock in the host that has to sync with the audio clock on the gateway.

rubin: Yes. And in order to do that, it needs to know the drift between the gateway and the network core, but also between the network core and the app core.

vich: Gateway-to-network-core drift is reflected in the Rx timestamp. Gateway-to-host drift would be reflected in the audio buffers. Could just the audio buffers be used as the sync between gateway and host? Am I simplifying it too much?

rubin: Maybe, I don't know yet. I guess this depends a bit on how it is implemented, e.g. whether we allow packets to be dropped or not.

vich: Worst-case drift will need to account for both the watermark and under/overflow (drop/silence).
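The watermark idea from this exchange (explicitly think-aloud, not verified) could be sketched as a simple bang-bang controller. The thresholds and ppm step below are made-up illustrative values:

```c
#include <stdint.h>

/* Steer the host audio clock from the playback buffer fill level:
 * buffers above the high watermark mean the host consumes too slowly
 * (speed the clock up); below the low watermark, too quickly (slow it
 * down). Returns a clock adjustment in ppm; values are illustrative. */
static int32_t clock_adjust_ppm(uint32_t fill_level, uint32_t low_wm,
				uint32_t high_wm, int32_t step_ppm)
{
	if (fill_level > high_wm) {
		return +step_ppm; /* hit high watermark: speed up */
	}
	if (fill_level < low_wm) {
		return -step_ppm; /* hit low watermark: slow down */
	}
	return 0;
}
```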

29th March 2023:

vich: A PPI could be designed as the signal, with a duplicate NRF_RTC on the APP core running on the same LFCLK, and an NRF_TIMER used for 1 us resolution. Would this work?

rubin: I would assume that would work. And that is possibly the only reasonable solution...

vich: We would need two PPI channels, right? One for starting the RTC on the app core, and one to clear the TIMER when the timer on the network core is cleared

vich: no, one PPI that will start the RTC and keep it running... the timestamps we get are relative... we can keep the RTC on the app core free-running, and this PPI clears it on the first ISO Rx PDU on that CIS

rubin: How would we know when the TIMER0 is cleared on the net core then?

vich: the PPI has to be signalled on an RTC compare on the net core; we do not need the TIMER0 info. Once the RTCs are in sync, the app core will use the timestamp to realize the presentation from the RTC...

31st March 2023:

vich: Hi Rubin, I was busy this week with some fixes. I think without diagrams it could be difficult to understand, but if you want, we can update your diagrams together to work out how a PPI would clear/start the NRF_RTC on the APP core, and how the timestamp of the first ISO data would be the delta in microseconds between the NRF_RTC on the APP core and the successive ISO data timestamps received over HCI.

rubin: Yeah, we can do that. I also thought that we could make it even simpler -> let the app core run both an RTC and a TIMER. The app core can then determine the drift itself by comparing the values of the two.

vich: Yes, just having an RTC and a TIMER in the APP core is also a solution. Is an NRF_TIMER required by I2S?

rubin: That I don't know. Maybe it is only used to check if there is sufficient time available to render the audio

vich: Just to be clear, we are finding a solution for presentation, right? (The audio PLL, i.e. drift compensation, is solved using the timestamp alone.) Once we have an RTC running that is in sync with the NET core RTC, we can use the NRF_RTC + NRF_TIMER for microsecond presentation to trigger NRF_I2S->TASKS_START, right? I do not see a need for a continuously running NRF_TIMER on the APP core.

rubin: Yes, you are right. To me it is unclear how we can achieve microsecond accurate presentation on the app core without running a TIMER

vich: In the Zephyr Controller, microsecond timing is triggered using a coarse tick plus a remainder in femtoseconds (+/- 15 us range): the NRF_RTC provides the coarse tick, and then a PPI starts the NRF_TIMER for the remainder.
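The coarse-tick-plus-remainder split can be illustrated as follows. A sketch assuming a 32768 Hz LFCLK, with illustrative names rather than the Zephyr controller's actual HAL; rounding to the nearest tick keeps the remainder within roughly +/-15 us, matching the range mentioned above:

```c
#include <stdint.h>

#define LFCLK_HZ 32768U

/* Split a microsecond target into an RTC coarse tick count plus a
 * signed sub-tick remainder. One tick is ~30.5 us, so rounding to the
 * nearest tick bounds the remainder to roughly +/-15 us; the NRF_TIMER
 * only has to cover that remainder. */
static void us_to_tick_and_remainder(uint32_t us, uint32_t *ticks,
				     int32_t *rem_us)
{
	/* Round to the nearest tick so the remainder is minimal. */
	uint64_t t = ((uint64_t)us * LFCLK_HZ + 500000U) / 1000000U;

	*ticks = (uint32_t)t;
	*rem_us = (int32_t)us - (int32_t)((t * 1000000U) / LFCLK_HZ);
}
```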

rubin: Yeah, ok. Then we are on the same page. We need the NRF_TIMER for those +- 15 us

vich: The NRF_TIMER is started by PPI/DPPI on the NRF_RTC's COMPARE. We need the NRF_TIMER, but it only has to run for the last +/-15 us. The NRF_RTC also only needs to be started when the CIS is established, and stopped on CIS termination.

rubin: Can't we just ensure they are started at the same time on the app and network core?

vich: Running the NRF_RTC all the time without a CIS would waste current, right?

rubin: Starting when a CIS is established becomes complex in a multi-CIS scenario, as a new CIS may be established at approximately the same time as another is torn down.

rubin: Does it matter when the SYNC signal is sent? As long as the app core captures it, it knows the offset between the app and net core RTC

vich: no, it can be done every time the Controller RTC is started... but we cannot sync to the system timer on the APP core; it will have to be a dedicated NRF_RTC

rubin: I see. There is no way of capturing an RTC value 

vich: We can capture the System Timer (NRF_RTC). This is a proposal: on CIS establishment (because the APP core gets ISO data with timestamps thereafter), the NET core triggers a PPI and the APP core captures the System Timer NRF_RTC's counter value. The application reads the captured NRF_RTC value in the first ISO data receive callback; this is the coarse tick, with remainder 0, that is in sync with the Controller's NRF_RTC-based timestamp.

vich: This is the proposal:

Zephyr System Timer (NRF_RTC) on nRF53 series APP core

This is a proposal: on CIS establishment (because the APP core gets ISO data with timestamps thereafter), the NET core triggers a PPI and the APP core captures the System Timer NRF_RTC's counter value (by subscribing). The application reads the captured NRF_RTC value in the first ISO data receive callback; this is the coarse tick, with remainder 0, that is in sync with the Controller's NRF_RTC-based timestamp.

The difference between the captured value (in microseconds) and the ISO data timestamp can be used as the presentation delta value to trigger TASKS_START of NRF_I2S.
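The delta computation in the proposal, as a sketch with illustrative names; unsigned 32-bit microsecond arithmetic also handles timestamp wraparound:

```c
#include <stdint.h>

/* The captured RTC value, converted to microseconds, marks "Controller
 * time now" as seen by the APP core; the SDU timestamp plus the
 * presentation delay is the wanted render instant; their unsigned
 * difference is the delay from the capture point to TASKS_START of
 * NRF_I2S. Unsigned subtraction makes this robust against 32-bit
 * microsecond-counter wraparound. */
static inline uint32_t i2s_start_delay_us(uint32_t captured_us,
					  uint32_t sdu_timestamp_us,
					  uint32_t presentation_delay_us)
{
	return sdu_timestamp_us + presentation_delay_us - captured_us;
}
```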

Comments welcome.

vich: The only problem with RTC CAPTURE is that there will be up to +30 us of jitter. Instead, it is better to use an independent NRF_RTC that gets started/cleared by the PPI.

cvinayak commented 1 year ago

Initial PoC: (image)

A presentation with implementation snippets: Host Controller Time Sync 20230620.pdf