leifclaesson / ESP1588

IEEE-1588 Precision Time Protocol (PTP) client for ESP8266/ESP32
GNU General Public License v3.0

Microseconds #2

Open PortalHearts opened 1 year ago

PortalHearts commented 1 year ago

Good library here. I appreciate that someone has written it.

However, the README says +/- 1 ms. I assume this means milliseconds, not microseconds? Also, your code makes use of millis() instead of micros().

On an ESP32-S3 I measure micros() 1.397 times faster than millis(). I think this may reduce the precision.

Are you able to rewrite this using micros() to get microsecond precision rather than millisecond?

leifclaesson commented 1 year ago

Hi there! I saw your e-mail, how cool that someone is using my library!

Regarding microsecond precision, the problem isn't the lack of a micros() call, the problem is on the network side.

PTP clock packets are multicast packets. On wired networks, multicast and broadcast packets are sent the same way and with the same precision as normal (unicast) packets.

But on WiFi, multicast packets are buffered and sent out at the DTIM (Delivery Traffic Indication Message). The DTIM, in turn, is sent either with every beacon (DTIM Interval 1) or with every third beacon (DTIM Interval 3). The beacon is sent out by the access point once every 102.4 milliseconds (slightly less than 10 times per second) and contains, for example, the SSID and the encryption type. The beacon is what your phone and laptop listen for in order to populate the list of wireless networks.

So, what is the problem?

Well, because multicast packets are only sent once every DTIM interval (at worst only about three times a second!), the packets get delayed by a huge and variable amount. It's an insane amount of jitter. Averaging would only make it worse, because the delay is only ever positive, never negative, obviously (the access point can't send packets out before it has received them).
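To put numbers on that, here is a minimal sketch of the arithmetic. It assumes the 102.4 ms beacon interval quoted above (100 time units of 1024 microseconds each); the function names are made up for illustration and are not part of the library.

```cpp
#include <cstdint>

// Beacon interval in microseconds: 100 TU, where 1 TU = 1024 us (102.4 ms).
constexpr int64_t kBeaconIntervalUs = 100 * 1024;

// Longest time a multicast packet can sit in the access point waiting
// for the next DTIM, for a given DTIM interval (1 or 3 in the text above).
constexpr int64_t worstCaseMulticastDelayUs(int dtimInterval) {
    return kBeaconIntervalUs * dtimInterval;
}

// How many DTIM opportunities there are per second.
constexpr double dtimPerSecond(int dtimInterval) {
    return 1e6 / static_cast<double>(worstCaseMulticastDelayUs(dtimInterval));
}
```

With DTIM Interval 1, a packet can wait up to 102.4 ms; with DTIM Interval 3, up to 307.2 ms, which is roughly 3.26 transmit opportunities per second.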

PTP simply was not designed for WiFi.

In order to make this work at all, I wrote the library to compare the incoming timestamp of the last few PTP packets we received to our own current time, and only use the value from the one that is newest in relation to our own time. All older packets are ignored, so we're eating through the jitter as much as possible, resulting in a PTP accuracy of a few milliseconds. The +/- 1ms figure in the project description was probably a bit optimistic in real life.
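The idea can be sketched roughly as follows. This is not the library's actual code, just a minimal illustration of a "least-delayed packet wins" filter; the type and member names (SyncSample, LeastDelayedFilter, bestOffsetMs) are hypothetical, and timestamps are assumed to be in milliseconds.

```cpp
#include <cstddef>
#include <cstdint>

// One received sync packet: the master's timestamp it carried and our
// local clock reading at the moment it arrived (both in milliseconds).
struct SyncSample {
    int64_t masterMs;
    int64_t localRxMs;
};

// Keeps the last N samples and reports the offset of the least-delayed one.
template <std::size_t N>
class LeastDelayedFilter {
public:
    void add(const SyncSample& s) {
        samples_[next_] = s;
        next_ = (next_ + 1) % N;
        if (count_ < N) ++count_;
    }

    bool empty() const { return count_ == 0; }

    // Network delay only ever makes localRxMs later, which makes
    // (masterMs - localRxMs) smaller. The largest offset in the window
    // therefore belongs to the packet that waited the least.
    // Call only when !empty().
    int64_t bestOffsetMs() const {
        int64_t best = samples_[0].masterMs - samples_[0].localRxMs;
        for (std::size_t i = 1; i < count_; ++i) {
            const int64_t off = samples_[i].masterMs - samples_[i].localRxMs;
            if (off > best) best = off;
        }
        return best;
    }

private:
    SyncSample samples_[N] = {};
    std::size_t next_ = 0;
    std::size_t count_ = 0;
};
```

Taking the maximum rather than the mean is the point: since the delay is one-sided, averaging would pull the estimate toward the delayed packets, while the extreme value tracks the packet that happened to arrive just before a DTIM.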

In practice it's plenty fast enough to visually synchronize lighting effects, which is what I use it for. But microsecond precision it is not.

What could be done to solve the problem? Well, if the WiFi access point were PTP aware, it could in theory make a note of the timestamp when it received the PTP packet, and rewrite the timestamp in the packet based on how long the packet lay waiting inside the access point's multicast packet buffer, at the point of actually transmitting the packet in the DTIM.

Without that, unfortunately, the precision simply isn't there, because we have no way of knowing how delayed the packet is. The usual PTP 2-step and Delay Request mechanisms cannot help one bit, because they cannot account for the variable jitter added by the WiFi access point.

The beacon itself actually contains a microsecond timestamp which could in theory be used. Unfortunately: 1) The ESP8266/ESP32 Arduino platform does not make this value available. 2) The timestamp isn't synchronized to real-world time; it simply counts the number of microseconds the access point has been on.

So, until there are PTP aware access points, I'm afraid we're SOL. Do you happen to know anyone at Ubiquiti? :)

Here is an interesting whitepaper if you want further information. https://www.usenix.org/system/files/atc21-chen.pdf Interestingly, they don't even mention the multicast packet issue. I suppose TSF solves that particular problem. Reading that whitepaper, they dove deeper into this issue than I ever want to. My WiFi ceiling lights can already make a rainbow and synchronize to music. I'm satisfied. :)

PortalHearts commented 1 year ago

Ahh shucks. I did also try the WiFi beacon, and its timestamp is the receiver's stamp, so it is pointless.

I do need +/- 80 microsecond accuracy for my application.

I will go old school long jumper wires to trigger interrupts if I have to.

I was contemplating BN-880 GPS module but I'm not sure if the hardware on that slaps the timestamp on each message. Have yet to purchase the module.

Also considering LoRa or ESP's Bluetooth but I haven't gone over those as of yet.

Your ceiling lights give me a good idea for possibly recreating the Rolls Royce Starlight roof; however, my present project is not about lights.

And I do not know anyone at Ubiquiti : )

I am also looking into TSF as per your citation.

Thanks for your response BTW.

PortalHearts commented 1 year ago

As per TSF paper.

https://github.com/espressif/esp-idf/blob/cf7e743a9b2e5fd2520be4ad047c8584188d54da/components/esp_wifi/include/esp_wifi.h#L1125 mentions the function esp_wifi_get_tsf_time.
(Note that esp_wifi_get_tsf_time takes ~50 microseconds per call, whereas esp_mesh_get_tsf_time takes ~1.5 microseconds.)

Can this be utilized inside the library?

Can the kernel driver be modified?

Also - https://learn.microsoft.com/en-us/windows/win32/iphlp/packet-timestamping - states timestamps are turned off by default and must be enabled. That is a page about Windows though. Not sure if this information extends to what is happening with ESP32.

leifclaesson commented 1 year ago

mentions the function esp_wifi_get_tsf_time.

Interesting! I just tried it.

The ESP8266 appears to be SOL, as the Arduino core uses the nonos-sdk, not the esp-idf. Thus, there is no esp_wifi_get_tsf_time function; in fact there isn't even an esp_wifi.h for it to reside in.

The ESP32, on the other hand, is built on the esp-idf. By including esp_wifi.h I was able to call esp_wifi_get_tsf_time and get a result!

tsf time: 528833052862
tsf time: 528834052484
tsf time: 528835052846
tsf time: 528836052824
tsf time: 528837052820
tsf time: 528838052343
tsf time: 528839052333
tsf time: 528840052827
tsf time: 528841052834
tsf time: 528842052338
tsf time: 528843052831
tsf time: 528844052825
tsf time: 528845052800
tsf time: 528846052798

This is the result of calling it once per second. So there is some hope there; the only question is what we could do with it. It's still not referenced to anything. If we knew when the beacons are received, it might be possible to make use of it: take an ESP32 with both LAN and WiFi connections, receive accurate PTP packets over LAN, and, knowing when the WiFi beacons arrive, send our own PTP packets on a different PTP domain, 50 milliseconds before each beacon and with a timestamp 50 milliseconds in the future, so that those packets go out pretty accurately. Unfortunately the TSF function won't help us do that, because look at this:

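        // Read the TSF twice, 2 ms apart, and print the first reading and the difference.
        int64_t time=esp_wifi_get_tsf_time((wifi_interface_t)WIFI_IF_STA);
        delay(2);
        int64_t time2=esp_wifi_get_tsf_time((wifi_interface_t)WIFI_IF_STA);
        // int64_t is signed, so %lld is the matching format specifier.
        csprintf("tsf time: %lld 2nd diff: %lld\n",time,time2-time);

Result: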

tsf time: 529197402065 2nd diff: 2205
tsf time: 529198401576 2nd diff: 1693
tsf time: 529199402055 2nd diff: 2208
tsf time: 529200402055 2nd diff: 2208
tsf time: 529201402071 2nd diff: 2190
tsf time: 529202402062 2nd diff: 2197

So, TSF is not just updated when the beacon is received, it's maintained locally. Could be useful for other purposes but probably not this one.

Also, even if we could do this more accurately... 80 microseconds!! Holy hell man, what are you trying to do? :)

Also, any GPS module is probably capable of spitting out a timestamp, that's part of the NMEA protocol. If they also have a 1 PPS pulse, you should be golden.
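A rough sketch of how the two pieces combine: the PPS edge says precisely when a second began, and the NMEA sentence says which second it was. This assumes the module reports the second that the most recent pulse started (this varies between modules, so check the datasheet), and the names PpsFix and utcNowUs are hypothetical.

```cpp
#include <cstdint>

// Captured on a PPS edge: our local microsecond counter at the edge,
// plus the UTC second (from NMEA) that this edge marked the start of.
struct PpsFix {
    int64_t ppsLocalUs;  // e.g. micros() latched in the PPS interrupt
    int64_t utcSeconds;  // whole seconds since the epoch, parsed from NMEA
};

// Current UTC time in microseconds: the labelled second plus the local
// time elapsed since the pulse that started it.
inline int64_t utcNowUs(const PpsFix& fix, int64_t localNowUs) {
    return fix.utcSeconds * 1000000 + (localNowUs - fix.ppsLocalUs);
}
```

The serial NMEA data can arrive tens of milliseconds late without hurting anything, because it only supplies the label; the precision comes entirely from when the pulse edge is captured.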

PortalHearts commented 1 year ago

Commenting on my previous update about mesh_tsf -> esp_mesh_get_tsf_time(). To confirm my suspicions, I ran esp_wifi_get_tsf_time() in a loop and turned off the router mid-loop: the printed time dropped to zero, signifying that it is pulled from the router. I then ran esp_mesh_get_tsf_time() in a loop and turned the router off mid-loop: the time kept printing, indicating it is not retrieved from the router. Therefore I don't know how reliable esp_mesh_get_tsf_time() is. I have not looked at the actual kernel code yet, so I don't know what these functions are doing under the hood. I do have sub-millisecond synchronization now, just not the 80 microseconds I want. Some success.

I know PCAP analyzers on certain systems show the hardware stamp such as in Windows.

Therefore I don't know if esp_wifi_set_promiscuous_rx_cb() contains any hardware stamps that can be cross referenced with tsf_time. I haven't investigated yet. Will at some point.

Update : I haven't cross referenced esp_wifi_set_promiscuous_rx_cb() to TSF but I did get the sync down to about 30 microseconds using TSF and TSF mesh. So for all intents and purposes my objective has been solved.
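The code behind that result was not posted, so the following is only a guess at how a shared TSF can be turned into synchronized action: every device associated with the same access point sees (nearly) the same TSF value, so each one can map an agreed TSF instant onto its own local clock. All names here (ClockPair, nextSlotTsfUs, tsfToLocalUs) are hypothetical.

```cpp
#include <cstdint>

// A TSF reading and the local clock reading taken at the same moment
// (for example esp_wifi_get_tsf_time() and micros() read back to back).
struct ClockPair {
    int64_t tsfUs;
    int64_t localUs;
};

// The next TSF instant that is a whole multiple of periodUs. Devices that
// share a TSF compute the same value without exchanging any messages.
inline int64_t nextSlotTsfUs(int64_t nowTsfUs, int64_t periodUs) {
    return (nowTsfUs / periodUs + 1) * periodUs;
}

// Convert a target TSF instant into this device's local clock.
inline int64_t tsfToLocalUs(const ClockPair& ref, int64_t targetTsfUs) {
    return ref.localUs + (targetTsfUs - ref.tsfUs);
}
```

Each device would then wait until its own local clock reaches the converted value before toggling a pin or taking a sample. The time spent in the TSF read itself (the ~50 microseconds versus ~1.5 microseconds mentioned earlier) goes straight into the error budget, which may be why the faster call matters.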

kubark42 commented 1 year ago

@PortalHearts fascinating! I, too, have a system where I would like to get sub-ms synchronization. In my use case, I'd like to sprinkle accelerometer sensors around a drivetrain to locate the cause of a troublesome vibration. The sensors need to be in sync with a common timing pulse, but running wires through firewalls is not advisable, so wireless would be a wonderful way to do it. The signals I'm looking at are no faster than 10ms, so a jitter of less than 1ms would be enough for my purposes.

From my understanding of the above, it sounds like you have solved this, possibly by using/modifying @leifclaesson's excellent library. Any chance you could provide the code you used to get 30us syncing? I'd like to give that a shot and see how consistent it is across four or five devices. Maybe blink an LED and measure the response with a logic analyzer.

PortalHearts commented 1 year ago

Use esp_wifi_get_tsf_time() or esp_mesh_get_tsf_time() after a WiFi.h connection. (My devices are connected to my WiFi router. An ESP is not acting as the WiFi hub, although I need to test that later. It was also tested on a laptop hotspot using 'netsh wlan hostednetwork' in Windows and works the same as with a router.)

LogicAnalyzer sounds good. You should try it. I may. What about wire splitting?

You do automotive engineering? Do you have a favorite digital twins environment/stack ?

kubark42 commented 1 year ago

Use esp_wifi_get_tsf_time() or esp_mesh_get_tsf_time() after a WiFi.h connection.

So literally

#include <WiFi.h>
#include <esp_mesh.h>

// Replace with your own network credentials
const char* ssid = "yourNetworkSSID";
const char* password = "yourNetworkPassword";

void setup(){

    Serial.begin(115200);

    WiFi.mode(WIFI_STA);
    WiFi.begin(ssid, password);
    Serial.print("Connecting to WiFi Network ");
    Serial.println(ssid);

    while(WiFi.status() != WL_CONNECTED){
        delay(100);
    }

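    // Read the TSF timer once and print it (the call returns an int64_t).
    Serial.printf("tsf time: %lld\n", (long long)esp_mesh_get_tsf_time());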
}

void loop(){
    // Do Nothing
}

should do it? That looks too easy by half.

I'm doing some aviation work. But regarding digital twins, check out Istari.

PortalHearts commented 1 year ago

Code looks right. However, if you want to wander way down to get something highly accurate, consider the repos wifi-ptp and linuxptp. It's not certain what esp_mesh_get_tsf_time() is doing under the hood. Apparently you are supposed to smooth out the time. If you need nanoseconds you will have to try those other repos. (I'm assuming nanoseconds with those repos, as the word 'nanosecond' is mentioned many times in the code.)

Istari looks interesting. I'll check it out. I'm into digital twins for industrial automated manufacturing.

kubark42 commented 1 year ago

Nanoseconds is way finer than I think can be done even with an effective GPS PPS. For a physical process with oscillations in the 30-80Hz range, 100us is two orders of magnitude shorter than my fastest period (12.5ms at 80Hz), which means a timing error of less than 1%. I feel like that's pretty good.
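As a quick check of that estimate, here is a small sketch that expresses a timing error as a fraction of one oscillation period and as a phase error. The function names are made up for illustration.

```cpp
// Timing error as a fraction of one period of the signal.
// Period = 1 / signalHz seconds, so fraction = error * frequency.
constexpr double timingErrorFraction(double errorUs, double signalHz) {
    return errorUs * 1e-6 * signalHz;
}

// The same error expressed as phase, in degrees.
constexpr double phaseErrorDegrees(double errorUs, double signalHz) {
    return 360.0 * timingErrorFraction(errorUs, signalHz);
}
```

For a 100us error, the fraction is 0.3% of a period at 30Hz and 0.8% at 80Hz, or about 2.9 degrees of phase in the worst case. The 30us figure reported earlier in the thread would come to roughly 0.24% at 80Hz.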

... industrial automated manufacturing

I'm totally hijacking @leifclaesson's issue here, but this is the bane of my existence right now. It's so frustrating to have CAM be an incredible barrier to doing subtractive manufacturing. Whether there will be a head crash or not is completely deterministic, and it's crazy that we're still using software like GibbsCAM to program our CNCs. AI is absolutely a killer app for making CNC machining as easy and safe as 3D printing.

PortalHearts commented 1 year ago

@kubark42 GibbsCAM hmmm. I need my digital twins to look and feel as if it were generated by an Unreal or Unity engine. Throw in Nvidia Omniverse integration perhaps. I need proper physics and sims for every single process on the assembly floor. I see many startups touting their products but not their manufacturing plants.