labstreaminglayer / App-LabRecorder

An application for streaming one or more LSL streams to disk in XDF file format.
MIT License

How does LSL log time_stamps #101

Closed CalebUAz closed 1 year ago

CalebUAz commented 1 year ago

Hi,

This is not exactly an issue, but I was just curious how LSL logs time_stamps.

I wrote a simple script:

from pylsl import StreamInfo, StreamOutlet, local_clock

# create a StreamInfo instance
info = StreamInfo('my_stream', 'EEG', 8, 100, 'float32', 'myuid34234')

# create a StreamOutlet
outlet = StreamOutlet(info)

# send some data with a timestamp
data = [1, 2, 3, 4, 5, 6, 7, 8]
timestamp = local_clock()
outlet.push_sample(data, timestamp)

I recorded the stream using LabRecorder and loaded the XDF file:

import pyxdf
data, header = pyxdf.load_xdf('~/sub-P001_ses-S001_task-Default_run-001_eeg.xdf')

When I print data I get:

[{'info': defaultdict(list,
              {'name': ['my_stream'],
               'type': ['EEG'],
               'channel_count': ['8'],
               'channel_format': ['float32'],
               'source_id': ['myuid34234'],
               'nominal_srate': ['100.0000000000000'],
               'version': ['1.100000000000000'],
               'created_at': ['2526133.227623875'],
               'uid': ['0-0-0-0-0'],
               'session_id': ['default'],
               'hostname': ['caleb'],
               'v4address': [None],
               'v4data_port': ['0'],
               'v4service_port': ['0'],
               'v6address': [None],
               'v6data_port': ['0'],
               'v6service_port': ['0'],
               'desc': [None],
               'stream_id': 1,
               'effective_srate': 100.00000223517424}),
  'footer': {'info': defaultdict(list,
               {'first_timestamp': ['2526681.080187375'],
                'last_timestamp': ['2526681.080187375'],
                'sample_count': ['0'],
...
                                            'value': ['-0.0001120420638471842']})]})]})},
  'time_series': array([[0.0000000e+00, 9.6266402e-41, 1.7154398e-34, 2.9218968e-30,
          9.6983867e-41, 6.8617591e-34, 4.6751889e-29, 9.7701332e-41]],
        dtype=float32),
  'time_stamps': array([2526681.08011613])}]

The time_stamps don't seem to be Unix time. Why is that?

cboulay commented 1 year ago

LSL uses std::chrono::steady_clock.
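
You can get a feel for this from Python by printing local_clock() next to a monotonic clock and the wall clock (a quick check only; whether local_clock() and time.monotonic() share a reference point is platform-dependent and not guaranteed):

import time
from pylsl import local_clock

print(f"local_clock():    {local_clock():.3f}")
print(f"time.monotonic(): {time.monotonic():.3f}")  # steady/monotonic clock
print(f"time.time():      {time.time():.3f}")       # Unix wall clock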

CalebUAz commented 1 year ago

@cboulay so how do I get the unix time?

dmedine commented 1 year ago

Are you using a Windows machine? The switch to steady_clock came long after I last did any significant work on LSL, and I don't know how it shakes out (it isn't in the documentation), but it used to be that at the bottom of local_clock was the number of seconds since the Epoch on Unix systems and since the last boot on Windows. Your timestamps correspond to about 29 days (2526681 s / 86400 s per day ≈ 29.2), which isn't an unreasonable time for a computer to go without a restart.

dmedine commented 1 year ago

It is worth noting that LSL doesn't really 'log' time and is not capable of synchronizing data to times of day. Timestamps are used to synchronize data streams, not to attach them to absolute moments in the calendar.

cboulay commented 1 year ago

I'm not sure why, but this is the third or fourth time in the past year that someone has expected to be able to synchronize an XDF file with a separately recorded file, assuming that both had Unix timestamps. I'm starting to think that LabRecorder, on systems where std::chrono::steady_clock::now().time_since_epoch() isn't referring to 1970-1-1, should add its own stream whose payload is an int64 containing Unix time in some fractional time unit (e.g. microseconds). This will have all the usual flaws that wall-clock time has (i.e., it is not guaranteed to be monotonic), which is why we wouldn't use it by default, but it's probably better than whatever else the user will cook up to try to "recover" these recordings (e.g., using file datetimes).
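
For illustration, a minimal sketch of what such a companion stream could look like from the outlet side (the stream name, the 1 Hz rate, and the microsecond unit are assumptions for the example, not anything LabRecorder currently does):

import time
from pylsl import StreamInfo, StreamOutlet, local_clock, cf_int64

# Hypothetical companion stream: one int64 channel carrying Unix time in
# microseconds, published once per second.
info = StreamInfo('WallClock', 'UnixTime', 1, 1.0, cf_int64, 'wallclock-demo')
outlet = StreamOutlet(info)

while True:
    unix_us = int(time.time() * 1_000_000)  # wall-clock time in microseconds
    # Stamp the sample with local_clock() so the usual offset machinery can
    # relate steady-clock time to wall-clock time after synchronization.
    outlet.push_sample([unix_us], local_clock())
    time.sleep(1.0)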

Edit: I also think that xdf importers should automatically subtract the lowest common timestamp (post-synchronization) from all streams' timestamps. If the numbers start at t=0 then I think there will be a lot less confusion about this.

cboulay commented 1 year ago

Sorry, @CalebUAz , I shouldn't assume what you're trying to do. What is your use case for wanting LSL data with Unix timestamps?

cboulay commented 1 year ago

The official docs also have a Time Synchronization document, but it has been orphaned. I just made a small update to its introduction; however, it's still orphaned and doesn't appear in the readthedocs rendering. Maybe it will help you or future readers of this thread.

@CalebUAz , I'm happy to continue discussing, but I think your question is answered. If you agree then please close this issue.

tstenner commented 1 year ago

Seconds since the unix epoch can be non-monotonic with NTP time synchronization in the background.

A secondary stream with unix timestamps could be a building block for synchronizing different files, as long as jumps in the difference between the steady_clock and the secondary clock are detected. Small suggestion: decide on a unit and send uint64 timestamps. The precision of a double holding Unix time gets worse with each passing year (as the integer part grows, fewer mantissa bits are left for the fraction), and ints are computationally cheaper.
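
A rough sketch of that jump detection, assuming the secondary clock is the wall clock and using an arbitrary 5 ms threshold:

import time
from pylsl import local_clock

JUMP_THRESHOLD = 0.005  # seconds; arbitrary threshold for this illustration

# Track the difference between the wall clock and LSL's steady clock;
# an NTP step shows up as a sudden change in this difference.
last_diff = time.time() - local_clock()
while True:
    time.sleep(1.0)
    diff = time.time() - local_clock()
    if abs(diff - last_diff) > JUMP_THRESHOLD:
        print(f"clock jump detected: {diff - last_diff:+.6f} s")
    last_diff = diff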

cboulay commented 1 year ago

I added a specific section on manually synchronizing to external clocks to the Time Synchronization docs here.
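
The gist of synchronizing to an external clock is measuring the offset between that clock and local_clock(); a minimal sketch, assuming a single offset measurement is accurate enough for your purposes (repeat and average the measurement if not):

import time
from pylsl import local_clock

# Wall clock minus LSL clock, measured once at startup.
offset = time.time() - local_clock()

def to_unix(lsl_timestamp: float) -> float:
    """Map an LSL timestamp into approximate Unix time."""
    return lsl_timestamp + offset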

dmedine commented 1 year ago

I just compiled this code on Windows 10 (modified from the cpp reference page for chrono):

#include <chrono>
#include <iostream>

int main(void)
{
    const auto p1 = std::chrono::system_clock::now();
    const auto p2 = std::chrono::steady_clock::now();

    std::cout << "system_clock seconds since epoch: "
        << std::chrono::duration_cast<std::chrono::seconds>(
            p1.time_since_epoch()).count()
        << '\n';

    std::cout << "steady_clock seconds since epoch: "
        << std::chrono::duration_cast<std::chrono::seconds>(
            p2.time_since_epoch()).count()
        << '\n';

    return 0;
}

Output:

system_clock seconds since epoch: 1690157550
steady_clock seconds since epoch: 434342

So on Windows, at least, steady_clock gives the time since reboot and system_clock gives Unix time. I am not so sure that the NTP adjustments to system_clock are necessarily showstoppers. In practice nearly all streams have negative timestamp differences prior to dejittering anyway, due to chunking and hardware buffering. It is annoying that on Windows time_since_epoch doesn't actually measure time since the epoch.

Also, from the docs (https://en.cppreference.com/w/cpp/chrono/steady_clock): "This clock is not related to wall clock time (for example, it can be time since last reboot), and is most suitable for measuring intervals." Maybe this warrants a config option to choose. C++20 also adds three more clocks.

dmedine commented 1 year ago

I posted something about this on Stack Overflow and it got the attention of Howard Hinnant (the author of std::chrono). Scroll down to the comments in the answer for his insights (https://stackoverflow.com/questions/76750887/why-does-steady-clocknow-time-since-epoch-count-return-time-since-reboot?noredirect=1#comment135310469_76750887). This makes me even more inclined to suggest a config file option for specifying the LSL base clock implementation. The default can stay on steady_clock so only power users need know about it.

Both clocks have their pros and cons, depending on the needs of the user and the circumstances. Seems like a good enough reason to make it optional. It wouldn't require much code.

tstenner commented 1 year ago

The naive approach (api_config::get_instance()->clock_type == lsl_clock_type::steady ? std::chrono::steady_clock::now() : std::chrono::system_clock::now()) adds an atomic read, a pointer indirection, a branch, and probably a function call. That's a lot of overhead for an option that should be used rarely and with extreme caution.

CalebUAz commented 1 year ago

@cboulay @tstenner If my understanding is correct, every time Aurora streams fNIRS data and LabRecorder packs it into an XDF file, each sample's timestamp comes from std::chrono::steady_clock? The datatype is double, but does Python do any truncation or zero-padding of any sort?

Also, for LSL, does MacOS use a different C++ standard library than Linux?

cboulay commented 1 year ago

every time Aurora stream fNIRS data and when LabRecorder packs it into an XDF file, for every sample it uses std::chrono::steady_clock as the timestamp?

It's hard to know without looking at the source code for the Aurora application. Ideally, yes, that is correct. You can test this, more or less, by writing a small application that creates an inlet for your fNIRS stream (no postprocessing) and prints the pulled timestamps. In the same application, print the output from local_clock(). If the two values are very close to each other, that indicates the Aurora LSL integration is using the correct clock for its timestamps.
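
A minimal sketch of that test (the 'NIRS' stream type is an assumption here; use whatever type or name Aurora actually advertises):

from pylsl import StreamInlet, local_clock, proc_none, resolve_stream

streams = resolve_stream('type', 'NIRS')
inlet = StreamInlet(streams[0], processing_flags=proc_none)  # no postprocessing

while True:
    sample, timestamp = inlet.pull_sample()
    # If both sides use the same clock, these values should be very close.
    print(f"stream timestamp: {timestamp:.6f}  local_clock(): {local_clock():.6f}")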

The datatype is double but does Python do any truncation or append any zeros of any sort?

Are you asking about loading the XDF or on the inlet? In either case, no, I don't think Python modifies the timestamp.

Also, for LSL does MacOS use a different C++ library compared to Linux?

Building liblsl on Mac vs Linux uses different compilers (clang vs gcc) and different standard libraries (libc++ vs libstdc++, respectively).

dmedine commented 1 year ago

Aurora is written in Python. I would be surprised if they are not using a pretty contemporary version of pylsl, and I would also be surprised if they are getting the timestamps wrong. Their development team is quite good. You can also write to their technical support and ask them to check if you have any specific questions.

Python marshals the timestamp values as double precision. I assume the XDF loaders all do the same (certainly the Matlab loader does). Of course, you will never have synchronization accuracy better than 1 ms anyway, so 32-bit precision is all you need.
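
For what it's worth, you can check the double-precision spacing at these magnitudes directly (math.ulp gives the gap between adjacent doubles):

import math

# Near a present-day Unix timestamp, doubles resolve about 0.24 microseconds;
print(math.ulp(1.7e9))  # ~2.4e-07
# near a typical uptime-style timestamp, the spacing is far finer still.
print(math.ulp(1e5))    # ~1.5e-11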

CalebUAz commented 1 year ago

@cboulay I did a test where I used LabRecorder and another one where I used pylsl to read streams:

from pylsl import StreamInlet, resolve_stream

streams = resolve_stream('type','Gaze')
inlet = StreamInlet(streams[0])
sample, timestamp = inlet.pull_sample()

timestamp gave 85358.950462127, which looks like std::chrono::steady_clock output.

I also read the xdf file:

  'footer': {'info': defaultdict(list,
               {'first_timestamp': ['3445.228907616'],
                'last_timestamp': ['3450.329403616'],
                'sample_count': ['1245'],
                'clock_offsets': [defaultdict(list,
                             {'offset': [defaultdict(list,
                                           {'time': ['3450.972303390503'],
                                            'value': ['1690311363.590879']})]})]})},
  'time_series': array([[ 0.        ,  0.05039496,  0.00952302, ...,  0.        ,
                  nan, -0.0635128 ],
         [ 0.        ,  1.57713605,  0.89205641, ...,         nan,
          -0.0635128 ,         nan],
         [ 0.        ,  0.05039496,  0.00952302, ...,  0.        ,
                  nan, -0.0635128 ],
         ...,
         [ 0.        ,  1.57713605,  0.89205641, ...,         nan,
          -0.0635128 ,         nan],
         [ 0.        ,  0.05039496,  0.00952302, ...,  0.        ,
                  nan, -0.0635128 ],
         [ 0.        ,  1.57713605,  0.89205641, ...,         nan,
          -0.0635128 ,         nan]]),
  'time_stamps': array([1.69031481e+09, 1.69031481e+09, 1.69031481e+09, ...,
         1.69031481e+09, 1.69031481e+09, 1.69031481e+09])}]

Why do the XDF time_stamps contain Unix time while pylsl gives me steady_clock values?

cboulay commented 1 year ago

If the LabRecorder machine was different from the Gaze stream origin's machine, the timestamps will be in different base clocks. The XDF importers automatically convert all timestamps into the same base clock -- that of the LabRecorder machine. pylsl gives you the timestamp in the origin's clock unless you specifically set the inlet's postprocessing flags.
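
For reference, turning on that postprocessing looks like this (a sketch; which flags to set depends on the application):

from pylsl import StreamInlet, proc_clocksync, proc_dejitter, resolve_stream

streams = resolve_stream('type', 'Gaze')
# proc_clocksync remaps inbound timestamps into this machine's clock, much
# like the XDF importers do at load time; proc_dejitter smooths them.
inlet = StreamInlet(streams[0], processing_flags=proc_clocksync | proc_dejitter)
sample, timestamp = inlet.pull_sample()  # timestamp now in the local clock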

CalebUAz commented 1 year ago

@cboulay I performed the same test independently on MacOS and Linux: on each machine I simulated a stream and recorded it with LabRecorder running on that same machine.

Script:

from pylsl import StreamInfo, StreamOutlet, local_clock

# create a StreamInfo instance
info = StreamInfo('my_stream', 'EEG', 8, 100, 'float32', 'myuid34234')

# create a StreamOutlet
outlet = StreamOutlet(info)
data = [1, 2, 3, 4, 5, 6, 7, 8]
outlet.push_sample(data)

Behavior on MacOS is different from Linux:

MacOS:

'footer': {'info': defaultdict(list,
               {'first_timestamp': ['442298.823534041'],
                'last_timestamp': ['442304.967350125'],
                'sample_count': ['4'],
...
         [1., 2., 3., 4., 5., 6., 7., 8.],
         [1., 2., 3., 4., 5., 6., 7., 8.],
         [1., 2., 3., 4., 5., 6., 7., 8.]], dtype=float32),
  'time_stamps': array([442298.80023438, 442300.26401219, 442301.72779001, 442303.19156783,
         442304.65534565])}]

Linux:

'footer': {'info': defaultdict(list,
               {'first_timestamp': ['11214023.11861275'],
                'last_timestamp': ['11214031.47040489'],
                'sample_count': ['3'],
                'clock_offsets': [defaultdict(list,
                             {'offset': [defaultdict(list,
                                           {'time': ['11214025.08733511'],
                                            'value': ['1679185248.927651']}),
                               defaultdict(list,
                                           {'time': ['11214030.08742452'],
                                            'value': ['1679185248.92765']}),
                               defaultdict(list,
                                           {'time': ['11214035.08753133'],
                                            'value': ['1679185248.92765']})]})]})},
  'time_series': array([[1., 2., 3., 4., 5., 6., 7., 8.],
         [1., 2., 3., 4., 5., 6., 7., 8.],
         [1., 2., 3., 4., 5., 6., 7., 8.],
         [1., 2., 3., 4., 5., 6., 7., 8.]], dtype=float32),
  'time_stamps': array([1.69039926e+09, 1.69039927e+09, 1.69039928e+09, 1.69039928e+09])}]

Why do they have different time_stamps (the MacOS version has monotonic uptime-style timestamps while the Linux version has Unix timestamps)?

cboulay commented 1 year ago

They're both monotonic. On Linux, the reference point is the Unix epoch. On MacOS, if the compiler was clang, the reference point is the last reboot (I think).

So how does Linux steady_clock use UNIX epoch as its reference while still being monotonic? To be honest, I don't know. I'm sure we could find out. Anyway, in practice, I think this means that if there's an NTP update that would cause the clock to go backwards, steady_clock keeps returning the same value over and over until enough time has elapsed to pass the previous high-point before the NTP update.

Anyway, I should ask again: Why do you need UNIX time? Most of the time the absolute reference doesn't matter. You should only need to care about the streams relative to each other, not relative to the wall clock.

CalebUAz commented 1 year ago

Thanks for clarifying @cboulay

Anyway, I should ask again: Why do you need UNIX time? Most of the time the absolute reference doesn't matter. You should only need to care about the streams relative to each other, not relative to the wall clock.

Well, our hyperscanning experiments use iMacs to stream Gaze, a Windows PC to stream EEG, and a Windows laptop to stream fNIRS data. All these physio streams are streamed using LSL and saved on a Linux server that runs LabRecorder.

We were worried that LSL or LabRecorder was doing some sort of conversion of the timestamps.

dmedine commented 1 year ago

LabRecorder only records. It records both the timestamped data and the clock-time differences (using whatever implementation of steady_clock exists on the computer you are streaming from to fetch the timestamps) that are intended to synchronize streams across PCs at load time. When you load the data, the loading programs (load_xdf.m, for example) use the native timestamps and the clock offsets to synchronize. This is all explained in greater detail in the documentation: https://labstreaminglayer.readthedocs.io/info/time_synchronization.html
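
In pyxdf, that load-time synchronization is controlled by keyword arguments; both default to True and are spelled out here only for illustration (the file name is a placeholder):

import pyxdf

data, header = pyxdf.load_xdf(
    'recording.xdf',
    synchronize_clocks=True,   # apply the recorded clock offsets across streams
    dejitter_timestamps=True,  # smooth timestamps of regularly sampled streams
)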

chkothe commented 1 year ago

A few additional comments on those points. As Chad and David already indicated, one can't simply use the system time to synchronize multiple streams, for a couple of reasons. One is that the system time may be off on a given machine (some OSes may update it at some random point during the day, or not at all, e.g. if the ports used to call out to an NTP server are blocked), or may not even be in the right time zone (as happens, e.g., when switching between dual-boot OSes). But even when it is being actively synched, one can't assume it to be accurate down to the millisecond, since the nearest NTP server is likely several hops away somewhere on the internet. And even if it were accurate, that system time would run slower or faster, or (for some clocks) even jump backwards, as NTP sync events (and leap seconds, etc.) happen. And even if all that were working, on some OSes that clock simply doesn't have sufficient resolution for high-sampling-rate streams (it used to be 10 ms on Windows back when LSL was created, and probably still is today). So if you want to synchronize time accurately across machines, you need to exchange packets between them to determine the relative clock offset, and you need to keep repeating that throughout the recording, since those clocks are going to drift, as all clocks do, and they drift by enough to matter.
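
For intuition, one such packet exchange works like an NTP measurement: timestamp the round trip locally, let the remote side report its own clock, and assume the network delay is symmetric. A toy sketch with hypothetical send_ping/wait_for_reply stand-ins (LSL's actual protocol differs in its details):

from pylsl import local_clock

def measure_offset(send_ping, wait_for_reply):
    # One NTP-style offset measurement against a remote clock.
    t0 = local_clock()           # local time when the request leaves
    send_ping()                  # hypothetical transmit function
    t_remote = wait_for_reply()  # remote clock's reading (hypothetical)
    t3 = local_clock()           # local time when the reply arrives
    # Assuming symmetric delay, the remote reading corresponds to the
    # midpoint of the round trip.
    return t_remote - (t0 + t3) / 2.0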

So since you are going to measure clock offsets anyway, you are free to use any clock you like, as long as it is well-behaved (i.e., has sufficient resolution, is monotonic, and doesn't change rate mid-recording due to external events). Fortunately, PCs (and other devices) have long had performance counters on the CPU that are high resolution (megahertz rates), stable, and monotonic, and, to answer your other question, fast enough to read out without a noticeable performance impact. There are multiple such counters; they tend to natively measure the uptime of the machine, and they are exposed via various APIs on different OSes. Some OSes (I guess Linux) might also do some surgery and combine those timers with a normal clock to get something that is high resolution and approximately synchronized with the wall clock (but again, the more surgery there is, the higher the chance that the clock will occasionally do something subtly funky, perhaps in the middle of a recording). To sum up, as far as LSL is concerned, it doesn't really matter which of these counters a given OS exposes, since the offset between machines is measured by LSL anyway. To be fair, though, the numerics are a bit simpler if the readouts of different machines don't differ by over a billion seconds, as they would if one machine reported UNIX time and another reported uptime. For that reason, in an ideal world LSL would report the same one of the two on all platforms, and of those, uptime is the less funky in terms of what exactly happens when an NTP sync occurs, and the one that should be available on all platforms. That's why LSL originally tried to use uptime everywhere, although I think over the years and across OSes UNIX time was used here and there.

So that's synchronization. Perhaps the other question implied by the original post is: where do I find a record of when my data file was collected? I think that could be addressed with another XDF header field that stores it, e.g., as an ISO-formatted datetime (it could go in the file header and be written there by LabRecorder).