sccn / labstreaminglayer

LabStreamingLayer super repository comprising submodules for LSL and associated apps.

Streams with manual timestamps sent by LSL to Labrecorder are incorrectly loaded from .xdf #99

Closed varjak closed 2 years ago

varjak commented 2 years ago

I have a python script that sends two streams (signal + markers) through LSL with manual timestamps to Labrecorder, all on one (Windows) computer. When I load the .xdf file in MATLAB (with load_xdf.m), the signal timestamps sometimes change and the signal becomes shifted relative to the markers.

I found that if I call load_xdf('file.xdf', 'HandleClockSynchronization', 0), the timestamps are loaded without change and the signal + markers are aligned. Should I just do this, or are there any disadvantages to skipping the load_xdf clock synchronization step?

From this issue OpenBCI/OpenBCI_GUI#775, I got that I could also:

dmedine commented 2 years ago

Manually timestamping your data streams will have this effect. This is because the XDF file records clock offsets between the recording PC's timestamps and the streaming PC's timestamps, which are different from your manual timestamps. The 'HandleClockSynchronization' step computes the offset between those PC timestamps and maps the recorded (in your case manually imposed) timestamps onto the recording PC's timeline.
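To make the mapping concrete, here is a rough conceptual sketch (hypothetical names, not load_xdf's actual code, which fits the offsets over time rather than using a single constant):

```python
# LabRecorder periodically stores, per stream:
#   offset ≈ t_recording_pc - t_streaming_pc   (both read via LSL local_clock)
def map_to_recorder_timeline(recorded_timestamps, offset):
    # Correct only if the recorded timestamps are on the streaming PC's LSL
    # clock. Manual timestamps taken from another clock (e.g. time.time())
    # get shifted by a meaningless amount instead.
    return [t + offset for t in recorded_timestamps]
```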

Without knowing the details of your setup, I cannot say whether or not skipping this step poses any disadvantage, but in general it is not good practice to manually timestamp data.

The main application is for manually adding timestamps to individual samples in chunks of data. I believe this touches on your second bullet point. For example, if a device hands data over in chunks of 10 samples, you would take the timestamp (which you would get with local_clock) for when each chunk of data arrives from the device and then manually timestamp each sample with chunk_timestamp + sample_number / sampling_rate (see this example: https://github.com/labstreaminglayer/liblsl-Python/blob/a43ff466383805027298642065580cd18b47ee9b/pylsl/examples/SendDataAdvanced.py#L63-L77). However, in this case each sample's timestamp will be very near the streaming PC's timeline (it is already based on it), so you wouldn't have this inaccurate clock offset issue.
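A minimal pylsl sketch in the spirit of the linked example, assuming a hypothetical `on_chunk` callback and an assumed 500 Hz device; samples are back-dated from the chunk's arrival time:

```python
from pylsl import StreamInfo, StreamOutlet, local_clock

SRATE = 500.0  # assumed nominal device rate
info = StreamInfo('MyDevice', 'EEG', 8, SRATE, 'float32', 'mydevice-uid')
outlet = StreamOutlet(info)

def on_chunk(chunk):
    """Hypothetical callback invoked whenever the device delivers a chunk
    (a list of samples, each a list of 8 floats)."""
    arrival = local_clock()  # when the chunk reached this PC
    n = len(chunk)
    for i, sample in enumerate(chunk):
        # Back-date earlier samples assuming a steady sampling interval;
        # the last sample in the chunk gets the arrival timestamp.
        outlet.push_sample(sample, arrival - (n - 1 - i) / SRATE)
```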

cboulay commented 2 years ago

I'll end up restating a lot of what David said, but maybe add some different pieces of information. Also worth noting before you dive in: we're getting into minutiae. Using automatic timestamps will satisfy your timing needs 99% of the time.

For HandleClockSynchronization to work, all timestamps must be in the lsl clock. This is guaranteed when using automatic timestamping.

Edit: I may have mixed up the signs in calculating the offset below. This code is non-functional; conceptual only.

I believe there are 2 criteria that need to be met for you to use manual timestamping: (1) the data source provides timestamps along with its sample data, and (2) the data source provides a way for you to know the offset between its own clock and the LSL clock (e.g. offset = lsl_local_clock() - device->clock_now().to_seconds()). If these 2 criteria are met, then you can indeed get better timestamps with manual timestamping (timestamp = sample_timestamp_dev_clock_secs + offset) than automatic timestamping. In all other cases, you should use automatic timestamping.

There are some alternatives to criterion 2. For example, if the device accepts callbacks, and there is a very consistent delay between the time the last sample in a buffered chunk is filled and the time the callback for that chunk is executed, then you could assume offset = lsl_local_clock() - last_sample_timestamp - constant_delay. You could calculate constant_delay once using hardware triggers and then hard-code its value into your application.
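A sketch of that idea, following the (sign-caveated) formulas above and assuming a hypothetical `device` object that satisfies criteria 1 and 2, plus the `outlet` from the earlier sketch:

```python
from pylsl import local_clock

# Hypothetical device API; substitute your hardware SDK's equivalents.
# Criterion 2: compare the device clock with the LSL clock right now.
offset = local_clock() - device.clock_now_seconds()
# (Alternative: offset = local_clock() - last_sample_timestamp - CONSTANT_DELAY,
#  with CONSTANT_DELAY measured once via hardware triggers and hard-coded.)

for sample, ts_dev in device.read_samples():   # ts_dev is on the device's clock
    # Criterion 1 + 2: map the device timestamp onto the LSL clock and push it.
    outlet.push_sample(sample, ts_dev + offset)
```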

There's another scenario where manual timestamping might be warranted: when you're sending out event strings but you know there is a delay between your call to LSL and when the event actually manifests (e.g., an object appears on screen). Here you can do push_sample(data, local_clock() + known_delay_secs). Again, it's something you can measure once with hardware and then hardcode.
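For instance, a marker outlet sketch with an assumed, hardware-measured display delay (all names and the delay value are placeholders):

```python
from pylsl import StreamInfo, StreamOutlet, local_clock

KNOWN_DELAY_SECS = 0.017  # hypothetical value, measured once with a photodiode/hardware trigger

marker_info = StreamInfo('Markers', 'Markers', 1, 0, 'string', 'marker-uid')
marker_outlet = StreamOutlet(marker_info)

# Stamp the marker with the time the stimulus will actually appear on screen.
marker_outlet.push_sample(['stimulus_on'], local_clock() + KNOWN_DELAY_SECS)
```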

Compensating for these constant delays is only necessary if you want to compare your evoked potentials' latencies to textbook latencies ("Why is my P300 at 320 msec?"), assuming the textbook values came from a time when everything was hardware synchronized. It is completely irrelevant for BCIs or statistics as long as they can use the full data.

Whether you need to provide timestamps for extra samples in a chunk is a different question. If your device is operating at a known sample rate and the sample interval is reliable, then manually timestamping the extra samples in a chunk is largely unnecessary, because LSL timestamps these for you by assigning the provided timestamp to the last sample in the chunk and working backwards, assuming the correct interval between samples in that chunk. You also save a very tiny bit of bandwidth this way, because instead of transmitting a 64-bit timestamp per sample it simply transmits a flag to use the inferred stamps.

Conversely, if your device does not have a consistent sampling rate and you do not expect consistent intervals between samples in a chunk, then yes, you should provide manual timestamps for each sample in a chunk. However, if you find yourself doing this then I'd recommend using push_sample instead of push_chunk, because it's conceptually much less complex for a relatively small performance hit.
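As a rough illustration of the two cases (reusing the `outlet` from the sketch above; `chunk` and `per_sample_timestamps` are assumed variables):

```python
from pylsl import local_clock

# Regular, reliable rate: one timestamp for the whole chunk. The last sample
# gets this stamp and earlier samples are back-dated at the nominal rate.
outlet.push_chunk(chunk, local_clock())

# Irregular intervals within a chunk: stamp every sample yourself; push_sample
# is conceptually simpler than a per-sample-stamped push_chunk.
for sample, ts in zip(chunk, per_sample_timestamps):
    outlet.push_sample(sample, ts)
```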

varjak commented 2 years ago

Thank you for your fast replies. I would like to clarify my setup:

BrainVision Recorder -(RDA)-> Python script -(LSL)-> Labrecorder -> .xdf

Recorder sends data through RDA chunk by chunk. My python script receives a chunk, extracts a matrix of values (points*channels) and an array of markers, adds manual timestamps to both, and pushes the matrix to a signal LSL stream and the array to a marker LSL stream.

Now, I would like to know how I push this data properly to LSL.

Based on your comments, here is what I tried:

  1. push_chunk(mat)
  2. push_chunk(mat, local_clock())
  3. push_chunk(mat, stamps[-1]) % push the last timestamp manually calculated for the matrix
  4. push_sample(vec) % loop the matrix in the "point" direction and push a list of values, one for each channel
  5. push_sample(vec, stamp)

For each, I have loaded the .xdf with and without 'HandleClockSynchronization'.

[screenshot: LSL-1]

As you see:

(as we discussed; still on this note, I do not understand why, with the clock sync, the signal and markers become so far apart. They seem to be ~69 minutes apart; the first manual timestamps go from python `time.time() - points*period` to `time.time()`, and the rest are added in period increments; each marker just gets the first timestamp of its corresponding chunk; so I don't understand the shift)

To further test the push_sample, I introduced a delay in my script between chunks. I was expecting the automatic timestamp pushes to be interrupted, but I am realizing now that maybe they are not, because the signal stream knows the expected sampling rate; on the other hand, the marker stream appears to be shifted, perhaps because it has an irregular sampling rate.

[screenshot: LSL-2]

So, since my python script can introduce delays, the best choice seems to be to push manual timestamps and not load the .xdf with clock synchronization. Since this is not the generally recommended way, please tell me if there is a better way to push chunks of signal + markers, or if I am misinterpreting something.

Thank you

cboulay commented 2 years ago

BrainVision software has its own LSL streaming capability. Use that. Don't pull the EEG from their device then restream it.

If your Python script requires the EEG data to create its markers, then do this (a minimal sketch follows the list):

  1. Set up an inlet to pull the BrainVision stream, and set the inlet's postprocessing flag to at least proc_clocksync (you might as well set proc_all).
  2. When you do bva_inlet.pull_chunk(), the timestamps you get back will not be in the sending computer's time base, but they will be in the receiving computer's lsl_clock timebase.
  3. If you want your marker to correspond to a particular sample, you are free to use that sample's timestamp with marker_outlet.push_sample(marker_string, sample_timestamp) which is now in the correct time base.
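Putting those three steps together, a minimal pylsl sketch (the stream type query, marker stream metadata, and the `looks_like_an_event` check are assumptions; adapt them to your setup):

```python
from pylsl import (StreamInfo, StreamInlet, StreamOutlet,
                   resolve_byprop, proc_clocksync, proc_dejitter)

# 1. Inlet for the BrainVision EEG stream, with clock sync applied on pull.
eeg_stream = resolve_byprop('type', 'EEG', timeout=10)[0]
bva_inlet = StreamInlet(eeg_stream, processing_flags=proc_clocksync | proc_dejitter)

# Marker outlet: irregular rate (0), one string channel.
marker_info = StreamInfo('ProcessedMarkers', 'Markers', 1, 0, 'string', 'proc-markers-uid')
marker_outlet = StreamOutlet(marker_info)

while True:
    # 2. Timestamps come back already mapped onto this PC's LSL clock.
    chunk, timestamps = bva_inlet.pull_chunk(timeout=1.0)
    for sample, ts in zip(chunk, timestamps):
        if looks_like_an_event(sample):      # hypothetical detection function
            # 3. Reuse that sample's timestamp for the marker.
            marker_outlet.push_sample(['event'], ts)
```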
varjak commented 2 years ago

> BrainVision software has its own LSL streaming capability. Use that. Don't pull the EEG from their device then restream it.

What do you mean by this? Recorder only streams via RDA. Do you mean this RDA-to-LSL connector: https://github.com/brain-products/LSL-BrainVisionRDA? So, say we want to do some online processing in our python script (adding markers is actually not our case, since the ones we need already come from Recorder, but e.g. filtering the signal); do you recommend using that connector and setting up an inlet with that flag, so that data is pulled and timestamped in the receiving LSL clock and can later be safely pushed?

We haven't tried the connector because, although I only mentioned Recorder, we mainly want to get data out of RecView: BrainVision Recorder -(RDA)-> RecView -(RDA)-> Python script -(LSL)-> Labrecorder -> .xdf

And since the connector is meant for Recorder, we chose the "safer" option of using the python script to receive RDA.

dmedine commented 2 years ago

I think that Chad was referring to https://github.com/brain-products/LSL-BrainVisionRDA. Obviously this won't do any online processing for you. The RDA connector can connect to RecView in theory (some coding required); at one time this option was available, but it was removed because RecView can send all kinds of crazy data types, like segmented arrays of complex numbers, and the complexity of handling all conceivable scenarios in the RDA connector is pretty huge. If you only want to stream plain old EEG data, this isn't too big of a hack.

agricolab commented 2 years ago

> Since this is not the generally recommended way, please tell me if there is a better way to push chunks of signal + markers, or if I am misinterpreting something.

Regarding your experiment script: are you pushing markers and data in separate threads or sequentially? Waiting for a chunk to be delivered and then sending the marker introduces a delay due to the waiting (or vice versa).

Regarding your use case / pipeline, I do not yet understand why you require the resending. Can't you just record with Labrecorder from BrainVision using RDA2LSL:

BrainVision Recorder -(RDA)-> Labrecorder -> .xdf

And if it is really necessary to record your online-processed results (instead of calculating them offline again): BrainVision Recorder -(RDA)-> Python script -(online-stuff-only LSL)-> Labrecorder -> .xdf

To me this issue looks as if it primarily stems from your processing and streaming scripts. I could potentially give better advice if you share your scripts.

Did I get something wrong?