sccn / lsl_archived

Multi-modal time-synched data transmission over local network

max_buffered is inconsistent #321

Open cboulay opened 5 years ago

cboulay commented 5 years ago

outlet: https://github.com/sccn/labstreaminglayer/blob/master/LSL/liblsl/include/lsl_c.h#L468-L474
inlet: https://github.com/sccn/labstreaminglayer/blob/master/LSL/liblsl/include/lsl_c.h#L614-L628

When creating an outlet or inlet, the user can specify the maximum amount of data that can be buffered, in seconds. The data type for this argument is int32_t.

In the untyped wrappers (Python, Matlab), users can pass floats, and these get typecast and rounded, creating problems. We can make these wrappers check that the argument is an int > 0, but this intersects with a separate issue.

In some use cases, especially when a stream is carrying a processed signal, only the latest sample is desired and anything older should be dropped. In these cases it is convenient to have a buffer of only one sample; otherwise the user is forced to fetch until there are no samples remaining and then use whatever the last sample was. This has come up several times recently.
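For concreteness, here is a minimal sketch of that workaround in pylsl (the stream name 'Test' and the resolve timeout are illustrative assumptions, not part of any app):

import pylsl

# resolve the stream and open an inlet (name and timeout are placeholders)
streams = pylsl.resolve_byprop('name', 'Test', timeout=5.0)
inlet = pylsl.StreamInlet(streams[0])

# drain the buffer and keep only the newest sample
latest_sample, latest_ts = None, None
while True:
    sample, ts = inlet.pull_sample(timeout=0.0)  # non-blocking pull
    if ts is None:  # nothing left in the buffer (timestamp == 0 in the C API)
        break
    latest_sample, latest_ts = sample, ts
# latest_sample now holds the most recent value, or None if nothing has arrived yet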

The docstring for data_receiver here (https://github.com/sccn/labstreaminglayer/blob/master/LSL/liblsl/src/data_receiver.cpp#L20-L21) suggests that max_buflen is a number of seconds, as above, and indeed the seconds-version is passed directly to the data_receiver constructor during inlet creation, but if you follow how it's used it ends up being a number of samples. (It gets written to a string, then read back from the string by tcp_server as a number of samples.)

This is probably a bug. Can other devs please check this to make sure I didn't miss something? @tstenner @mgrivich

Is there a minimum number of samples that must be buffered? I think it is 1, but some of the checks I see only test whether it is >= 0... and 0 makes the stream useless because there's no buffer.

Should max_buflen/max_buffered be changed to a float type so < 1 second can be used? Or, probably better, can it be made to represent a number of samples instead of seconds?

In either case, this will require most apps to be changed.

dmedine commented 5 years ago

I don't have time right now to modify the source code and check what the minimum number of samples in the buffer can be---probably 1 is OK, but what might happen is that the connection breaks off and has to reconnect. I don't know ASIO well enough to make an informed guess, though. Regardless, this buffer must be greater than 0.

It would be a drag to have to know the sampling rate for this parameter, so I think changing it to samples is not the best option. Perhaps the correct thing to do is to overload the constructor so that it could be specified as an int or a float. That way if people really wanted to, they could get the less-than-1-second buffers that they want. There should at least be a warning if <=0 is specified, because such an inlet will never work.

My two cents, though, is not to bother changing this at all. LSL always pushes data in as timely a manner as it can. This parameter doesn't affect throughput or performance. So, if samples are getting backlogged, something that isn't LSL's 'fault' is blocking transmission. In this case your options are either to have old data or no data; I would always opt for old. The workaround is programming a timeout capability into the data acquisition loop. This isn't easy, but this might be a case where the burden should be put on the user rather than the developer of the library. One second of data is not that much except in extreme cases that are probably too distant to warrant much development effort/support.

tstenner commented 5 years ago

In these cases it is convenient to have a buffer of only 1 sample, otherwise the user is forced to fetch until there are no samples remaining then use whatever the last sample was.

Pull everything in a chunk and take only the interesting samples? We already need to break the API for the next version, so we might as well think about other solutions, e.g. adding a function to drop all samples except for the newest N.

The docstring for data_receiver here suggests that max_buflen is number of seconds, as above, and indeed the seconds-version is passed directly to the data_receiver constructor during inlet creation, but if you follow how it's used then it ends up being number of samples.

If that's the case, I'd correct the docs and be happy that the (IMHO) saner interpretation of the parameter is the one that's actually implemented.

Should max_buflen/max_buffered be changed to a float type so < 1 second can be used? Or, probably better, can it be made to represent sample number instead of seconds?

+1 for number of samples; changing the type would require a recompilation of all apps, otherwise they would pass an int that would then get misinterpreted as a value of roughly 0.

tstenner commented 5 years ago

I don't have time right now to modify the source code and check what the minimum number of samples in the buffer can be---probably 1 is ok but what might happen is that the connection will break off and have to reconnect. I don't know ASIO well enough to make an informed guess, though. Regardless, this buffer must be greater than 0.

I think either all samples get transmitted or they don't even reach the network level, so ASIO shouldn't care.

Perhaps the correct thing to do is to overload the constructor so that it could be specified as an int or a float

That will cause more problems than it solves.

It would be a drag to have to know the sampling rate for this parameter

But number of samples works in all cases and is a lot easier to reason about in terms of speed and resource usage. Also, the buffer doesn't work in seconds but rather in samples, so it would have to be translated somewhere anyway.

The work around is programming a timeout capability in the data acquisition loop.

Or a 'drop everything older than the last XY samples' function, which would be quite easy.

cboulay commented 5 years ago

It would be a drag to have to know the sampling rate for this parameter

But number of samples works in all cases and is a lot easier to reason about in terms of speed and resource usage. Also the buffer doesn't work in seconds but rather in number of samples, so it would have to be translated somewhere anyway.

For the outlet this is easy because it (should!) know its sampling rate, if any. I think @dmedine was referring to the consumer/inlet, where you don't necessarily care what the sampling rate is at inlet creation time, and when you think about insurance against dropped connections you're more likely thinking in time, not samples. But I think anyone who actually specifies this value will also be able to query the sampling rate.

It really bothers me that 'time' is an int, especially when this int (should!) always get multiplied by the sampling rate. What happens when the sampling rate is not a round number (I've seen devices like this)? Having a buffer that's a fraction of a second shorter than the specified buffer ~~length~~ duration is probably harmless. What happens when the sampling rate is less than 1.0? Edit: in that case, if max_buffered=1 the stream is never resolvable on the network.

It also bothers me that it is called max_buflen on the inlet, but it's not a length, it's a duration... and it's right before another argument called max_chunklen which IS a length (in number of samples).

Pull everything in a chunk and take only the interesting samples?

That is only guaranteed to work if the max chunk size is greater than the buffer size, otherwise you have to pull chunks until you come up empty then use the last N samples of the last non-empty chunk. This is a pretty common anti-pattern that real-time users are required to do.

+1 for number of samples, changing the type would require a recompilation for all apps otherwise they would push an int which would then get converted to about 0.

Even if we don't change the type, the outlet apps that have specified a buffer duration did so with the expectation that it would guard against connection interruptions... but if we go from duration to n-samples then the buffer would end up being much, much shorter than expected and would probably underrun at the slightest interruption. That being said, most apps that use the C++ API opt not to provide this argument so it defaults to 360 seconds, and we can simply change the default to 0, which gets caught during creation and is modified to int(360*fs), which is functionally equivalent to what happens now. So it's quite possible that any apps that didn't explicitly set the buffer duration will not have to be recompiled.
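To make that proposal concrete, here is a hypothetical sketch of the creation-time handling (effective_buffer_samples and DEFAULT_BUFFER_SECONDS are invented names for illustration; the x100 fallback for irregular-rate streams is an assumption based on how max_buffered is documented):

DEFAULT_BUFFER_SECONDS = 360

def effective_buffer_samples(max_buffered, nominal_srate):
    # 0 would be the proposed new default and maps back to the old 360 s behaviour
    seconds = DEFAULT_BUFFER_SECONDS if max_buffered == 0 else max_buffered
    # irregular-rate streams: assume the usual "x100 samples per second" convention
    rate = nominal_srate if nominal_srate > 0 else 100
    # never let the conversion round down to a zero-length (useless) buffer
    return max(1, int(seconds * rate))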

I think he's still on vacation, but I will try to get @chkothe 's input when he gets back.

tstenner commented 5 years ago

Pull everything in a chunk and take only the interesting samples?

That is only guaranteed to work if the max chunk size is greater than the buffer size, otherwise you have to pull chunks until you come up empty then use the last N samples of the last non-empty chunk. This is a pretty common anti-pattern that real-time users are required to do.

Full agreement here, but that's the easiest way to do it at the moment.

That being said, most apps that use the C++ API opt not to provide this argument so it defaults to 360 seconds, and we can simply change the default to 0 which gets caught during creation and is modified to int(360*fs) which is functionally equivalent to what happens now.

Default arguments only change at compile time, so for version n+1 we could output a warning if the parameter is 360 and remove the warning in n+2, in the hope that everyone has caught up.

cboulay commented 5 years ago

BTW, I just tested in Python creating an outlet with max_buffered=1 using either FS=0.9 or FS=1.0. The 0.9 never becomes visible to the inlet but the 1.0 works perfectly.

import pylsl
import time
import numpy as np

FS = 0.9  # nominal rate < 1 Hz: a 1-second buffer truncates to 0 samples
NCHANNELS = 8

info = pylsl.StreamInfo(name='Test', type='EEG', channel_count=NCHANNELS, nominal_srate=FS,
                        channel_format='float32', source_id='myuid34234')

# next make an outlet
outlet = pylsl.StreamOutlet(info, max_buffered=1)

print("now sending data...")
n_pushed = 0
t_start = time.time()
while True:
    time.sleep(max([0, (n_pushed / FS) - time.time() + t_start]))
    outlet.push_sample(np.random.rand(NCHANNELS))
    n_pushed += 1

dmedine commented 5 years ago

BTW, I just tested in Python creating an outlet with max_buffered=1 using either FS=0.9 or FS=1.0. The 0.9 never becomes visible to the inlet but the 1.0 works perfectly.

Because 1 second at 0.9 Hz is 0.9 samples, which gets rounded down to 0, and then the tcp_server is constructed with a buffer size of 0 samples.
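For illustration, the seconds-to-samples conversion at play (assuming a plain truncating cast):

int(1 * 0.9)   # 0 samples -> no usable buffer at FS = 0.9
int(1 * 1.0)   # 1 sample  -> works at FS = 1.0
int(2 * 0.9)   # 1 sample  -> a 2-second buffer works again at FS = 0.9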

dmedine commented 5 years ago

Oops! The above statement is incorrect. If this parameter is 0, the tcp_server is simply never created (https://github.com/sccn/labstreaminglayer/blob/master/LSL/liblsl/src/tcp_server.cpp#L396-L398)

dmedine commented 5 years ago

Let me also clarify something that appears to be confused (though maybe it's just me that's confused here). The docs are correct. The time is specified in seconds and is converted to samples here: https://github.com/sccn/labstreaminglayer/blob/master/LSL/liblsl/src/lsl_inlet_c.cpp#L27.

Chadwick is correct that this gets turned into a string so that the information can be piped into the creation of the tcp_server later down the line, which is where the allocation of this memory actually happens (on the sending side, by the way).

By the way, this has been bugging me all day, so I changed my mind about not having enough time and quickly checked this. I changed the line cited above so that it doesn't multiply by the nominal sampling rate when the inlet is created. I tested it using the cpp example programs and you can indeed set the buffer size to 1. I only ran it for a few minutes, but it does not appear to affect performance in terms of memory or CPU usage; of course there are many, many ways that doing this can hurt you---as I'm sure @chadwick and @tstenner are quite well aware :)

dmedine commented 5 years ago

I also like @tstenner's suggestion that we add a function to take the newest N samples on demand. This is tricky to implement, and I don't want to get into the habit of adding a whole bunch of patchy features to address many different special cases, but I also feel quite hesitant to change the current behavior---even though, in truth, I would guess that the majority of people who actually specify a non-default value here are people who want a 1-sample buffer length.

If this is in fact true, a simpler (but kludgier) solution is to add an optional argument to the inlet constructor to override the buffer length and force it to be 1 sample.

cboulay commented 5 years ago

RE the Python example, I guess my point was that this is a mistake that a user could conceivably make, and there are currently no warnings or safeguards against it. IMO this needs to be fixed in the core, not the wrapper. If we're modifying the core lib anyway, then let's fix it the right way, as soon as we agree what that is.

If we leave it as int-seconds, then we need to fix and clarify some docstrings, raise errors when the buffer is 0, fix the problem with data_receiver, and it would be helpful to add a pull_latest_sample_<> for each of the pull_sample_<> functions.

The data_receiver gets initialized with max_buflen seconds in the stream_inlet_impl. In the data_receiver constructor declaration and implementation, the buffer_len argument is supposed to be in seconds but is passed directly to the sample_queue_ initializer, which is an instance of consumer_queue, and its constructor expects a number of samples. Anyway, all of the above is correct (except for the docstrings, and we should fix up all of the inconsistent argument names) because the number that actually gets passed to the stream_inlet_impl is in fact n samples.

If we go with int-samples, then there are a bunch of externally-facing things to change, but it makes the details of the implementation much simpler. And as you were saying, if anyone is interested in using a non-default value then they should be proficient enough to be able to do other things like query the sampling rate. Edit: Not quite what you said, but I was referring to the proficiency of these power users.

cboulay commented 5 years ago

Also, optional arguments aren't a thing in C functions, so anything that uses the C API will need to provide these arguments explicitly. That includes at least the Python and Matlab wrappers.

mgrivich commented 5 years ago

In these cases it is convenient to have a buffer of only one sample; otherwise the user is forced to fetch until there are no samples remaining and then use whatever the last sample was.

Setting a buffer to only 1 sample seems like a very bad idea. If LSL throws away data to make room for new data, I would call that a critical bug, not a feature that should be used.

I don't see the problem with the standard method of pulling till timestamp is zero.

I'd say any buffer of less than a second is unwise, because networks can hiccup. And before we encourage/allow very short buffers, we would need to set up tests in toxic situations to see how LSL fails with small buffers and lag spikes.

Right now, I'd say just fix the wrappers so that they don't allow the user to request non-integer values for the buffer.

mgrivich commented 5 years ago

On 7/24/2018 5:49 AM, Tristan Stenner wrote:

We already need to break the API for the next version,

We are not really breaking the API. We are fixing functions that have always been buggy. This doesn't mean we should go crazy, change everything that annoys us, and end up with a Python 2 / Python 3-style split.

so we might as well think about other solutions, e.g. adding a function to drop all samples except for the newest N.

Again, I don't see why this kind of functionality is desirable. Data loss is the worst, and adding functions to implement data loss seems to be asking for trouble. If the user really doesn't care about old data, they are free to pull it and not use it.

cboulay commented 5 years ago

One of the key features of LSL is data synchronization. Another key feature is easy inter-process communication. A target audience of people using LSL for inter-process communication are people doing neural signal processing and real-time feedback (brain-computer interfaces). Many of these users that I've helped recently---and me too, when I first started using LSL---didn't realize that by default you are fetching the oldest data in the buffer, which can be quite old! If your inlet app has some initialization time between when you create the inlet and when you fetch a sample/chunk, then you're starting out behind. If you aren't carefully watching your signal-processing rate vs. your sampling/chunk rate, then you may never catch up. In other words, it's really easy to always be giving feedback on old data, and there's nothing to tell you that this is happening. This is a terrible pitfall for people doing neurofeedback, which is a common use case for LSL. This isn't hypothetical. This has happened.

At the very least we need some very strong hints in the doc strings and examples about how to avoid this problem.

Setting a buffer to only 1 sample seems like a very bad idea. If LSL throws away data to make room for new data, I would call that a critical bug, not a feature that should be used.

But it is designed to do that. If this is a critical bug then LSL should error-out whenever there is a buffer overflow and then people like me won't be misinformed about how LSL should be used.

Right now, I'd say just fix the wrappers so that they don't allow the user to request not integer values for buffer.

In my Python example above, I did request an integer value for the buffer (=1) on the outlet end, but the sampling rate was < 1.0 so it still failed silently. If I set the buffer to 2 (so 2 seconds * 0.9 Hz = 1.8 samples, rounded down to 1 sample) then it works again, but if I set the inlet buffer to 1 second (still an int) then you get "Stream transmission broke off (Input stream error.); re-connecting...", which is at least something that a user can request help on.

mgrivich commented 5 years ago

A target audience of people using LSL for inter-process communication are people doing neural signal processing and real-time feedback (brain-computer interfaces). Many of these users that I've helped recently---and me too, when I first started using LSL---didn't realize that by default you are fetching the oldest data in the buffer, which can be quite old! If your inlet app has some initialization time between when you create the inlet and when you fetch a sample/chunk, then you're starting out behind. If you aren't carefully watching your signal-processing rate vs. your sampling/chunk rate, then you may never catch up. In other words, it's really easy to always be giving feedback on old data, and there's nothing to tell you that this is happening. This is a terrible pitfall for people doing neurofeedback, which is a common use case for LSL. This isn't hypothetical. This has happened.

The solution here is better documentation. Perhaps a FAQ and/or examples for people who care about real-time latency. If you haven't pulled until the returned timestamp is 0 (i.e., the buffer is empty), you haven't caught up. You can also check lsl_local_clock() vs. the timestamps of incoming data to measure lag more explicitly.
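A rough pylsl sketch of that lag check (the stream type 'EEG' and the timeouts are illustrative assumptions):

import pylsl

streams = pylsl.resolve_byprop('type', 'EEG', timeout=5.0)
inlet = pylsl.StreamInlet(streams[0])

sample, ts = inlet.pull_sample(timeout=1.0)
if ts is not None:
    # time_correction() maps the sender's timestamps onto this machine's clock
    lag = pylsl.local_clock() - (ts + inlet.time_correction())
    print("this sample is %.3f s old by the time we see it" % lag)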

In my Python example above, I did request an integer value for the buffer (=1) on the outlet end, but the sampling rate was < 1.0 so it still failed silently. If I set the buffer to 2 (so 2 seconds * 0.9 Hz = 1.8 samples, rounded down to 1 sample) then it works again, but if I set the inlet buffer to 1 second (still an int) then you get "Stream transmission broke off (Input stream error.); re-connecting...", which is at least something that a user can request help on.

I'd be fine with having the core library round up rather than down, so that this kind of error doesn't happen. When Christian wrote it, it appears that he didn't anticipate regular sampling rates of < 1 Hz.

tstenner commented 5 years ago

We are not really breaking the API. We are fixing functions that have always been bugged.

It still crashes with combinations of programs compiled against the wrong library version, so it's by definition an ABI break; and since the types change, the API changes as well, even though most compilers will only warn about it.

I also like @tstenner's suggestion that we add a function to somehow take the newest N samples on demand if requested. This is tricky to implement, and I don't want to get in the habit of adding a whole bunch of patchy features to address many different special cases

With a queue implementation that has a queryable element count, it's (in pseudocode) a simple

queue.drop(std::max(0, queue.count() - sample_count));
return queue.pop_all();

This is in principle the same as pulling everything and discarding the older samples, but it leaves out potentially costly copies and conversions. I'd rather implement it as one drop_and_keep_newest_n_samples method so we don't get another 8 overloads for all pull functions.

Again, I don't see why this kind of functionality is desirable. Data loss is the worst, and adding functions to implement data loss seems to be asking for trouble. If the user really doesn't care about old data, they are free to pull it and not use it.

This requires several buffers to read the data, discard the oldest data, and put the newest samples back together. Simple example: there are 13 samples available and you want the newest 6, so you have to pull 6 samples at a time until there are no more samples available, while rotating the current and last buffer, then copy the newest 1 sample together with the older 5 samples. It's a common use case, so I'd prefer the more error-proof "drop all except the latest 6 samples, pull 6 samples, done".
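In wrapper terms, the chunk-based workaround looks roughly like this (pull_newest_n is a hypothetical helper written against pylsl's pull_chunk, not an existing API):

def pull_newest_n(inlet, n):
    tail = []  # rolling window of (sample, timestamp) pairs, n >= 1 assumed
    while True:
        chunk, stamps = inlet.pull_chunk(timeout=0.0, max_samples=n)
        if not stamps:  # buffer drained
            break
        tail = (tail + list(zip(chunk, stamps)))[-n:]
    return tail  # up to n newest pairs, oldest first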

dmedine commented 5 years ago

With a queue implementation that has a queryable element count, it's (in pseudocode) a simple

queue.drop(std::max(0, queue.count() - sample_count));
return queue.pop_all();

This is in principle the same as pulling everything and discarding the older samples, but it leaves out potentially costly copies and conversions. I'd rather implement it as one drop_and_keep_newest_n_samples method so we don't get another 8 overloads for all pull functions.

Right, it's tricky to implement. LSL was designed assuming that no one would ever want to do this, so actually implementing something like this would require a lot of effort. It is not trivial to get a sample count field into the consumer_queue object.

This requires several buffers to read the data, discard the oldest data, and put the newest samples back together. Simple example: there are 13 samples available and you want the newest 6, so you have to pull 6 samples at a time until there are no more samples available, while rotating the current and last buffer, then copy the newest 1 sample together with the older 5 samples. It's a common use case, so I'd prefer the more error-proof "drop all except the latest 6 samples, pull 6 samples, done".

Providing a 'helper' method to do this for a user is actually what I had envisioned. In the end, though, I am still of the mind that this should be left to the user.

tstenner commented 5 years ago

Right, it's tricky to implement.

Not with the Boost Queue, see spsc_queue::read_available().

mgrivich commented 5 years ago

I would be surprised if the current implementation has significant performance overhead.

If we discover excessive copies being made, those should be replaced with pass-by-reference.