sccn / liblsl

C++ lsl library for multi-modal time-synched data transmission over the local network

Data integrity and privacy #130

Open xloem opened 3 years ago

xloem commented 3 years ago

As LSL becomes more commonplace, integrity and privacy become increasingly important.

LSL could be secured better by producing a private key for every data producer, and authenticating the data with the private key.

https://datproject.org/ devised a network streaming protocol that provides for the authentication of subregions of data by building an ever-expanding Merkle tree of the data as it is produced. Just an idea: dat hasn't been ported to C++, but the underlying protocol isn't complex. The design is modular, so it's not hard to port just the parts of interest.
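To make the Merkle tree idea concrete, here is a minimal sketch of an append-only log whose root hash changes whenever any chunk changes. It is a simplification under stated assumptions, not Dat's actual tree layout or wire format: `MerkleLog`, `toy_hash` and `hash_pair` are hypothetical names, and the hash is 64-bit FNV-1a used purely as a stand-in. A real implementation would need a cryptographic hash (e.g. SHA-256 or BLAKE2b), and the producer would sign the root with its private key.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Stand-in hash (64-bit FNV-1a). NOT cryptographically secure; it only
// keeps this sketch free of external dependencies.
inline std::uint64_t toy_hash(const std::string& bytes) {
    std::uint64_t h = 14695981039346656037ULL;  // FNV offset basis
    for (unsigned char c : bytes) {
        h ^= c;
        h *= 1099511628211ULL;  // FNV prime
    }
    return h;
}

// Hash of an interior node, derived from its two child hashes.
inline std::uint64_t hash_pair(std::uint64_t left, std::uint64_t right) {
    return toy_hash("node:" + std::to_string(left) + ":" + std::to_string(right));
}

// Hypothetical append-only log of data chunks with a Merkle root over them.
class MerkleLog {
public:
    // Add one chunk of produced data as a new leaf.
    void append(const std::string& chunk) {
        leaves_.push_back(toy_hash("leaf:" + chunk));
    }

    // Recompute the root from all leaves. A node without a sibling is
    // promoted to the next level unchanged. Returns 0 for an empty log.
    std::uint64_t root() const {
        if (leaves_.empty()) return 0;
        std::vector<std::uint64_t> level = leaves_;
        while (level.size() > 1) {
            std::vector<std::uint64_t> next;
            for (std::size_t i = 0; i + 1 < level.size(); i += 2)
                next.push_back(hash_pair(level[i], level[i + 1]));
            if (level.size() % 2 == 1) next.push_back(level.back());
            level = next;
        }
        return level.front();
    }

private:
    std::vector<std::uint64_t> leaves_;
};
```

A receiver that rebuilds the log from the chunks it received and compares its root against the signed root can tell whether any chunk was altered. Recomputing the whole tree on every call keeps the sketch short; a streaming implementation would update the root incrementally.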

To protect against data compromise, it would be good if all data were authenticated by the devices producing it and encrypted node-to-node, so that only network connections authorised to receive it can read it.

I'm interested in doing work on this, but I wouldn't complete it soon on my own.

I'm not familiar with all the existing solutions here. I considered TLS but didn't see a way to check data integrity after decryption.

tstenner commented 3 years ago

From a security perspective, LSL assumes that all devices on the local network are trustworthy, either because the network is locked down (e.g. wired networks in clinical settings) or untrustworthy hosts are blocked by a firewall.

There was an issue in the archived repository (https://github.com/sccn/lsl_archived/issues/246) that proposed adding SSL-encrypted connections. It's possible to implement, but as a (mostly backwards-incompatible) addition to the core protocol it would need to be vetted thoroughly.

IMHO, LSL sits at a point in the pipeline (device -> vendor SDK -> liblsl -> network -> liblsl -> receiver) where the only sensible assurance would be that the data pulled from an inlet is the same as the data that got pushed to the corresponding outlet. With encrypted connections without HMACs, it'd be possible to alter the encrypted data (which would be very noticeable), drop samples or repeat samples.
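Detecting dropped or repeated samples generally needs a per-stream counter in addition to a MAC, because a MAC on its own only shows that an individual message is unmodified. The following is a minimal receiver-side sketch under that assumption. `SequenceChecker` and `SeqStatus` are hypothetical names and not part of liblsl, and the sketch assumes the counter has already been authenticated together with the payload (the MAC verification itself is not shown).

```cpp
#include <cstdint>

// Outcome of checking one incoming sample's counter.
enum class SeqStatus { Ok, Repeated, Gap };

// Hypothetical receiver-side check; not part of liblsl.
class SequenceChecker {
public:
    // Classify the counter of an incoming sample relative to the next
    // expected value. Counters are assumed to start at 0.
    SeqStatus check(std::uint64_t seq) {
        if (seq < next_) return SeqStatus::Repeated;  // replayed or duplicated
        SeqStatus status = (seq == next_) ? SeqStatus::Ok : SeqStatus::Gap;
        next_ = seq + 1;  // resynchronise after a gap
        return status;
    }

private:
    std::uint64_t next_ = 0;  // next counter value we expect to see
};
```

Whether a gap should abort the connection or merely be logged is a policy decision that the sketch leaves open.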

Authentication could be done either by sharing an encryption key between outlets and inlets (i.e. the same key for all inlets) or by configuring the inlets' public keys in the outlet (i.e. only allowed inlets can connect).

xloem commented 3 years ago

Since this hasn't been implemented yet, I think it would be great to plan for providing non-repudiation of data fed into LSL. This just means signing the data itself with a per-device private key.

It doesn't seem likely to me that a local network is trustworthy nowadays. I would assume that outlets are insecure (users, diverse networks) but that the devices generating data are securable (speaking only LSL and nothing else).

SSL/TLS provides network privacy and authentication which is very important. Adding non-repudiation would mean also signing the unencrypted data.

tstenner commented 3 years ago

> This just means signing the data itself with a per-device private key.

Then the signing would have to happen on the device. Even with closed-source connectors, it's just a matter of time before someone attaches a debugger and gets the key, or swaps out liblsl and replaces `void push_sample(float* data, key_struct* key)` with an implementation like `void push_sample(float* data, key_struct* key) { std::cout << key << std::endl; }`. I don't know of any data acquisition program that signs the data, and they are in a far better position (concerning the parts of the stack they control).

> It doesn't seem likely to me that a local network is trustworthy nowadays.

Cue @dmedine shouting "Amen!" :wink: I agree, but there are limits to what LSL can do without sacrificing other goals (i.e. configurationless data exchange).

> SSL/TLS provides network privacy and authentication which is very important.

It can, but in order to reach your goals it would need either a CA infrastructure or an administrator that takes care of setting up all nodes with keys.

> Adding non-repudiation would mean also signing the unencrypted data.

Assuming that the inlet has somehow established an encrypted connection to the device and proper HMAC checks are done, this wouldn't add anything unless the private key of the inlet is really private.

xloem commented 3 years ago

On Thu, Jun 24, 2021, 10:48 AM Tristan Stenner @.***> wrote:

> This just means signing the data itself with a per-device private key.

> Then the signing would have to happen on the device. Even with closed source

Correct. That would mean running LSL on a device or a secure system using them.

> connectors it's just a matter of time before someone attaches a debugger and gets the

Yes, but with a medical device, for example, things will be much more secure, and many areas have physical security.

> but there are limits to what LSL can do without sacrificing other goals (i.e. configurationless data exchange).

I'm sure we can find ways to preserve the existing goals.

> It can, but in order to reach your goals it would need either a CA infrastructure or an administrator that takes care of setting up all nodes with keys.

I would leave it up to device vendors to set up some form of CA but expand the infrastructure to provide for it.

> this wouldn't add anything unless the private key of the inlet is really private.

Right.