cameri / nostream

A Nostr Relay written in TypeScript
MIT License

Data encrypted at rest? #268

Open s3x-jay opened 1 year ago

s3x-jay commented 1 year ago

My host just about has nostream installed and running and this morning I started wondering if nostream is compliant with GDPR, California's privacy law, etc.

npubs are literally personally identifiable information. Events absolutely have personally identifiable information. The users table - well, that's about users so that's all personally identifiable information. The invoices table is tied to people so that also contains personally identifiable information. At the end of the day social media databases are the poster children for databases with personally identifiable information. Everything related to a person (nym or real name) has to be encrypted.

The problem is, unlike MySQL / MariaDB, PostgreSQL doesn't seem to do "encryption at rest".

https://www.postgresql.org/docs/current/encryption-options.html

If I understand it correctly the options are:

From what I'm seeing there's no equivalent of MySQL/MariaDB's encryption of tables/tablespaces.

SO… Are there any plans to support MySQL/MariaDB? I don't see how I (or anyone for that matter) can use nostream until it supports a database that can encrypt data at rest.

Sorry for being a bummer. But I've survived this long doing what I do because I fuss over details like this. (And my users trust me with truly sensitive information because I fuss over details like this).

s3x-jay commented 1 year ago

Going a bit further, I just had a "discussion" with ChatGPT about this where I asked "Does encrypting the drive partition satisfy GDPR and CCPA? Or is it unclear whether that satisfies the EU's GDPR and California's CCPA?" This was its response…

Encryption of the drive partition can be an effective security measure to protect data at rest and may satisfy the requirements of GDPR and CCPA, depending on the specific circumstances.

Under GDPR, organizations are required to implement appropriate technical and organizational measures to ensure the security of personal data. Encryption is specifically mentioned as a recommended security measure in the GDPR. However, the regulation does not prescribe specific encryption methods or key management practices, so it is up to the organization to determine whether their chosen encryption strategy is appropriate for their specific data processing activities.

Similarly, under CCPA, businesses are required to implement reasonable security measures to protect the personal information they collect and process. Again, encryption is specifically mentioned as a recommended security measure in the CCPA.

Whether encryption of the drive partition is sufficient to meet these requirements depends on a variety of factors, including the sensitivity of the data being processed, the likelihood and potential impact of a data breach, and the overall security posture of the organization. In general, encrypting the drive partition can be an effective security measure, but it may not be sufficient on its own to meet all of the requirements of GDPR and CCPA.

Therefore, it's important to perform a thorough risk assessment and implement a comprehensive security program that includes multiple layers of security controls, including encryption, access controls, monitoring, and incident response.

The issue is the sheer size of a Nostr database and how many of the queries will be based on a user's npub, which is "personally identifiable". Even MySQL/MariaDB would need to decrypt the entire table (and hold it all in RAM) to work with the data if you encrypt the InnoDB table/tablespace. That won't scale very well either.

I guess I'm just trying to get comfortable with the privacy implications that all of this creates. For example, with MariaDB/InnoDB the backups are all encrypted (but it won't scale well because of the implications of encrypting the table). With PostgreSQL special work needs to be done to encrypt backups of the database. And encrypting the drive partition isn't as good as encrypting the table (though it's faster/more scalable).

So I guess this comes down to what's reasonable? And do relay owners who don't encrypt their backups realize the risk they're taking? Users absolutely don't understand the risk until there's a data breach.

cameri commented 1 year ago

I'd recommend not using Nostr at all. The protocol is pretty much public.

s3x-jay commented 1 year ago

Responding to a governmental investigation with "The protocol is pretty much public" will basically confirm that you're violating GDPR and CCPA. The response needs to be "We carefully considered protecting the user's privacy and have put in as many controls to mitigate data breaches as we could." Doing everything you can is legally compliant. Ignoring privacy issues will get you in trouble.

I'll move forward with an encrypted drive partition and ensure that all database backups are encrypted. Having thought the issue through, I think that's a reasonable level of protection given the downsides of the other alternatives. I'll also make it really clear as I try to onboard folks to Nostr that they need to pay attention to how the relays they write to handle data privacy.

cameri commented 1 year ago

Please submit a PR with your changes to address those concerns for your specific jurisdiction. I'm neither in Europe nor in the USA, but you are welcome to contribute so you can run the software legally in your country.

s3x-jay commented 1 year ago

About the only thing I'd suggest you do is warn people who are installing nostream that they may want to consult a lawyer to determine how to comply with data privacy laws such as those in the EU and the US (which, technically, everyone with users in those jurisdictions is subject to).

My approach of encrypting the drive and making sure all backups are encrypted is something my server admins can do. It doesn't require changes in the code. I started this issue because my default way of doing data protection wasn't an option with PostgreSQL. Sorry if I was kinda thinking aloud above. It took that process for me to figure out what to do.
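For anyone who wants to see what "making sure all backups are encrypted" could look like, here is a minimal sketch in TypeScript using Node's built-in `crypto` module. It is not part of nostream. The function names `encryptBackup` and `decryptBackup`, and the blob layout (IV, then auth tag, then ciphertext), are assumptions made up for this illustration. It also assumes the dump (for example the output of `pg_dump`) fits in memory and that a 32-byte key is stored somewhere safe outside the server.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Hypothetical blob layout for this sketch:
// 12-byte IV | 16-byte GCM auth tag | ciphertext
const IV_LENGTH = 12;
const TAG_LENGTH = 16;

// Encrypt the bytes of a database dump with AES-256-GCM.
function encryptBackup(plaintext: Buffer, key: Buffer): Buffer {
  if (key.length !== 32) throw new Error("key must be 32 bytes");
  const iv = randomBytes(IV_LENGTH); // fresh IV for every backup
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext), cipher.final()]);
  return Buffer.concat([iv, cipher.getAuthTag(), ciphertext]);
}

// Reverse of encryptBackup; throws if the key is wrong or the blob was modified.
function decryptBackup(blob: Buffer, key: Buffer): Buffer {
  if (key.length !== 32) throw new Error("key must be 32 bytes");
  const iv = blob.subarray(0, IV_LENGTH);
  const tag = blob.subarray(IV_LENGTH, IV_LENGTH + TAG_LENGTH);
  const ciphertext = blob.subarray(IV_LENGTH + TAG_LENGTH);
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]);
}
```

A real setup would more likely stream the dump through the cipher (or pipe it through an existing tool) rather than buffer it, and key storage and rotation are the harder part. The sketch only shows that this can be done around the database without touching the relay's code.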