edgedb / edgedb

A graph-relational database with declarative schema, built-in migration system, and a next-generation query language
https://edgedb.com
Apache License 2.0
12.8k stars, 394 forks

Data encryption support #6730

Open iron3oxide opened 5 months ago

iron3oxide commented 5 months ago

As a developer handling PII, it would be great to be able to store certain fields or even complete types encrypted. I realize that if the field type information ought to be retained, this feature may be a bit complicated to add. Nonetheless, I think it is vitally important for a modern DBMS.

elprans commented 5 months ago

We should be able to expose basic encryption capabilities via the pgcrypto extension.

iron3oxide commented 5 months ago

Sounds good! I imagine one would be able to dictate the output type of a decrypted value with a cast?

emrosenf commented 5 months ago

@elprans Currently, only one-way hashing methods are exposed. It would be useful to expose AES encrypt/decrypt as well.

CodesInChaos commented 5 months ago

Under which circumstances do you want to encrypt/decrypt data in the database, instead of the application?

Also, the functions described in the Postgres documentation (F.28. pgcrypto — cryptographic functions) are very difficult to use and largely rely on obsolete cryptography.

That's also a bit of a problem for the already integrated functions (for example 3 of the 4 algorithms supported by pgcrypto::crypt should never be used). But encryption is far trickier than hashing, so the dangers will only become bigger there.

iron3oxide commented 5 months ago

@CodesInChaos you certainly have a point there. I guess the main reason would be to maintain the schema/type information that makes EdgeDB so great. Unless I am missing something, encrypting in the application would mean that, e.g., the codegen feature becomes somewhat useless and migrations become less helpful.

CodesInChaos commented 5 months ago

The ciphertext will always be bytes, and the plaintext will either be a string, or bytes as well. So there isn't much type safety to be had here.

I'd define a custom scalar type wrapping bytes and use that in my data model. Something like:

scalar type Ciphertext extending bytes;

In the long run it would probably make sense to add support for transparent encryption inside the driver, but that's a relatively complex feature.

iron3oxide commented 5 months ago

I can definitely think of examples where the plaintext would be a date, int or bool (even though the latter would admittedly be hard to encrypt on its own). Agreed on the TDE though, that would be a game changer. Maybe it's easier to implement once Postgres supports it?
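
To make the point about non-string plaintexts concrete, here is a minimal sketch of how an application could keep type information even though the stored value is only bytes: tag each value with its type before encrypting, and restore the type after decrypting. Python is an assumption here (the thread names no client language), and `encode`, `decode`, `seal`, and `unseal` are hypothetical helper names, not part of EdgeDB or pgcrypto. The cipher itself is deliberately left out and passed in as a pair of callables, which would normally come from an authenticated encryption library.

```python
import struct
from datetime import date
from typing import Callable, Union

Plain = Union[bool, int, str, bytes, date]


def encode(value: Plain) -> bytes:
    """Serialize a typed value to tagged bytes, ready to be encrypted."""
    # Exact type checks: bool is a subclass of int and datetime is a
    # subclass of date, so isinstance() would pick the wrong branch.
    kind = type(value)
    if kind is bool:
        return b"b" + (b"\x01" if value else b"\x00")
    if kind is int:
        # Signed 64-bit big-endian; struct.error is raised if out of range.
        return b"i" + struct.pack(">q", value)
    if kind is str:
        return b"s" + value.encode("utf-8")
    if kind is bytes:
        return b"y" + value
    if kind is date:
        return b"d" + value.isoformat().encode("ascii")
    raise TypeError(f"unsupported plaintext type: {kind.__name__}")


def decode(blob: bytes) -> Plain:
    """Inverse of encode(): read the one-byte tag and rebuild the value."""
    tag, body = blob[:1], blob[1:]
    if tag == b"b":
        return body == b"\x01"
    if tag == b"i":
        return struct.unpack(">q", body)[0]
    if tag == b"s":
        return body.decode("utf-8")
    if tag == b"y":
        return body
    if tag == b"d":
        return date.fromisoformat(body.decode("ascii"))
    raise ValueError(f"unknown type tag: {tag!r}")


def seal(value: Plain, encrypt: Callable[[bytes], bytes]) -> bytes:
    """Encode, then encrypt with a caller-supplied function (hypothetical)."""
    return encrypt(encode(value))


def unseal(ciphertext: bytes, decrypt: Callable[[bytes], bytes]) -> Plain:
    """Decrypt with a caller-supplied function (hypothetical), then decode."""
    return decode(decrypt(ciphertext))
```

The result of `seal` is what would be stored in a bytes-based field such as the `Ciphertext` scalar suggested above. Note that the encoded length differs by type (a bool is two bytes, an int is nine), so the ciphertext length can reveal what kind of value is stored unless the application also pads the plaintext; that is outside the scope of this sketch.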