cockroachdb / cockroach

CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.
https://www.cockroachlabs.com
Other
29.97k stars 3.79k forks source link

sql/builtins: evaluate PGP Encryption Functions from pgcrypto #73869

Open rafiss opened 2 years ago

rafiss commented 2 years ago

NB: let's first evaluate how valuable this would be, since this seems like a larger project.

From https://www.postgresql.org/docs/14/pgcrypto.html

pgp_sym_encrypt(data text, psw text [, options text ]) returns bytea
pgp_sym_encrypt_bytea(data bytea, psw text [, options text ]) returns bytea
pgp_sym_decrypt(msg bytea, psw text [, options text ]) returns text
pgp_sym_decrypt_bytea(msg bytea, psw text [, options text ]) returns bytea
pgp_pub_encrypt(data text, key bytea [, options text ]) returns bytea
pgp_pub_encrypt_bytea(data bytea, key bytea [, options text ]) returns bytea
pgp_pub_decrypt(msg bytea, key bytea [, psw text [, options text ]]) returns text
pgp_pub_decrypt_bytea(msg bytea, key bytea [, psw text [, options text ]]) returns bytea
pgp_key_id(bytea) returns text

armor(data bytea [ , keys text[], values text[] ]) returns text
dearmor(data text) returns bytea
pgp_armor_headers(data text, key out text, value out text) returns setof record
pgp_sym_encrypt(data, psw, 'compress-algo=1, cipher-algo=aes256')
The functions here implement the encryption part of the OpenPGP (RFC 4880) standard. Supported are both symmetric-key and public-key encryption.

An encrypted PGP message consists of 2 parts, or packets:

    Packet containing a session key — either symmetric-key or public-key encrypted.

    Packet containing data encrypted with the session key.

When encrypting with a symmetric key (i.e., a password):

    The given password is hashed using a String2Key (S2K) algorithm. This is rather similar to crypt() algorithms — purposefully slow and with random salt — but it produces a full-length binary key.

    If a separate session key is requested, a new random key will be generated. Otherwise the S2K key will be used directly as the session key.

    If the S2K key is to be used directly, then only S2K settings will be put into the session key packet. Otherwise the session key will be encrypted with the S2K key and put into the session key packet.

When encrypting with a public key:

    A new random session key is generated.

    It is encrypted using the public key and put into the session key packet.

In either case the data to be encrypted is processed as follows:

    Optional data-manipulation: compression, conversion to UTF-8, and/or conversion of line-endings.

    The data is prefixed with a block of random bytes. This is equivalent to using a random IV.

    A SHA1 hash of the random prefix and data is appended.

    All this is encrypted with the session key and placed in the data packet.

Jira issue: CRDB-11792

rbygrave commented 2 years ago

Note: Ebean ORM supports use of pgp_sym_encrypt and pgp_sym_decrypt for transparent runtime encryption/decryption of specific columns. That is, Ebean makes the fact some columns are encrypted is transparent to application code. This feature is typically used in apps that need HIPPA / PIM personally identifying data (name, date of birth etc) to be only decrypted on access at runtime. A workaround for apps porting to Cockroach would be to use client side encryption/decryption noting that has limitations (e.g. can't use the encrypted columns in a query WHERE clause).

polikeiji commented 1 year ago

Hello! Let me clarify if those functions have already been supported by CockroachDB or not. #21001 has been closed as completed, but hasn't the pgcrypto module been actually supported by CockroachDB yet?