google / heir

A compiler for homomorphic encryption
https://heir.dev/
Apache License 2.0
289 stars 40 forks source link

Encoding "Taxonomy" #785

Open AlexanderViand-Intel opened 1 month ago

AlexanderViand-Intel commented 1 month ago

In order to avoid cluttering up the discussion of #762 (Enc/Dec and Randomness handling), I figured I'd create a new issue to (a) help myself get my thoughts straight on the various mappings/encodings and (b) get some feedback, especially with regards to the TFHE perspective on RLWE.

1. Application Data (e.g. i32) vs. "Cleartext" (e.g. $\mathbb{F}_p$) Semantics

Not sure if "Cleartext" is the right name here, but we usually have a disconnect between the semantics of the actual data type in the application (e.g., i32 or f32) and what we actually end up achieving under HE (usually finite field arithmetic with prime modulus). I know this isn't the case when we do bit-wise encryption (be it in TFHE or in BFV/BGV) but I'm not sure how 4-bit TFHE works here. Clearly, it's something to consider for "word-wise" FHE like BGV/BGV and especially CKKS, with its approximate nature.

Of course, for many applications, the difference between computation mod $2^{32}$ and a 32-bit prime is not important, but a compiler translation from i32 to mod p isn't technically "correct" and we should probably think about how to expose this to the developer/frontend. I guess this is conceptually similar to truncation and/or deciding approximation precision (e.g. LUT size) which are also relevant to TFHE, but in the mod p/polynomial approximation case we get with BFV/BGV and CKKS, "bits of precision" might not be sufficient as a metric/language to communicate the behavior.

2. Cleartext (e.g. $\mathbb{F}_p$) -> Plaintext Encoding (e.g., $R_p = \mathbb{Z}_p[X] / X^N + 1$)

For RLWE, the plaintext space is a polynomial ring, and we're generally much more interested in encoding integers (at this point, usually implicitly already considered mod p?). There are lots of ways to do this, starting with the trivial solution of just setting the constant coefficient to your integer and all other coefficients to zero, to decomposing your integer and putting a digit into each coefficient, to various forms of CRT-based "packing" or "batching" that allows one to achieve SIMD-style semantics. By far the most common is "full" packing, but you can also trade off a lower number of "slots" for larger (i.e., $\mathbb{F}_{p^d}$ for $d > 1$) slots. CKKS has some extra complexity here as the message space is technically a vector of complex numbers but of course it's approximate and in fixed-(ish)-point representation, so basically just more SIMD integer vector stuff.

Question: What does TFHE use here? The first RLWE "plaintext" I think of in the context of TFHE is the blind rotation test vector, and that looks like a trivial (non-CRT) "packing" of a vector of integers (with lots of repetition) into the coefficients of the polynomial?

3. Plaintext (e.g., $R_p = \mathbb{Z}_p[X] / X^N + 1$) -> Ciphertext (e.g., $R_q = \mathbb{Z}_p[X] / X^N + 1$) Encoding

In descriptions of BFV/BGV, this step is usually mixed in with encryption, but I've also seen this referred to as "MSB/LSB encoding", referring to how message and noise are arranged: image (Image from https://pqcrypto2023.umiacs.io/slides/2.2.pdf)

4. Representation

In addition to the above, we also need to consider how the polynomial itself is represented (e.g., coefficient-representation vs eval/NTT-form, RNS vs BigNum, etc) but that seems mostly orthogonal to the discussion above? In addition, most RLWE schemes actually don't juse use a single coefficient modulus but instead a "chain" (in RNS, the partial products of the RNS moduli usually make up the chain elements, in non-RNS, the chain could be built more freely but tends to be defined similarily). Note also, that the actual FHE scheme algorithms change depending on these representation choices (NTT stuff is pretty trivial, just iNTT if you need actual coefficients, but RNS versions are quite different and have different noise growth).

Situation in HEIR

Right now, we have the following attributes:

And then we have the lwe.rlwe_ciphertext type which relies on the above attributes to describe the ciphertext.


Question: How can we best express the required information for an, e.g. BGV ciphertext, specifically moduli chains and/or RNS?

j2kun commented 1 month ago

I want to reply in more detail later, but to try to help clear up any initial confusion, I'm convinced now that the encoding attributes I originally defined for RLWE are plain wrong.

In general, I consider "encoding" to be any and all preparation of the application data before encryption that does not require key material or ciphertext-specific random samples. I was mistaken in where that line was drawn for BGV/BFV when originally writing that.

On Wed, Jul 10, 2024, 1:57 PM Alexander Viand @.***> wrote:

In order to avoid cluttering up the discussion of #762 https://github.com/google/heir/issues/762 (Enc/Dec and Randomness handling), I figured I'd create a new issue to (a) help myself get my thoughts straight on the various mappings/encodings and (b) get some feedback, especially with regards to the TFHE perspective on RLWE.

  1. Application Data (e.g. i32) vs. "Cleartext" (e.g. $\mathbb{F}_p$) Semantics

Not sure if "Cleartext" is the right name here, but we usually have a disconnect between the semantics of the actual data type in the application (e.g., i32 or f32) and what we actually end up achieving under HE (usually finite field arithmetic with prime modulus). I know this isn't the case when we do bit-wise encryption (be it in TFHE or in BFV/BGV) but I'm not sure how 4-bit TFHE works here. Clearly, it's something to consider for "word-wise" FHE like BGV/BGV and especially CKKS, with its approximate nature.

Of course, for many applications, the difference between computation mod $2^{32}$ and a 32-bit prime is not important, but a compiler translation from i32 to mod p isn't technically "correct" and we should probably think about how to expose this to the developer/frontend. I guess this is conceptually similar to truncation and/or deciding approximation precision (e.g. LUT size) which are also relevant to TFHE, but in the mod p/polynomial approximation case we get with BFV/BGV and CKKS, "bits of precision" might not be sufficient as a metric/language to communicate the behavior.

  1. Cleartext (e.g. $\mathbb{F}_p$) -> Plaintext Encoding (e.g., $R_p = \mathbb{Z}_p[X] / X^N + 1$)

For RLWE, the plaintext space is a polynomial ring, and we're generally much more interested in encoding integers (at this point, usually implicitly already considered mod p?). There are lots of ways to do this, starting with the trivial solution of just setting the constant coefficient to your integer and all other coefficients to zero, to decomposing your integer and putting a digit into each coefficient, to various forms of CRT-based "packing" or "batching" that allows one to achieve SIMD-style semantics. By far the most common is "full" packing, but you can also trade off a lower number of "slots" for larger (i.e., $\mathbb{F}_{p^d}$ for $d > 1$) slots. CKKS has some extra complexity here as the message space is technically a vector of complex numbers but of course it's approximate and in fixed-(ish)-point representation, so basically just more SIMD integer vector stuff.

Question: What does TFHE use here? The first RLWE "plaintext" I think of in the context of TFHE is the blind rotation test vector, and that looks like a trivial (non-CRT) "packing" of a vector of integers (with lots of repetition) into the coefficients of the polynomial?

  1. Plaintext (e.g., $R_p = \mathbb{Z}_p[X] / X^N + 1$) -> Ciphertext (e.g., $R_q = \mathbb{Z}_p[X] / X^N + 1$) Encoding

In descriptions of BFV/BGV, this step is usually mixed in with encryption, but I've also seen this referred to as "MSB/LSB encoding", referring to how message and noise are arranged: image.png (view on web) https://github.com/google/heir/assets/88422715/12983074-150a-46a0-a0ae-59b5c1692321 (Image from https://pqcrypto2023.umiacs.io/slides/2.2.pdf)

  1. Representation

In addition to the above, we also need to consider how the polynomial itself is represented (e.g., coefficient-representation vs eval/NTT-form, RNS vs BigNum, etc) but that seems mostly orthogonal to the discussion above? In addition, most RLWE schemes actually don't juse use a single coefficient modulus but instead a "chain" (in RNS, the partial products of the RNS moduli usually make up the chain elements, in non-RNS, the chain could be built more freely but tends to be defined similarily). Note also, that the actual FHE scheme algorithms change depending on these representation choices (NTT stuff is pretty trivial, just iNTT if you need actual coefficients, but RNS versions are quite different and have different noise growth). Situation in HEIR

Right now, we have the following attributes:

-

lwe.bit_field_encoding which takes a start and width and describes where in a larger number of bits the "interesting" bits (i.e., message bits) are. This seems to be capturing Step 3, though I'm not sure how we'd express CKKS here? Maybe that's what lwe.unspecified_bit_field_encoding is for? Actually, I'm not even sure how to capture BFV/BGV with this, as they can (iirc) scale their noise or message by non-power-of-two numbers, so the "bits" idea breaks down a bit?

lwe.polynomial_coefficient_encoding which also has the start/width parameters and basically seems to indicate the "trivial coefficient packing" as used in TFHE's blind rotate? (i.e., Step 2)? Similarily, we have lwe.polynomial_evaluation_encoding and lwe.inverse_canonical_embedding_encoding (which actually already has a TODO comment wondering about how bitfield would work here, see #183 https://github.com/google/heir/issues/183)

polynomial.ring which encode things relevant to 4, but don't really give us a way to encode things such as as the modulus chain (which is needed independently of RNS).

And then we have the lwe.rlwe_ciphertext type which relies on the above attributes to describe the ciphertext.

Question: How can we best express the required information for an, e.g. BGV ciphertext, specifically moduli chains and/or RNS?

— Reply to this email directly, view it on GitHub https://github.com/google/heir/issues/785, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAS2PKW2JJSEY6NLUPMQSA3ZLWN2FAVCNFSM6AAAAABKVVMILSVHI2DSMVQWIX3LMV43ASLTON2WKOZSGQYDCNZRGU2DAMA . You are receiving this because you are subscribed to this thread.Message ID: @.***>

AlexanderViand-Intel commented 1 month ago

Let me also add a list of information I think we need from a BGV/BFV ciphertext (assuming we follow Kim et al. and do a unified implementation) for the bgv->lwe/poly lowering:

Later on (either lwe->poly or --convert-polnomial-mul-to-ntt), we'll also need the current polynomial ring, in order to generate the metadata for the NTT/iNTT. This would also (I think?) be the right point to add montgomery representation?

The other steps (1/2) seem mostly important for the secret -> fhe scheme step, where they're necessary decide what schemes are applicable and when/how to do noise management.

AlexanderViand-Intel commented 1 month ago

I've played around a bit with this now, and come up with the following design proposal:

The lwe.rlwe_ciphertext no longer has a single lwe.rlwe_params attribute, but instead a series of (default/optional) attributes:


Alternativel/in addition to rlwe_ciphertext, we could also define a glwe_ciphertext that can express any kind of GLWE ciphertext, including LWE and RLWE ciphertexts. This would have most of the attributes from above, but also

[^1]: Tensor uses a pretty complex system with some kind of Shape_ShapeType, but there's also "ArrayRef<int64_t>":$shape used by memrefs which might be

j2kun commented 1 month ago

So I haven't had time to really process all of the above and make concrete suggestions, but I will say that, in my opinion, (1) and (2) in https://github.com/google/heir/issues/785#issue-2401715400 are both "encoding" and (3) and (4) are not. In particular key information is not part of encoding.

Encoding is just "how do you make your application data suitable to be encrypted" and everything else about the cryptosystem is encryption. As such, I think the current RLWEPlaintext type has everything it needs: the ring (plaintext space), the encoding method (includes packing method, RNS choices, coefficient/NTT domain), and the underlying type.

Nb., the underlying type can be a single scalar value, if the encoding method splits that into digits and packs the digits. Or it could be a tensor of small scalars, if those are directly packed.

The ciphertext should have all the same data, along with anything else needed to use the scheme (moduli chain, "MSB/LSB" scaling factors (I don't want to call this encoding), mod switch info, etc. And I imagine they would all be optional attributes since the specific schemes can use them or not depending on the variants. We might do ourselves a favor by augmenting this with some helper methods that tie the variants to specific checks on the attributes (e.g., satisfiesPaperXYZ2021 which checks for the existence of three attributes and appropriate values for them), and we can use those as guards for various lowerings.

j2kun commented 1 month ago

@AlexanderViand-Intel I suspect there is a particular part of this encoding business that is the most uncertain. Can you decide on which and we can iron that out first? Is it the RNS/BigNum stuff?

AlexanderViand-Intel commented 1 month ago

@AlexanderViand-Intel I suspect there is a particular part of this encoding business that is the most uncertain. Can you decide on which and we can iron that out first? Is it the RNS/BigNum stuff?

I think most of this was just from naming/terminology confusion, it now seems relatively straightforward to separate.

I will say that, in my opinion, (1) and (2) in #785 (comment) are both "encoding" and (3) and (4) are not. In particular key information is not part of encoding.

Encoding is just "how do you make your application data suitable to be encrypted" and everything else about the cryptosystem is encryption.

I think I agree with this clasification~ Any suggestions for an alternative name for (3) and (4)?

asraa commented 1 month ago

https://gist.github.com/asraa/6105c144aa10295aa4f021438d5b046d

I started a small scratch doc that I'll convert into a draft PR for commenting purposes..

BTW:

Maybe not necesary, unless there are situations where dimension doesn't uniquely determine a key_basis

yes! a galois rotation doesn't change the dimension but changes the key basis into something like (1, s^i) where it's a rotation by i.

I'm suggesting encryption info attributes for (3) - to me this stuff is relevant to the encryption process / step (computing how to place the Z_p plaintext into the Z_q ciphertext during encryption).

j2kun commented 1 month ago

+1, the only murky part is the parameters that are not specific to encryption per se, but to the overall homomorphicness/performance of the cryptosystem. But I think "EncryptionParams" or similar is sufficient.