microsoft / SEAL

Microsoft SEAL is an easy-to-use and powerful homomorphic encryption library.
https://www.microsoft.com/en-us/research/group/cryptography-research/
MIT License

Question: Questions regarding the BatchEncoder and its plain_modulus_bitsize #356

Closed CleanHit closed 3 years ago

CleanHit commented 3 years ago

I'm trying to decide on a "good enough" plain_modulus_bitsize to use. I saw this example in one of the issues, but if I don't have any a priori knowledge about the computation depth, what would be a reasonably good plain_modulus_bitsize value to start with? (A typical starting parameter setup is sketched after the questions below.)

  1. What kind of increment strategy would you recommend for setting plain_modulus_bitsize to avoid or counter overflows during computation?
  2. I'll have to re-encode and re-encrypt all my batched values when changing plain_modulus_bitsize (or poly_modulus_degree or security_level), since those define the context used in the declaration of the other SEAL components, correct?
  3. Do all the matrices have size 2-by-(N/2), with the smallest being 2-by-512?
  4. Let's say I want to add some values in a 2-by-512 matrix but want to ignore some slots in the calculation. One way would be to create a 2-by-512 filter matrix that sets all the undesired slots to 0 for that specific calculation, correct? Or is there a different (more efficient) way to execute computations on some portion of slots in an encrypted batch matrix?
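For reference (not from this thread), a minimal parameter setup for BFV with batching might look like the sketch below. The 20-bit plain modulus is simply the size used in SEAL's own examples, not a recommendation from this discussion, and the snippet assumes a recent SEAL 3.x API.

```cpp
#include "seal/seal.h"
#include <cstddef>

using namespace seal;

int main()
{
    // Illustrative starting parameters for BFV with batching enabled.
    EncryptionParameters parms(scheme_type::bfv);
    std::size_t poly_modulus_degree = 8192;
    parms.set_poly_modulus_degree(poly_modulus_degree);
    parms.set_coeff_modulus(CoeffModulus::BFVDefault(poly_modulus_degree));

    // A batching-compatible prime; 20 bits is the size used in SEAL's examples.
    // If values overflow during computation, a larger bit size can be tried
    // (at the cost of noise budget).
    parms.set_plain_modulus(PlainModulus::Batching(poly_modulus_degree, 20));

    SEALContext context(parms);
    return 0;
}
```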
WeiDaiWD commented 3 years ago
  1. Any program can be viewed as a mathematical function with domain (input) and range (output). Since you know the domain and function, you should know its range.
  2. plain_modulus is a member of EncryptionParameters, which is used to create the SEALContext. So yes, you have to change all SEAL objects created with those parameters or that context. You don't have to change the other values in EncryptionParameters.
  3. Right.
  4. I do not understand what you said. If you want to add some slots in the same ciphertext, you can rotate and add.
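As a rough illustration of "rotate and add" in point 4 (a sketch only, with hypothetical names; the Evaluator and Galois keys are assumed to already exist, and both slots are assumed to lie in the same row of the batching matrix):

```cpp
#include "seal/seal.h"

using namespace seal;

// Adds the value in row slot j onto the value in row slot i of the same
// ciphertext. A positive step in rotate_rows rotates each row cyclically
// to the left, so after rotating by (j - i) the original slot j lines up
// with slot i. Assumes j > i and Galois keys that support the rotation
// (e.g. the default keys from KeyGenerator::create_galois_keys).
void add_slot_into_slot(Evaluator &evaluator, const GaloisKeys &galois_keys,
                        Ciphertext &encrypted, int i, int j)
{
    Ciphertext rotated;
    evaluator.rotate_rows(encrypted, j - i, galois_keys, rotated);
    // Slot i of "rotated" now holds the original slot j.
    evaluator.add_inplace(encrypted, rotated);
    // Slot i of "encrypted" holds original(i) + original(j); the remaining
    // slots hold sums of other slot pairs and can simply be ignored.
}
```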
CleanHit commented 3 years ago
4. I do not understand what you said. If you want to add some slots in the same ciphertext, you can rotate and add.

I'm assuming element-wise mathematical operations. Let's say I have [ 1, 2, 3, 4, 5, 6 ] as a 2-by-3 matrix and I want to sum the numbers 1 + 3 + 5, or any numbers in those slots. What I meant was using a binary (0/1) filter matrix to set the irrelevant slots' values to 0, like this:

[ 1,  2,  3 ]
[ 4,  5,  6 ]

X

[ 1,  0,  1 ]
[ 0,  1,  0 ]

And then calculate the sum on the result matrix. I guess this approach will result in noise growth that depends on the number of multiplications involved, and I'm better off using rotations. I haven't looked into rotations so far; thank you for your answer.
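In SEAL terms, the filter-matrix idea above can be sketched as follows (illustrative only; apply_slot_mask and keep are hypothetical names): encode a binary mask with the BatchEncoder and multiply it into the ciphertext as a plaintext, so the unwanted slots become 0 before any summation. A single multiply_plain does consume some noise budget, but typically much less than a ciphertext-ciphertext multiplication.

```cpp
#include "seal/seal.h"
#include <cstddef>
#include <cstdint>
#include <vector>

using namespace seal;

// Zeroes out every slot of "encrypted" that is not listed in "keep", so the
// unwanted slots contribute nothing to a later sum. Assumes a BFV context
// with batching enabled.
void apply_slot_mask(Evaluator &evaluator, BatchEncoder &batch_encoder,
                     Ciphertext &encrypted, const std::vector<std::size_t> &keep)
{
    std::vector<std::uint64_t> mask(batch_encoder.slot_count(), 0ULL);
    for (std::size_t idx : keep)
    {
        mask[idx] = 1; // 1 = keep this slot, 0 = drop it
    }
    Plaintext mask_plain;
    batch_encoder.encode(mask, mask_plain);
    evaluator.multiply_plain_inplace(encrypted, mask_plain);
}
```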

CleanHit commented 3 years ago

After playing a bit with the current library, I didn't see any option to calculate the sum or product of all matrix elements using the Evaluator. Is this supported in SEAL for BFV or CKKS, or do I have to decrypt the result matrix and then do the statistical calculation? I'm asking because I could achieve this on ciphertexts using SEAL v2.3.1, at least for calculating the mean, the variance, and parts of the standard deviation (except for the nth root at the end, which had to be done after decryption).

In my scenario I need to be able to operate on Ciphertexts created from one single number. Does SEAL still offer encryption of single values into ciphertexts using BFV or CKKS, or do I always have to create a matrix holding a single value using the BatchEncoder?

WeiDaiWD commented 3 years ago

Is this supported in SEAL for BFV or CKKS, or do I have to decrypt the result matrix and then do the statistical calculation?

You can rotate the vector/matrix to sum up all elements to the first slot. The old API in SEAL v2.3.1 was removed. You have to implement it by yourself.

In my scenario I need to be able to operate on Ciphertexts created from one single number.

There are two ways:

  1. with BatchEncoder: either populate the whole matrix with the same value, or set all but one slot to zero;
  2. without BatchEncoder: set Plaintext to a constant polynomial equal to the value.
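A sketch of both options (illustrative; encrypt_single_value is a hypothetical helper, and a BFV context with batching enabled plus existing Encryptor/BatchEncoder objects are assumed):

```cpp
#include "seal/seal.h"
#include <cstdint>
#include <sstream>
#include <vector>

using namespace seal;

void encrypt_single_value(Encryptor &encryptor, BatchEncoder &batch_encoder,
                          std::uint64_t value, Ciphertext &with_batching,
                          Ciphertext &without_batching)
{
    // Option 1: BatchEncoder -- put the value in slot 0 and zero elsewhere
    // (alternatively, fill every slot with the same value).
    std::vector<std::uint64_t> slots(batch_encoder.slot_count(), 0ULL);
    slots[0] = value;
    Plaintext batched_plain;
    batch_encoder.encode(slots, batched_plain);
    encryptor.encrypt(batched_plain, with_batching);

    // Option 2: no BatchEncoder -- a constant polynomial whose constant
    // coefficient is the value (Plaintext's hex-string constructor).
    // The value must be smaller than plain_modulus.
    std::stringstream hex_value;
    hex_value << std::hex << value;
    Plaintext constant_plain(hex_value.str());
    encryptor.encrypt(constant_plain, without_batching);
}
```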
CleanHit commented 3 years ago

Thank you for your suggestions.

You can rotate the vector/matrix to sum up all elements to the first slot. The old API in SEAL v2.3.1 was removed. You have to implement it by yourself.

Rotation is not an issue, but I don't think I can specify a slot number for addition when working on Ciphertexts, e.g. evaluator.add_inplace(encrypted_sum_matrix[0], encrypted_matrix[0]), then rotate encrypted_matrix2 row- and column-wise and repeat the evaluator.add_inplace() as often as needed. I think I always have to decrypt and decode the matrices to be able to do something like encrypted_sum_matrix[0] + encrypted_matrix[0].

What I tried was this:
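The snippet originally posted here did not survive in this copy of the thread. Purely as an illustration (hypothetical names, not the original code), a naive linear rotate-and-add over all slots, which roughly matches the rotation and addition counts described further down, could look like this:

```cpp
#include "seal/seal.h"
#include <cstddef>

using namespace seal;

// O(n) summation of all slots: rotate each row one step at a time and
// accumulate, then fold the two rows together with one column rotation.
// For row_size = 4096 this costs about 4096 rotations and 4096 additions.
Ciphertext sum_all_slots_naive(Evaluator &evaluator, const GaloisKeys &galois_keys,
                               const Ciphertext &encrypted, std::size_t row_size)
{
    Ciphertext result = encrypted;
    Ciphertext shifted = encrypted;
    for (std::size_t i = 1; i < row_size; i++)
    {
        // Rotate the running copy one more step to the left and accumulate.
        evaluator.rotate_rows_inplace(shifted, 1, galois_keys);
        evaluator.add_inplace(result, shifted);
    }
    // Each slot now holds the sum of its own row; add the swapped rows.
    Ciphertext columns_swapped;
    evaluator.rotate_columns(result, galois_keys, columns_swapped);
    evaluator.add_inplace(result, columns_swapped);
    return result;
}
```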

This approach doesn't seem to be efficient. It took me almost 20 minutes to sum up a matrix in its encrypted form on my PC (Ryzen 4800HS and 40 GB RAM) with these encryption parameters:

```
/
| Encryption parameters :
|   scheme: BFV
|   poly_modulus_degree: 8192
|   coeff_modulus size: 218 (43 + 43 + 44 + 44 + 44) bits
|   plain_modulus: 1032193
\
```

Is 20 minutes considered too slow? Is there a better way to do it on an encrypted matrix?

There are two ways:

1. with `BatchEncoder`: either populate the whole matrix with the same value, or set all but one slot to zero;

2. without `BatchEncoder`: set `Plaintext` to a constant polynomial equal to the value.

Yes, I've already gone with option 1 by setting all but one slot to zero.

WeiDaiWD commented 3 years ago

I think I always have to decrypt and decode the matrices to be able to do something like encrypted_sum_matrix[0] + encrypted_matrix[0].

I don't understand what you were trying to say here.

20 min is too slow.

CleanHit commented 3 years ago

I think I always have to decrypt and decode the matrices to be able to do something like encrypted_sum_matrix[0] + encrypted_matrix[0].

I don't understand what you were trying to say here.

Thank you for your response, but I've changed my approach, so none of this is relevant anymore.

20 min is too slow.

I'll try to explain my use case. I have an encrypted batch of slots (let's say 8192 slots) containing, e.g., confidential salary information, and I have a mapping from each person to his salary in a specific encrypted slot of the batch. Now let's say I want to calculate the sum of the salaries of a subset {x, ..., n}, but I want to be able to:

  1. Perform calculations on a subset of the people's salaries
  2. Perform them on all the salaries

Case 1: since I always know the mapping, I came up with a way to define a subset and calculate its sum using only row rotations. The time required depends on the size of my subset, which determines the number of rotations and additions needed for the sum. Calculating the sum for half of the batched entries takes ~10 minutes in most cases, depending on the mapping subset used. I might be able to improve my approach for creating a subset.

Case 2: if I calculate it for all the salaries, then for a batch of 8192 slots I need to perform 4095 rotations and 4096 additions, and this takes ~20 minutes.

Do you think there is a way to achieve better performance in terms of calculation time for case 2, without using multithreading?

Pro7ech commented 3 years ago

Following what you described in your last post, for the first case you can first mask the values you don't want (with a binary vector) and then do an inner sum over the vector with log(slots) rotations. For the second it's the same as the first case, but without that binary vector.

CleanHit commented 3 years ago

@Pro7ech if I understood you correctly, that is what I was doing. I created a binary filter matrix to filter out the unneeded values and then did the sum calculation through rotations. I then realized that instead of filtering the unneeded values out of the matrix, I can create a rotation plan and use it for the calculation. Nevertheless, the execution times are still as mentioned above, so I'm asking if there is anything in the library that I can use to reduce them.

WeiDaiWD commented 3 years ago

Based on the parameters provided (poly_modulus_degree 8192), ciphertext multiplication and rotation both take roughly 11 ms on my PC, which is not much more powerful than yours. 20 minutes would then translate to roughly 100,000 rotations. In fact, to sum up all slots in a ciphertext you need 13 rotations in total, which takes less than 150 ms. Please first check whether you have compiled SEAL in debug mode, then count how many rotations and other ciphertext operations are actually performed.

CleanHit commented 3 years ago

@WeiDaiWD thank you for your response, I was indeed compiling in debug mode :man_facepalming:. After changing it, the calculation is much, much faster: ~19 seconds in my case for a full matrix. Could you please elaborate on how you manage to sum up all the slots in 13 rotations? I might have misunderstood how to use the BatchEncoder for that purpose.

WeiDaiWD commented 3 years ago

There is an explanation (using a vector instead of a matrix) in this issue. You only need to rotate by power-of-2 slots, bringing this summing algorithm from O(n) to O(log n) complexity.

CleanHit commented 3 years ago

@WeiDaiWD thank you for the link. If I got it right, auto slot_count = batch_encoder.slot_count() returns the value set in std::size_t poly_modulus_degree, and its possible values are 1024, 2048, 4096, 8192, 16384, 32768, ... Then, if I want to compute the sum of all the values, I need to do log_2(slot_count) rotations for a 1-by-(slot_count) vector and log_2(slot_count / 2) rotations for a 2-by-(slot_count / 2) matrix. When using the latter, the result ends up spread over paired slots, e.g. slot 0 and slot (slot_count / 2), slot 1 and slot (slot_count / 2) + 1, etc. Using the latter, the sum calculation of the matrix takes ~66-68 ms.

WeiDaiWD commented 3 years ago

For a 2-by-(slot_count / 2) matrix, you can do log_2(slot_count / 2) row rotations and 1 column rotation, which saves one rotation compared to your second method. Anyway, 66-68 ms seems reasonable.
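A sketch of this method (illustrative only; sum_all_slots is a hypothetical name): rotate the rows by powers of two and accumulate, then fold the two rows with a single column rotation. For poly_modulus_degree 8192 the row size is 4096, so this is 12 row rotations plus 1 column rotation, i.e. the 13 rotations mentioned earlier, and every slot ends up holding the total sum.

```cpp
#include "seal/seal.h"
#include <cstddef>

using namespace seal;

// O(log n) summation of all slots of a BFV batched ciphertext.
// row_size is poly_modulus_degree / 2 and must be a power of two.
Ciphertext sum_all_slots(Evaluator &evaluator, const GaloisKeys &galois_keys,
                         const Ciphertext &encrypted, std::size_t row_size)
{
    Ciphertext result = encrypted;
    Ciphertext rotated;
    // Row rotations by 1, 2, 4, ..., row_size / 2.
    for (std::size_t step = 1; step < row_size; step <<= 1)
    {
        evaluator.rotate_rows(result, static_cast<int>(step), galois_keys, rotated);
        evaluator.add_inplace(result, rotated);
    }
    // One column rotation folds the two row sums together.
    evaluator.rotate_columns(result, galois_keys, rotated);
    evaluator.add_inplace(result, rotated);
    return result;
}
```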

CleanHit commented 3 years ago

@WeiDaiWD is 60 the maximum possible value for plain_modulus_bitsize? It is an int, but I've noticed that I get an invalid argument error ("bit_sizes is invalid") for values over 60.