Receiving different results for the same (encrypted) calculation on different runs (CKKS Scheme)

renis1235 commented 3 years ago

Performing several calculations with the same values results in different outcomes each time

Hello. We are implementing a vector * matrix multiplication using the Linear Transformation algorithm. I will try to avoid the details of this algorithm and how it works (if you want to take a look click here). For testing purposes we are using a matrix filled with double values and all of them are 1s. We are multiplying this with a vector which is also filled only with ones. These are the parameters that we are using:

poly_modulus_degree = 8192 coeff_modulus = CoeffModulus::Create(poly_modulus_degree, { 60, 40, 40, 60 }) scale = pow(2.0, 40);

We are using the C++ version of the Library, and Visual Studio as the IDE.

This is the code snippet that performs the multiplication:

inline Ciphertext Linear_Transform_Cipher(Ciphertext ct, vector<Ciphertext> U_diagonals, GaloisKeys gal_keys, EncryptionParameters params, bool rescale)
{
    // ct is the vector, U_diagonals are the diagonals of the matrix, the other ones are self explanatory
    cout << "    + Invoked Linear_Transform_Cipher C_vec . C_mat:" << endl;
    auto context = SEALContext::SEALContext(params);
    Evaluator evaluator(context);

    Ciphertext ct_rot;

    evaluator.rotate_vector(ct, -U_diagonals.size(), gal_keys, ct_rot);    

    Ciphertext ct_new;
    evaluator.add(ct, ct_rot, ct_new);

    cout << "    + Scale add: " << log2(ct_new.scale()) << " bits" << endl;

    vector<Ciphertext> ct_result(U_diagonals.size());

    evaluator.multiply(ct_new, U_diagonals[0], ct_result[0]);     

    for (int l = 1; l < U_diagonals.size(); l++)
    {
        Ciphertext temp_rot;
        evaluator.rotate_vector(ct_new, l, gal_keys, temp_rot);
        evaluator.multiply(temp_rot, U_diagonals[l], ct_result[l]);    
    } 

    Ciphertext ct_prime;
    evaluator.add_many(ct_result, ct_prime);
    if (rescale)
    {
        cout << "    + Scale Linear_Transform_Cipher addmany: " << log2(ct_prime.scale()) << " bits" << endl;        
        evaluator.rescale_to_next_inplace(ct_prime);
        cout << "    + after rescale Linear_Transform_Cipher addmany: " << log2(ct_prime.scale()) << " bits" << endl;
    }    

    return ct_prime;
}

So far so good, but running this snippet above with exactly the same data, produces different results. The way we have tested this is by running the algorithm with plain data and in the "traditional way" (thus getting the real and expected results) and substracting the expected result with the decrypted & decoded result of the encrypted calc.

Below you may see the plots that show how different the results are (with different matrix & vector dimensions) for different runs. (with rescaling and without).

MVMultiplication_OnesMatrix_WithoutRescaling

MVMultiplication_OnesMatrix_WithRescaling

Another thing to be noted is that rescaling actually makes things worse. It may be that we are using the library wrong, or it may be that there is a random piece of code in the CKKS scheme.

Also another thing, when we try to do several calculations in a row, of the type:

encrypted_result_1 = matrix vector encrypted_result_2 = matrix encrypted_result_1 encrypted_result_3 = matrix * encrypted_result_2

And so on... But doing this, we either receive wrong results through rescaling...or we get an Exception thrown at us... What would be the best way to solve this? Without decrypting, re-encrypting and then doing the calculations...

Any help will be greatly appreciated. The code that we are using can be found in our repo.

WeiDaiWD commented 3 years ago

I see that you did perform "relinearization" after ciphertext-ciphertext multiplication. Could you please try inserting 'evaluator.relinearize_inplace(ct_prime, relin_keys);' after evaluator.add_many(ct_result, ct_prime);? It requires you to add a new argument to the function, RelinKeys relin_keys. You can get it in a similar way to GaloisKeys gal_keys.

If your test takes the same input messages / plaintext but perform fresh encryption of them each time, it is true that multiple runs with the same input will have different error values, since the error comes from randomness introduced in encryption. Still, error should be similar in scale. When you skipped relinearization, it forces the decryptor to decrypt ciphertexts in noisier way, which may be the cause of spikes in your graphs.

Please share with me your findings or modified code if you are not certain. I'm happy to help you analyze this. It's interesting work!

najmabegum commented 3 years ago

Hi,

I and Renis (the one who opened the issue), work in the same team. I made the change as you suggested and added evaluator.relinearize_inplace(ct_prime, relin_keys);' after evaluator.add_many(ct_result, ct_prime); I repeated the runs for a matrix of dimension 100 * 100 (all entries are 1), and vector entries are 1 too. The results for the maximum error norm are as follows:

Run 1- 3.83521e-05 Run 2- 7.14077e-05 Run 3- 2.94935e-05 Run 4- 1.73372e-05 Run 5- 2.31203e-05

WeiDaiWD commented 3 years ago

I don't think 2~3-bit variance in error is abnormal. But I do notice that after evaluator.multiply(ct_new, U_diagonals[0], ct_result[0]); there is no relinearization or rescale. Relinearization should definitely be inserted here. Sorry that I missed this at the first look. This is the reason why

encrypted_result_1 = matrix * vector
encrypted_result_2 = matrix * encrypted_result_1
encrypted_result_3 = matrix * encrypted_result_2

fails with wrong results.

What are the scales used to encode/encrypt each ciphertext? Is the scale stabilized at roughly 2^40 after each "ciphertext-ciphertext multiplication + reliearization + rescale" layer?

If the scale is not stabilized, you can also get error "scale out of bound" when computing `encrypted_result_2/_3", or get wrong results. Inserting a rescale here could also have helped to make the error smaller. Note that you might have to add another 40-bit prime to the encryption parameters to insert an additional rescale.

renis1235 commented 3 years ago

First of all, thank you for taking the time to check our work and thank you for valuing it. We are also thankful for having SEAL provided to us for research purposes. @WeiDaiWD

These are the error terms after adding the relinearization after evaluator.multiply(ct_new, U_diagonals[0], ct_result[0]); evaluator.relinearize_inplace(ct_result[0], relinKeys);

dimension 100*100:

Run 1 3.29064e-05
Run 2 1.08903e-05
Run 3 5.86179e-05
Run 4 0.00010019
Run 5 2.72579e-05

dimension 700*700:

Run 1 0.000393565
Run 2 0.00033149
Run 3 0.000135515
Run 4 9.07438e-05
Run 5 0.000378668

The scale used is as mentioned above:

scale = pow(2.0, 40);

What do you mean by scale stabilizing? The Exception that we get is a Memory Exception and it is thrown on the line: evaluator.rotate_vector(ct, -U_diagonals.size(), gal_keys, ct_rot);. Which is logical since it is the first line of the function which does an operation... And you are absolutely correct, after the second iteration, the results are way off... Could you please be more detailed on where we should rescale to make the error smaller and would adding just one 40-bit prime to the parameters be enough?

WeiDaiWD commented 3 years ago

When you multiply two ciphertexts or a ciphertext with a plaintext, SEAL also multiply their scales. For example, ct_a (with scale 2^40) * ct_b (with scale 2^40) will yield ct_c (with scale 2^80). Then if you perform rescale on ct_c, assuming that a 40-bit prime will be consumed at this level, ct_c_new will have scale close to 2^40. You would want to keep all intermediate ciphertexts have roughly the same scale with the final ciphertext result for decryption, which is stabilizing.

I added some comments in your code. This should stabilize the scale and should ensure that ct_prime has 40-bit scale eventually. If you missed one of the two rescales, it will result in different scales in ct_result[i] and causes a run-time error thrown by evaluator.add_many.

inline Ciphertext Linear_Transform_Cipher(Ciphertext ct, vector<Ciphertext> U_diagonals, GaloisKeys gal_keys, EncryptionParameters params, bool rescale)
{
    /* all ciphertexts have scale 2^40 */
    // ct is the vector, U_diagonals are the diagonals of the matrix, the other ones are self explanatory
    cout << "    + Invoked Linear_Transform_Cipher C_vec . C_mat:" << endl;
    auto context = SEALContext::SEALContext(params);
    Evaluator evaluator(context);
    Ciphertext ct_rot;
    evaluator.rotate_vector(ct, -U_diagonals.size(), gal_keys, ct_rot);    
    Ciphertext ct_new;
    evaluator.add(ct, ct_rot, ct_new);
    cout << "    + Scale add: " << log2(ct_new.scale()) << " bits" << endl;
    /* this prints 40, right> */

    vector<Ciphertext> ct_result(U_diagonals.size());
    evaluator.multiply(ct_new, U_diagonals[0], ct_result[0]);
    ////// insert relin here
    /* ct_result[0] has scale 80 */
    ////// insert rescale here
    /* ct_result[0] now has scale 40 */

    for (int l = 1; l < U_diagonals.size(); l++)
    {
        Ciphertext temp_rot;
        evaluator.rotate_vector(ct_new, l, gal_keys, temp_rot);
        evaluator.multiply(temp_rot, U_diagonals[l], ct_result[l]);    
    }
    Ciphertext ct_prime;
    evaluator.add_many(ct_result, ct_prime);
    ////// insert relin here
    if (rescale)
    {
        cout << "    + Scale Linear_Transform_Cipher addmany: " << log2(ct_prime.scale()) << " bits" << endl;
        /* this prints 80, right? */
        evaluator.rescale_to_next_inplace(ct_prime);
        cout << "    + after rescale Linear_Transform_Cipher addmany: " << log2(ct_prime.scale()) << " bits" << endl;
        /* this prints 40, right? */
    }    

    return ct_prime;
}

The number of primes that you need is:

at least a big one for decryption, in your case a 60-bit prime;
the same number of primes as the number of rescales to be performed: if you have two multiplications / rescales in the function and you call the function once, 2 primes; if you call the function twice, 4 primes; and so on;
a prime at the end that is as large as the largest prime, in your case a 60-bit prime,

The function Linear_Transform_Cipher only has one layer of multiplication, a.k.a. one rescale, so encrypted_result_1 = matrix * vector consumes one prime, the latter 40-bit prime in your parameters. Then encrypted_result_2 = matrix * encrypted_result_1 consumes another prime, the former 40-bit prime in your parameters. Now encrypted_result_3 = matrix * encrypted_result_2 has to consume the only prime left (the starting 60-bit prime) to rescale, that will cause an error. So inserting a 40-bit prime can help here. Still, I don't see why an memory exception is thrown in rotation.

I think the size and variance of error you see is normal. The fact that they grow bigger as dimension is larger is probably because of evaluator.add_many where error from many ciphertexts are accumulated. Even if error in each ciphertext can be seen as independently sampled from a zero-centered distribution, summing many of them will cause a larger variance in the final distribution. And these ciphertexts might not be independent.

Rescale also does not help to reduce the error. It is used to stabilize scales along the computation. It does not alter the distance between error and decimal point in messages. With your parameters, and if you only performs encrypted_result_1 = matrix * vector, you are wasting one prime. Assume that you have all relinearization correctly inserted, without rescaling you should end up with 80-bit scale in encrypted_result_1 to be decrypted with {60, 40, 40} primes, with rescaling you should end up with 40-bit scale in encrypted_result_1 to be decrypted with {60, 40} primes, both of which should have similarly sized error. However, the latter has smaller ciphertexts and faster decryption.

najmabegum commented 3 years ago

Hi, Thank you so much for your time and such a detailed explanation. It has cleared most of our confusions and doubts regarding the primes, and its use in rescaling. Also, we are now much clear on the concepts of relinearization. Thank you.

So, I implemented as per your suggestions, and changed the algorithm as follows:

Changes to parameters: size_t poly_modulus_degree = 16384; params.set_coeff_modulus(CoeffModulus::Create(poly_modulus_degree, { 60, 40, 40, 40, 40, 40, 40, 40, 60 })); double scale = pow(2.0, 40);

Changes to algorithm:

` inline Ciphertext Linear_Transform_Cipher_Sequential_2(Ciphertext ct, vector U_diagonals, GaloisKeys gal_keys,

EncryptionParameters params, bool rescale, SecretKey sk, RelinKeys relin_keys, vector

plainDiagonal, PublicKey pk, </p> <p>vector<vector<double>> all_diagonal, double scale)</p> <p>{</p> <pre><code>cout << " + Invoked Linear_Transform_Cipher C_vec . C_mat:" << endl; auto context = SEALContext::SEALContext(params); Evaluator evaluator(context); Decryptor decryptor(context, sk); Encryptor encryptor(context, pk); double scaleSet = pow(2.0, 80); // Create CKKS encoder CKKSEncoder ckks_encoder(context); int size = U_diagonals.size(); vector<Ciphertext> cipher_result_prime(size); for (int i = 0;i < size;i++) { Ciphertext ct_rot; Ciphertext ct_new; if (i == 0) { evaluator.rotate_vector(ct, -size, gal_keys, ct_rot); evaluator.add(ct, ct_rot, ct_new); } else { evaluator.rotate_vector(cipher_result_prime[i - 1], -size, gal_keys, ct_rot); evaluator.add(cipher_result_prime[i - 1], ct_rot, ct_new); } cout << " + Scale on add - input vector: " << log2(ct_new.scale()) << " bits" << endl; cout << " + Scale on add - input matrix: " << log2(U_diagonals[0].scale()) << " bits" << endl; vector<Ciphertext> ct_result(size); evaluator.multiply(ct_new, U_diagonals[0], ct_result[0]); // Added relin and rescale cout << " + Scale on multiply - input vector - before relin (1st element): " << log2(ct_result[0].scale()) << " bits" << endl; evaluator.relinearize_inplace(ct_result[0], relin_keys); cout << " + Scale on multiply - input vector - after relin (1st element): " << log2(ct_result[0].scale()) << " bits" << endl; cout << " + Scale on multiply - input vector - before rescale (1st element): " << log2(ct_result[0].scale()) << " bits" << endl; evaluator.rescale_to_next_inplace(ct_result[0]); cout << " + Scale on multiply - input vector - after rescale (1st element): " << log2(ct_result[0].scale()) << " bits" << endl; cout << " + Scale on multiply - input matrix: " << log2(U_diagonals[0].scale()) << " bits" << endl; for (int l = 1; l < size; l++) { Ciphertext temp_rot; evaluator.rotate_vector(ct_new, l, gal_keys, temp_rot); evaluator.multiply(temp_rot, U_diagonals[l], ct_result[l]); } cout << " + Scale on multiply - input matrix - after and relin rescale (all elements): " << log2(ct_result[1].scale()) << " bits" << endl; evaluator.add_many(ct_result, cipher_result_prime[i]); cout << " + Scale on add_many - input vector - before relin: " << log2(cipher_result_prime[i].scale()) << " bits" << endl; // Added relin evaluator.relinearize_inplace(cipher_result_prime[i], relin_keys); cout << " + Scale on add_many - input vector - after relin: " << log2(cipher_result_prime[i].scale()) << " bits" << endl; // Added rescale if (rescale) { cout << " + Scale on add_many - input vector - before rescale: " << log2(cipher_result_prime[i].scale()) << " bits" << endl; evaluator.rescale_to_next_inplace(cipher_result_prime[i]); cout << " + Scale on add_many - input vector - after rescale: " << log2(cipher_result_prime[i].scale()) << " bits" << endl; } Plaintext pt_result_current; decryptor.decrypt(cipher_result_prime[i], pt_result_current); // Decode vector<double> output_result_plain; ckks_encoder.decode(pt_result_current, output_result_plain); cout << "Linear Transformation Set 2 Result:" << endl; print_partial_vector(output_result_plain, size); } return cipher_result_prime[size - 1];</code></pre> <p>}`</p> <p>When I run this above algorithm, I get an exception on the first iteration itself when the call of evaluator.add_many(ct_result, cipher_result_prime[i]); is executed Exception - Unhandled exception at 0x00007FFB4B804B89 in SealMVDemo.exe: Microsoft C++ exception: std::invalid_argument at memory location 0x0000006C7297D5A0. Below is the image for your reference.</p> <p><img src="https://user-images.githubusercontent.com/26997246/121740552-86adb600-cafd-11eb-88e4-c7c020ce3345.PNG" alt="Result1" /></p> <p>So I realised, the ct_result has different values for size and coeff_modulus_size in all its elements. So I added another loop before the evaluator.add_many(ct_result, cipher_result_prime[i]); to perform relin and rescale on ct_result <code> for (int j = 1; j < size; j++) { // Added relin and rescale evaluator.relinearize_inplace(ct_result[j], relin_keys); evaluator.rescale_to_next_inplace(ct_result[j]); } </code></p> <p>On making this change and running the algorithm, first round of multiplication (i.e. first outer iteration) runs without exception, but doesnt give expected results. Also during the second iteration of multiplication, there is an exception thrown at evaluator.multiply(ct_new, U_diagonals[0], ct_result[0]);<br /> Unhandled exception at 0x00007FFB4B804B89 in SealMVDemo.exe: Microsoft C++ exception: std::invalid_argument at memory location 0x0000002B28F3D660. On verifying, I realised the size and coeff_modulus_size of ct_new and U_diagonals is not same. Below are images for your reference. <img src="https://user-images.githubusercontent.com/26997246/121740657-b361cd80-cafd-11eb-866c-a7ef14fd9637.PNG" alt="Result2" /> <img src="https://user-images.githubusercontent.com/26997246/121740659-b3fa6400-cafd-11eb-8b18-5a25e086ea11.PNG" alt="Result3" /> <img src="https://user-images.githubusercontent.com/26997246/121740790-e73cf300-cafd-11eb-9137-503d82899db2.PNG" alt="Result4" /></p> <p>However, for testing, I removed the rescale calls, and kept only the relinearize_inplace call after the add_many operation (Below is the algorithm for reference). With this I was able to run upto 7 iterations of below form encrypted_result_1 = matrix <em> vector encrypted_result_2 = matrix </em> encrypted_result_1 encrypted_result_3 = matrix <em> encrypted_result_2 encrypted_result_4 = matrix </em> encrypted_result_3 encrypted_result_5 = matrix <em> encrypted_result_4 encrypted_result_6 = matrix </em> encrypted_result_5 encrypted_result_7 = matrix * encrypted_result_6</p> <p>` inline Ciphertext Linear_Transform_Cipher_Sequential(Ciphertext ct, vector<Ciphertext> U_diagonals, GaloisKeys gal_keys, EncryptionParameters params, bool rescale, SecretKey sk, RelinKeys relin_keys, vector<Plaintext> plainDiagonal, PublicKey pk, vector<vector<double>> all_diagonal, double scale)</p> <p>{</p> <pre><code>cout << " + Invoked Linear_Transform_Cipher C_vec . C_mat:" << endl; auto context = SEALContext::SEALContext(params); Evaluator evaluator(context); Decryptor decryptor(context, sk); Encryptor encryptor(context, pk); double scaleSet = pow(2.0, 80); // Create CKKS encoder CKKSEncoder ckks_encoder(context); int size = U_diagonals.size(); vector<Ciphertext> cipher_result_prime(size); for (int i = 0;i < size;i++) { Ciphertext ct_rot; Ciphertext ct_new; if (i == 0) { evaluator.rotate_vector(ct, -size, gal_keys, ct_rot); evaluator.add(ct, ct_rot, ct_new); } else { evaluator.rotate_vector(cipher_result_prime[i - 1], -size, gal_keys, ct_rot); evaluator.add(cipher_result_prime[i - 1], ct_rot, ct_new); } cout << " + Scale on add - input vector: " << log2(ct_new.scale()) << " bits" << endl; cout << " + Scale on add - input matrix: " << log2(U_diagonals[0].scale()) << " bits" << endl; vector<Ciphertext> ct_result(size); evaluator.multiply(ct_new, U_diagonals[0], ct_result[0]); cout << " + Scale on multiply - input vector : " << log2(ct_result[0].scale()) << " bits" << endl; cout << " + Scale on multiply - input matrix: " << log2(U_diagonals[0].scale()) << " bits" << endl; for (int l = 1; l < size; l++) { Ciphertext temp_rot; evaluator.rotate_vector(ct_new, l, gal_keys, temp_rot); evaluator.multiply(temp_rot, U_diagonals[l], ct_result[l]); } evaluator.add_many(ct_result, cipher_result_prime[i]); cout << " + Scale on add_many - input vector - before relin: " << log2(cipher_result_prime[i].scale()) << " bits" << endl; // Added relin evaluator.relinearize_inplace(cipher_result_prime[i], relin_keys); cout << " + Scale on add_many - input vector - after relin: " << log2(cipher_result_prime[i].scale()) << " bits" << endl; Plaintext pt_result_current; decryptor.decrypt(cipher_result_prime[i], pt_result_current); // Decode vector<double> output_result_plain; ckks_encoder.decode(pt_result_current, output_result_plain); cout << "Linear Transformation Set 2 Result:" << endl; print_partial_vector(output_result_plain, size); } return cipher_result_prime[size - 1];</code></pre> <p>} `</p> <p>Output: MAX BIT COUNT: 438 Coeff Modulus Back Value: 1152921504606748673 Dimension Set 2: 10</p> <pre><code> [1.000, 1.000, 1.000, ..., 1.000, 1.000, 1.000] [1.000, 1.000, 1.000, ..., 1.000, 1.000, 1.000] [1.000, 1.000, 1.000, ..., 1.000, 1.000, 1.000] ... [1.000, 1.000, 1.000, ..., 1.000, 1.000, 1.000] [1.000, 1.000, 1.000, ..., 1.000, 1.000, 1.000] [1.000, 1.000, 1.000, ..., 1.000, 1.000, 1.000] [1.000, 1.000, 1.000, ..., 1.000, 1.000, 1.000] [1.000, 1.000, 1.000, ..., 1.000, 1.000, 1.000] [1.000, 1.000, 1.000, ..., 1.000, 1.000, 1.000] ... [1.000, 1.000, 1.000, ..., 1.000, 1.000, 1.000] [1.000, 1.000, 1.000, ..., 1.000, 1.000, 1.000] [1.000, 1.000, 1.000, ..., 1.000, 1.000, 1.000]</code></pre> <p>Diagonal 1 Set 2 Expected: [1.000, 1.000, 1.000, ..., 1.000, 1.000, 1.000] [1.000, 1.000, 1.000, ..., 1.000, 1.000, 1.000] [1.000, 1.000, 1.000, ..., 1.000, 1.000, 1.000] ... [1.000, 1.000, 1.000, ..., 1.000, 1.000, 1.000] [1.000, 1.000, 1.000, ..., 1.000, 1.000, 1.000] [1.000, 1.000, 1.000, ..., 1.000, 1.000, 1.000]</p> <p>Diagonal 2 Set 2 Expected: [1.000, 1.000, 1.000, ..., 1.000, 1.000, 1.000] [1.000, 1.000, 1.000, ..., 1.000, 1.000, 1.000] [1.000, 1.000, 1.000, ..., 1.000, 1.000, 1.000] ... [1.000, 1.000, 1.000, ..., 1.000, 1.000, 1.000] [1.000, 1.000, 1.000, ..., 1.000, 1.000, 1.000] [1.000, 1.000, 1.000, ..., 1.000, 1.000, 1.000]</p> <p>Encoding Set 2 is Complete Encrypting Set 2 is Complete</p> <ul> <li> <p>Invoked Linear_Transform_Cipher C_vec . C_mat:</p> </li> <li> <p>Scale on add - input vector: 40 bits</p> </li> <li> <p>Scale on add - input matrix: 40 bits</p> </li> <li> <p>Scale on multiply - input vector : 80 bits</p> </li> <li> <p>Scale on multiply - input matrix: 40 bits</p> </li> <li> <p>Scale on add_many - input vector - before relin: 80 bits</p> </li> <li> <p>Scale on add_many - input vector - after relin: 80 bits Linear Transformation Set 2 Result: [10.000, 10.000, 10.000, ..., 10.000, 10.000, 10.000]</p> </li> <li> <p>Scale on add - input vector: 80 bits</p> </li> <li> <p>Scale on add - input matrix: 40 bits</p> </li> <li> <p>Scale on multiply - input vector : 120 bits</p> </li> <li> <p>Scale on multiply - input matrix: 40 bits</p> </li> <li> <p>Scale on add_many - input vector - before relin: 120 bits</p> </li> <li> <p>Scale on add_many - input vector - after relin: 120 bits Linear Transformation Set 2 Result: [100.000, 100.000, 100.000, ..., 100.000, 100.000, 100.000]</p> </li> <li> <p>Scale on add - input vector: 120 bits</p> </li> <li> <p>Scale on add - input matrix: 40 bits</p> </li> <li> <p>Scale on multiply - input vector : 160 bits</p> </li> <li> <p>Scale on multiply - input matrix: 40 bits</p> </li> <li> <p>Scale on add_many - input vector - before relin: 160 bits</p> </li> <li> <p>Scale on add_many - input vector - after relin: 160 bits Linear Transformation Set 2 Result: [1000.000, 1000.000, 1000.000, ..., 1000.000, 1000.000, 1000.000]</p> </li> <li> <p>Scale on add - input vector: 160 bits</p> </li> <li> <p>Scale on add - input matrix: 40 bits</p> </li> <li> <p>Scale on multiply - input vector : 200 bits</p> </li> <li> <p>Scale on multiply - input matrix: 40 bits</p> </li> <li> <p>Scale on add_many - input vector - before relin: 200 bits</p> </li> <li> <p>Scale on add_many - input vector - after relin: 200 bits Linear Transformation Set 2 Result: [10000.002, 10000.002, 10000.002, ..., 10000.002, 10000.002, 10000.002]</p> </li> <li> <p>Scale on add - input vector: 200 bits</p> </li> <li> <p>Scale on add - input matrix: 40 bits</p> </li> <li> <p>Scale on multiply - input vector : 240 bits</p> </li> <li> <p>Scale on multiply - input matrix: 40 bits</p> </li> <li> <p>Scale on add_many - input vector - before relin: 240 bits</p> </li> <li> <p>Scale on add_many - input vector - after relin: 240 bits Linear Transformation Set 2 Result: [100000.017, 100000.017, 100000.017, ..., 100000.017, 100000.017, 100000.017]</p> </li> <li> <p>Scale on add - input vector: 240 bits</p> </li> <li> <p>Scale on add - input matrix: 40 bits</p> </li> <li> <p>Scale on multiply - input vector : 280 bits</p> </li> <li> <p>Scale on multiply - input matrix: 40 bits</p> </li> <li> <p>Scale on add_many - input vector - before relin: 280 bits</p> </li> <li> <p>Scale on add_many - input vector - after relin: 280 bits Linear Transformation Set 2 Result: [1000000.170, 1000000.170, 1000000.170, ..., 1000000.169, 1000000.169, 1000000.170]</p> </li> <li> <p>Scale on add - input vector: 280 bits</p> </li> <li> <p>Scale on add - input matrix: 40 bits</p> </li> <li> <p>Scale on multiply - input vector : 320 bits</p> </li> <li> <p>Scale on multiply - input matrix: 40 bits</p> </li> <li> <p>Scale on add_many - input vector - before relin: 320 bits</p> </li> <li> <p>Scale on add_many - input vector - after relin: 320 bits Linear Transformation Set 2 Result: [10000001.704, 10000001.700, 10000001.698, ..., 10000001.692, 10000001.688, 10000001.698]</p> </li> </ul> <p>Here I notice that noise adds up in results from the 4th iteration onwards.</p> <p>So after this analysis, our doubts are: </p> <ol> <li>Rescaling reduces noise - If we apply rescaling, the code seems to break at some part, due to a mismatch in scales between the initial rotated input vector (ct_new) and elements of the input matrix. Also, when we tried applying rescaling, the results were way off the expected values How do we reduce that?</li> <li>Our requirement is to run such sequential multiplications on huge matrices, say dimension > 200*200. What is the maximum value for poly_modulus_degree that we could use so that the number of primes in the vector can be accordingly chosen as per the number of multiplications we need? </li> </ol> <p>Again, thank you for your effort and helping us out.</p> </div> </div> <div class="comment"> <div class="user"> <a rel="noreferrer nofollow" target="_blank" href="https://github.com/WeiDaiWD"><img src="https://avatars.githubusercontent.com/u/5634305?v=4" />WeiDaiWD</a> commented <strong> 3 years ago</strong> </div> <div class="markdown-body"> <blockquote> <p>When I run this above algorithm, I get an exception on the first iteration itself when the call of evaluator.add_many(ct_result, cipher_result_prime[i]); is executed Exception - Unhandled exception at 0x00007FFB4B804B89 in SealMVDemo.exe: Microsoft C++ exception: std::invalid_argument at memory location 0x0000006C7297D5A0.</p> </blockquote> <p>This is because that <code>ct_result[0]</code> has size 3 after multiplication and is relinearized to size 2. While <code>ct_result[i]</code> for <code>i>0</code> have sizes 3 when being added, their sizes do match with <code>ct_result[0]</code>. I'm sorry that I didn't see this earlier. So if you remove the relinearization on <code>ct_result[0]</code>, it should work fine.</p> <blockquote> <p>Also during the second iteration of multiplication, there is an exception thrown at evaluator.multiply(ct_new, U_diagonals[0], ct_result[0]); Unhandled exception at 0x00007FFB4B804B89 in SealMVDemo.exe: Microsoft C++ exception: std::invalid_argument at memory location 0x0000002B28F3D660. On verifying, I realised the size and coeff_modulus_size of ct_new and U_diagonals is not same.</p> </blockquote> <p>You are right. The numbers of primes used to represent ciphertexts that are multiplied or added must match. Each time a rescale is performed on a ciphertext, the number of primes used to represent this ciphertext is reduced by 1. You need to perform <a href="https://github.com/microsoft/SEAL/blob/97e4b8ddbc1105fe8338f8c18ff3cafcba686266/native/src/seal/evaluator.h#L376"><code>mod_switch_to_next_inplace</code></a> to reduce the number of primes used to represent <code>U_diagonals[i]</code> without touching its scale, rather than removing other rescaling operations. This is a pain, truly.</p> <blockquote> <p>If we apply rescaling, the code seems to break at some part, due to a mismatch in scales between the initial rotated input vector (ct_new) and elements of the input matrix.</p> </blockquote> <p>Even if you have correctly performed the right numbers of rescaling on ciphertext operands, they might still have mismatch in scales. For example, ct_0 and ct_1 has scales 2^40, then ct_01 = rescale(multiply(ct_0, ct_1)) will have scale (2^80 / p) where p is the 40-bit prime chosen in your parameter. Now that you have ct_2 with scale 2^40 encrypted at the same level (using the same number of primes) with ct_0 and ct_1, how do you multiply ct_2 with ct_01? Following the above instruction, you should perform <code>mod_switch</code> on ct_2. Now their levels match at least. What's the scale of ct_2 now? It's 2^40, unequal to (2^80 / p). You can perform ct_01 * ct_2 since multiplications do not require input scales to match, but you cannot perform ct_01 + ct_2 since additions do require matching scales in inputs. The solution is simply <code>ct_2.scale() = ct_01.scale()</code>. You force them to be equal.</p> <blockquote> <p>Also, when we tried applying rescaling, the results were way off the expected values How do we reduce that?</p> </blockquote> <p>I am not sure when you tried with rescaling, do all ciphertexts have size 2 at the time of decryption? Because larger-sized ciphertexts leads to much more noise/error. In your test, where data are all 1.0000..., the error is so small. Then rescaling basically brings scale (the decimal point) closer to error, making the error to appear larger. In other scenarios, where data are 0.765, 1.367, 3.462, ..., and the computation has more levels, the error is large. Then rescaling does not affect the gap between error and scale (the decimal point), i.e., does not enlarge or reduce error. The most significant benefit of rescaling is that you only need one prime to represent a ciphertext with scale 2^40 but need at least two primes to represent a ciphertext with scale 2^80 and so on. The size of ciphertexts returned for decryption is smaller.</p> <blockquote> <p>Our requirement is to run such sequential multiplications on huge matrices, say dimension > 200*200. What is the maximum value for poly_modulus_degree that we could use so that the number of primes in the vector can be accordingly chosen as per the number of multiplications we need?</p> </blockquote> <p>You've already known that it is related to security. The HE standard only provides suggested parameters up to 32768, <a href="https://github.com/microsoft/SEAL/blob/97e4b8ddbc1105fe8338f8c18ff3cafcba686266/native/src/seal/util/globals.cpp#L64">here in SEAL for 128-bit security</a>. So the product of primes in your vector should have bitsize less than that bound. If you have to use more primes which exceeds the security bound for degree 32768, you can safely double the bound when you double the degree which is usually still 128-bit secure. For example, for degree 65536, 2*881=1762-bit is 128-bit secure. The limit in SEAL is set to 131072 at <a href="https://github.com/microsoft/SEAL/blob/97e4b8ddbc1105fe8338f8c18ff3cafcba686266/native/src/seal/util/defines.h#L52">here</a>. You may change that limit if you have to. But it is not recommended since the performance will be very bad and memory consumption / ciphertext / key sizes are massive.</p> </div> </div> <div class="comment"> <div class="user"> <a rel="noreferrer nofollow" target="_blank" href="https://github.com/najmabegum"><img src="https://avatars.githubusercontent.com/u/26997246?v=4" />najmabegum</a> commented <strong> 3 years ago</strong> </div> <div class="markdown-body"> <p>Hi, Thank you again for such a detailed answer. We have made good progress both in our work and understanding with your help.</p> <p>I was initially trying to balance the size of input matrix elements by rescaling instead of using mod_switch_to_next_inplace. However, I made the changes as you suggested, and added evaluator.mod_switch_to_next_inplace(U_diagonals[r]); whenever there is a mismatch between the numbers of primes in the input matrix and the input vector. This worked!</p> <p>I could achieve the same 7 iterations again:</p> <p>encrypted_result_1 = matrix <em> vector encrypted_result_2 = matrix </em> encrypted_result_1 encrypted_result_3 = matrix <em> encrypted_result_2 encrypted_result_4 = matrix </em> encrypted_result_3 encrypted_result_5 = matrix <em> encrypted_result_4 encrypted_result_6 = matrix </em> encrypted_result_5 encrypted_result_7 = matrix * encrypted_result_6</p> <p>I tested with an input matrix with only 1 as elements. It worked well, there was some noise added from the 4th iteration onwards. So I tried testing with another input matrix:</p> <p>1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 0</p> <p>Input vector: 1 2 3 4 5 6 7</p> <p>Output is as follows:</p> <pre><code>Encoding Set 2 is Complete Encrypting Set 2 is Complete + Invoked Linear_Transform_Cipher C_vec . C_mat: + Scale on add - input vector: 40 bits + Scale on add - input matrix: 40 bits + Scale on multiply - input vector : 80 bits + Scale on multiply - input matrix: 40 bits + Scale on multiply - input matrix - after and relin rescale (all elements): 80 bits + Scale on add_many - input vector - before relin: 80 bits + Scale on add_many - input vector - after relin: 80 bits + Scale on add_many - input vector - before rescale: 80 bits + Scale on add_many - input vector - after rescale: 40 bits Linear Transformation Set 2 Result: Actual 140, 336, 532, 608, 420, 224, 35, Expected [140, 336, 532, 608, 420, 224, 35, ] + Scale on add - input vector: 40 bits + Scale on add - input matrix: 40 bits Mod switch to next inplace on InputMatrix : 0 + Scale on multiply - input vector : 80 bits + Scale on multiply - input matrix: 40 bits + Scale on multiply - input matrix - after and relin rescale (all elements): 80 bits + Scale on add_many - input vector - before relin: 80 bits + Scale on add_many - input vector - after relin: 80 bits + Scale on add_many - input vector - before rescale: 80 bits + Scale on add_many - input vector - after rescale: 40 bits Linear Transformation Set 2 Result: Actual 8529, 24594, 40659, 52204, 37371, 21306, 5276, Expected [8529, 24594, 40659, 52204, 37371, 21306, 5276, ] + Scale on add - input vector: 40 bits + Scale on add - input matrix: 40 bits Mod switch to next inplace on InputMatrix : 0 + Scale on multiply - input vector : 80 bits + Scale on multiply - input matrix: 40 bits + Scale on multiply - input matrix - after and relin rescale (all elements): 80 bits + Scale on add_many - input vector - before relin: 80 bits + Scale on add_many - input vector - after relin: 80 bits + Scale on add_many - input vector - before rescale: 80 bits + Scale on add_many - input vector - after rescale: 40 bits Linear Transformation Set 2 Result: Actual 740133, 2.06971e+06, 3.39928e+06, 4.30492e+06, 3.05865e+06, 1.72907e+06, 404777, Expected [740133, 2.06971e+06, 3.39928e+06, 4.30492e+06, 3.05865e+06, 1.72907e+06, 404777, ] + Scale on add - input vector: 40 bits + Scale on add - input matrix: 40 bits Mod switch to next inplace on InputMatrix : 0 + Scale on multiply - input vector : 80 bits + Scale on multiply - input matrix: 40 bits + Scale on multiply - input matrix - after and relin rescale (all elements): 80 bits + Scale on add_many - input vector - before relin: 80 bits + Scale on add_many - input vector - after relin: 80 bits + Scale on add_many - input vector - before rescale: 80 bits + Scale on add_many - input vector - after rescale: 40 bits Linear Transformation Set 2 Result: Actual 6.07982e+07, 1.70744e+08, 2.8069e+08, 3.56178e+08, 2.53332e+08, 1.43387e+08, 3.38458e+07, Expected [6.07982e+07, 1.70744e+08, 2.8069e+08, 3.56178e+08, 2.53332e+08, 1.43387e+08, 3.38458e+07, ] + Scale on add - input vector: 40 bits + Scale on add - input matrix: 40 bits Mod switch to next inplace on InputMatrix : 0 + Scale on multiply - input vector : 80 bits + Scale on multiply - input matrix: 40 bits + Scale on multiply - input matrix - after and relin rescale (all elements): 80 bits + Scale on add_many - input vector - before relin: 80 bits + Scale on add_many - input vector - after relin: 80 bits + Scale on add_many - input vector - before rescale: 80 bits + Scale on add_many - input vector - after rescale: 40 bits Linear Transformation Set 2 Result: Actual 5.03297e+09, 1.41258e+10, 2.32186e+10, 2.94547e+10, 2.09465e+10, 1.18537e+10, 2.79472e+09, Expected [5.03297e+09, 1.41258e+10, 2.32186e+10, 2.94547e+10, 2.09465e+10, 1.18537e+10, 2.79472e+09, ] + Scale on add - input vector: 40 bits + Scale on add - input matrix: 40 bits Mod switch to next inplace on InputMatrix : 0 + Scale on multiply - input vector : 80 bits + Scale on multiply - input matrix: 40 bits + Scale on multiply - input matrix - after and relin rescale (all elements): 80 bits + Scale on add_many - input vector - before relin: 80 bits + Scale on add_many - input vector - after relin: 80 bits + Scale on add_many - input vector - before rescale: 80 bits + Scale on add_many - input vector - after rescale: 40 bits Linear Transformation Set 2 Result: Actual 4.16177e+11, 1.16817e+12, 1.92016e+12, 2.43597e+12, 1.73236e+12, 9.80374e+11, 2.3118e+11, Expected [4.16177e+11, 1.16817e+12, 1.92016e+12, 2.43597e+12, 1.73236e+12, 9.80374e+11, 2.3118e+11, ] + Scale on add - input vector: 40 bits + Scale on add - input matrix: 40 bits Mod switch to next inplace on InputMatrix : 0 + Scale on multiply - input vector : 80 bits + Scale on multiply - input matrix: 40 bits + Scale on multiply - input matrix - after and relin rescale (all elements): 80 bits + Scale on add_many - input vector - before relin: 80 bits + Scale on add_many - input vector - after relin: 80 bits + Scale on add_many - input vector - before rescale: 80 bits + Scale on add_many - input vector - after rescale: 40 bits Linear Transformation Set 2 Result: Actual [-4.60615e+06, 1.63849e+07, 1.0539e+08, -1.53406e+07, -6.06025e+07, -5.23904e+07, -4.66414e+07] Expected [3.44192e+13, 9.66099e+13, 1.58801e+14, 2.01458e+14, 1.43269e+14, 8.10778e+13, 1.91183e+13, ]</code></pre> <p>If you notice, all iterations give the result same as expected result except the final iteration, which is the important one for us, as we hope to directly work with the final value The below algorithm was used.</p> <pre><code>inline Ciphertext Linear_Transform_Cipher_Sequential_2(Ciphertext ct, vector<Ciphertext> U_diagonals, GaloisKeys gal_keys, EncryptionParameters params, bool rescale, SecretKey sk, RelinKeys relin_keys, vector<Plaintext> plainDiagonal, PublicKey pk, vector<vector<double>> all_diagonal, double scale) { cout << " + Invoked Linear_Transform_Cipher C_vec . C_mat:" << endl; auto context = SEALContext::SEALContext(params); Evaluator evaluator(context); Decryptor decryptor(context, sk); Encryptor encryptor(context, pk); double scaleSet = pow(2.0, 80); // Create CKKS encoder CKKSEncoder ckks_encoder(context); int size = U_diagonals.size(); vector<Ciphertext> cipher_result_prime(size); for (int i = 0;i < size;i++) { Ciphertext ct_rot; Ciphertext ct_new; if (i == 0) { evaluator.rotate_vector(ct, -size, gal_keys, ct_rot); evaluator.add(ct, ct_rot, ct_new); } else { evaluator.rotate_vector(cipher_result_prime[i - 1], -size, gal_keys, ct_rot); evaluator.add(cipher_result_prime[i - 1], ct_rot, ct_new); } cout << " + Scale on add - input vector: " << log2(ct_new.scale()) << " bits" << endl; cout << " + Scale on add - input matrix: " << log2(U_diagonals[0].scale()) << " bits" << endl; //Added mod_switch_to_next_inplace int ctNewCoeff = ct_new.coeff_modulus_size(); int inputMatrixCoeff = U_diagonals[0].coeff_modulus_size(); int differenceInCoeff = inputMatrixCoeff - ctNewCoeff; for (int p = 0; p < differenceInCoeff;p++) { cout << " Mod switch to next inplace on InputMatrix : " << p << endl; for (int r = 0; r < size; r++) { evaluator.mod_switch_to_next_inplace(U_diagonals[r]); } } vector<Ciphertext> ct_result(size); evaluator.multiply(ct_new, U_diagonals[0], ct_result[0]); cout << " + Scale on multiply - input vector : " << log2(ct_result[0].scale()) << " bits" << endl; cout << " + Scale on multiply - input matrix: " << log2(U_diagonals[0].scale()) << " bits" << endl; for (int l = 1; l < size; l++) { Ciphertext temp_rot; evaluator.rotate_vector(ct_new, l, gal_keys, temp_rot); evaluator.multiply(temp_rot, U_diagonals[l], ct_result[l]); } cout << " + Scale on multiply - input matrix - after and relin rescale (all elements): " << log2(ct_result[1].scale()) << " bits" << endl; evaluator.add_many(ct_result, cipher_result_prime[i]); cout << " + Scale on add_many - input vector - before relin: " << log2(cipher_result_prime[i].scale()) << " bits" << endl; evaluator.relinearize_inplace(cipher_result_prime[i], relin_keys); cout << " + Scale on add_many - input vector - after relin: " << log2(cipher_result_prime[i].scale()) << " bits" << endl; if (rescale) { cout << " + Scale on add_many - input vector - before rescale: " << log2(cipher_result_prime[i].scale()) << " bits" << endl; evaluator.rescale_to_next_inplace(cipher_result_prime[i]); cout << " + Scale on add_many - input vector - after rescale: " << log2(cipher_result_prime[i].scale()) << " bits" << endl; } Plaintext pt_result_current; decryptor.decrypt(cipher_result_prime[i], pt_result_current); // Decode vector<double> output_result_plain; ckks_encoder.decode(pt_result_current, output_result_plain); cout << "Linear Transformation Set 2 Result:" << endl; /*print_partial_vector(output_result_plain, size); */ for (unsigned int row = 0; row < size; row++) { cout << output_result_plain[row] << ", "; } } return cipher_result_prime[size - 1]; }</code></pre> <p>During the final iteration, the values in ct_new (input vector) and the input matrix are shown in the images below.</p> <p><img src="https://user-images.githubusercontent.com/26997246/121791526-f9488f80-cbea-11eb-8de5-863f7c3ab302.PNG" alt="Res1" /> <img src="https://user-images.githubusercontent.com/26997246/121791527-f9e12600-cbea-11eb-989c-bb59b660744b.PNG" alt="Res2" /></p> <p>I'm not able to figure out why the final iteration gives such unexpected values.</p> </div> </div> <div class="comment"> <div class="user"> <a rel="noreferrer nofollow" target="_blank" href="https://github.com/WeiDaiWD"><img src="https://avatars.githubusercontent.com/u/5634305?v=4" />WeiDaiWD</a> commented <strong> 3 years ago</strong> </div> <div class="markdown-body"> <p>This is the logic that I usually follow when dealing with incorrect results. First, the previous iteration has really accurate result, so it is unlikely to have a very large error in the last iteration to destroy the messages. Then I look at the expected values. They grow larger along with iterations. So it might be that the integral value before decimal point causes an overflow. When you decrypt with a 60-bit prime and 40-bit scale, you can only preserve 60-40=20-bit of precision before decimal point. The expected values in the last iteration are <code>e+13</code> and might have caused an overflow. You may wonder how come <code>1.16817e+12</code> is correct from the second last iteration. Your real messages are transformed into new real values during CKKS encoding. So the 20-bit preserved before decimal point is in the encoded values which is different from what you see after decoding. You are probably lucky in the second last iteration, just below the threshold. If my guess is right, <strong>insert another prime after the first 60-bit prime in your parameters</strong>. It does not have to be as large as 40- or 60-bit. Maybe 20-bit is enough to give correct result.</p> </div> </div> <div class="comment"> <div class="user"> <a rel="noreferrer nofollow" target="_blank" href="https://github.com/renis1235"><img src="https://avatars.githubusercontent.com/u/11201041?v=4" />renis1235</a> commented <strong> 3 years ago</strong> </div> <div class="markdown-body"> <p>@WeiDaiWD through your instructions and tips we were able to reach up to 15 iterations of the above mentioned operations with a 15 x 15 dense Matrix. It worked (a bit slow) but it worked. Your comments and will to help has helped us a lot, so thank you so very much! We could also share a couple of plots before closing the issue if you would like to take a look at them...</p> <p>Once again, <strong>thank you!</strong></p> </div> </div> <div class="comment"> <div class="user"> <a rel="noreferrer nofollow" target="_blank" href="https://github.com/WeiDaiWD"><img src="https://avatars.githubusercontent.com/u/5634305?v=4" />WeiDaiWD</a> commented <strong> 3 years ago</strong> </div> <div class="markdown-body"> <p>Great. I'd appreciate that.</p> <p>There are two more things that I want to mention in case they can help.</p> <ol> <li>You can pack multiple 15x15 matrices or 15x1 vectors into a single ciphertext so that the overhead can be amortized.</li> <li>We have released <a href="https://github.com/microsoft/EVA">EVA Compiler</a> to automate selecting parameters, matching scales, and inserting rescaling and relinearization. If you are not on a close deadline and would like to find out how to implement similar applications faster in future, it's worthy of trying.</li> </ol> </div> </div> <div class="comment"> <div class="user"> <a rel="noreferrer nofollow" target="_blank" href="https://github.com/renis1235"><img src="https://avatars.githubusercontent.com/u/11201041?v=4" />renis1235</a> commented <strong> 3 years ago</strong> </div> <div class="markdown-body"> <p>Hello @WeiDaiWD </p> <p>Here the above mentioned plots:</p> <p><img src="https://user-images.githubusercontent.com/11201041/122460659-b142b780-cfb2-11eb-86c2-b3eba94f570b.png" alt="1606_MV_MSE_Dense" /> <img src="https://user-images.githubusercontent.com/11201041/122460661-b1db4e00-cfb2-11eb-8907-9a746c7e96f0.png" alt="1606_MVErrorNorms_Dense_WithRescaling" /> <img src="https://user-images.githubusercontent.com/11201041/122460662-b273e480-cfb2-11eb-98b3-b9a4c6eebb5a.png" alt="1606_MV_MaxError_Dense" /> <img src="https://user-images.githubusercontent.com/11201041/122460663-b273e480-cfb2-11eb-9d96-cf910d118d86.png" alt="1606_MV_MaxNormalisedError_Dense" /></p> <ol> <li>The packing of multiple matrices in one Ciphertext seems interesting, we will research a bit more about the idea and how it can help our project further</li> <li>We have read about EVA, but decided that it would be time consuming to mingle with it and SEAL at the same time. Our use case is pretty specific (Matrix x Vector) does EVA cover that too?</li> </ol> <p>We have another proposal for you :) We are Master students of the Frankfurt University of Applied Sciences, we (me and Najma) have been researching this for about 1.5 months and we are planning to write an Article/Scientific Paper (together with the leading Professor) on the findings that we researched. You were a big help the last week, thus, comes the question, do you want to be part of it? If yes, we can further discuss everything per E-Mail.</p> <p>Thank you.</p> </div> </div> <div class="comment"> <div class="user"> <a rel="noreferrer nofollow" target="_blank" href="https://github.com/WeiDaiWD"><img src="https://avatars.githubusercontent.com/u/5634305?v=4" />WeiDaiWD</a> commented <strong> 3 years ago</strong> </div> <div class="markdown-body"> <p>I recommend you to subscribe to EVA's future releases. Soon we will have vector and matrix support in EVA/python, which will be much easier to learn as what it is now.</p> <p>I don't think I have time to help. If you have more questions, feel free to post them here on GitHub. Your questions are valuable questions that can help other developers too.</p> </div> </div> <div class="comment"> <div class="user"> <a rel="noreferrer nofollow" target="_blank" href="https://github.com/renis1235"><img src="https://avatars.githubusercontent.com/u/11201041?v=4" />renis1235</a> commented <strong> 3 years ago</strong> </div> <div class="markdown-body"> <p>Hello,</p> <p>Thank you.</p> <p>We have the following problem now. When doing several operations in a row as discussed above, with a matrix which is dense and with a dimension > 300 we cannot perform enough iterations. What I mean by that, is: depending on the dimension of the matrix, the bigger it is the less iterations we can do. After a specific number of iterations (7 for a dense matrix of dimension 300) an error is thrown and the program crashes. This is the error that we get (thrown in the <code>evaluator.h</code> file):</p> <p><img src="https://user-images.githubusercontent.com/11201041/122965252-49acb380-d388-11eb-83ee-70be7e4e7055.png" alt="image" /> <img src="https://user-images.githubusercontent.com/11201041/122965383-6ba63600-d388-11eb-8fc9-578d2be73d33.png" alt="image" /></p> <p>These are the parameters: </p> <pre><code> /*Set Seal context*/ size_t poly_modulus_degree = 32768; EncryptionParameters params(scheme_type::ckks); params.set_poly_modulus_degree(poly_modulus_degree); cout << "MAX BIT COUNT: " << CoeffModulus::MaxBitCount(poly_modulus_degree) << endl; params.set_coeff_modulus(CoeffModulus::Create(poly_modulus_degree, { 60,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,60 })); auto context = SEALContext::SEALContext(params);</code></pre> <p>This is the scale:</p> <pre><code>double scale = pow(2.0, 40);</code></pre> <p>And this is the function that is doing the iterations and the multiplications:</p> <pre><code>inline Ciphertext Linear_Transform_Cipher_Sequential_Dense(Ciphertext ct, vector<Ciphertext> U_diagonals, GaloisKeys gal_keys, EncryptionParameters params, bool rescale, SecretKey sk, RelinKeys relin_keys, int iteration) { cout << " + Invoked Linear_Transform_Cipher C_vec . C_mat:" << endl; auto context = SEALContext::SEALContext(params); Evaluator evaluator(context); Decryptor decryptor(context, sk); CKKSEncoder ckks_encoder(context); PROCESS_MEMORY_COUNTERS_EX pmc; GetProcessMemoryInfo(GetCurrentProcess(), (PROCESS_MEMORY_COUNTERS*)&pmc, sizeof(pmc)); int size = U_diagonals.size(); vector<Ciphertext> cipher_result_prime(size); for (int i = 0;i < iteration;i++) { cout << "\nIteration: " << i << endl; /*Initialization*/ auto start = chrono::high_resolution_clock::now(); Ciphertext ct_rot; Ciphertext ct_new; /*Rotate and Add*/ if (i == 0) { evaluator.rotate_vector(ct, -size, gal_keys, ct_rot); evaluator.add(ct, ct_rot, ct_new); } else { evaluator.rotate_vector(cipher_result_prime[i - 1], -size, gal_keys, ct_rot); evaluator.add(cipher_result_prime[i - 1], ct_rot, ct_new); } cout << "\n + Scale on add - input vector: " << log2(ct_new.scale()) << " bits" << endl; cout << "\n + Scale on add - input matrix: " << log2(U_diagonals[0].scale()) << " bits" << endl; /*Mod inplace on input matrix*/ int ctNewCoeff = ct_new.coeff_modulus_size(); int inputMatrixCoeff = U_diagonals[0].coeff_modulus_size(); int differenceInCoeff = inputMatrixCoeff - ctNewCoeff; for (int p = 0; p < differenceInCoeff;p++) { cout << " Mod switch to next inplace on InputMatrix : " << p << endl; for (int r = 0; r < size; r++) { evaluator.mod_switch_to_next_inplace(U_diagonals[r]); } } /*Multiply with all rows*/ vector<Ciphertext> ct_result(size); evaluator.multiply(ct_new, U_diagonals[0], ct_result[0]); cout << "\n + Scale on multiply - input vector : " << log2(ct_result[0].scale()) << " bits" << endl; cout << "\n + Scale on multiply - input matrix: " << log2(U_diagonals[0].scale()) << " bits" << endl; for (int l = 1; l < size; l++) { Ciphertext temp_rot; evaluator.rotate_vector(ct_new, l, gal_keys, temp_rot); evaluator.multiply(temp_rot, U_diagonals[l], ct_result[l]); } /*Add many, relin, rescale*/ evaluator.add_many(ct_result, cipher_result_prime[i]); cout << "\n + Scale on add_many - input vector - before relin: " << log2(cipher_result_prime[i].scale()) << " bits" << endl; evaluator.relinearize_inplace(cipher_result_prime[i], relin_keys); cout << "\n + Scale on add_many - input vector - after relin: " << log2(cipher_result_prime[i].scale()) << " bits" << endl; if (rescale) { cout << "\n + Scale on add_many - input vector - before rescale: " << log2(cipher_result_prime[i].scale()) << " bits" << endl; evaluator.rescale_to_next_inplace(cipher_result_prime[i]); cout << "\n + Scale on add_many - input vector - after rescale: " << log2(cipher_result_prime[i].scale()) << " bits" << endl; } auto stop = chrono::high_resolution_clock::now(); /*Print time*/ auto duration = chrono::duration_cast<chrono::microseconds>(stop - start); cout << "\nTime to compute Iteration: " << i << duration.count() << " microseconds" << endl; cout << "\nVirtual Memory to compute Iteration: " << i + 1 << pmc.PrivateUsage << endl; cout << "\nPhysical Memory (RAM) to compute Iteration: " << i + 1 << pmc.WorkingSetSize << endl; /*Test with decrypt - to be removed*/ Plaintext pt_result_current; decryptor.decrypt(cipher_result_prime[i], pt_result_current); vector<double> output_result_plain; ckks_encoder.decode(pt_result_current, output_result_plain); cout << "\nLinear Transformation Set 2 Result:" << endl; for (unsigned int row = 0; row < size; row++) { cout << output_result_plain[row] << ", "; } } return cipher_result_prime[size - 1]; }</code></pre> <p>Please ignore the time and memory code lines, we are trying to check the memory consumption with each iteration. It is also taking quite a lot of time to perform these iterations (+20 mins in some cases).</p> <p>Thank you Looking forward to your next comment/tip :).</p> </div> </div> <div class="comment"> <div class="user"> <a rel="noreferrer nofollow" target="_blank" href="https://github.com/renis1235"><img src="https://avatars.githubusercontent.com/u/11201041?v=4" />renis1235</a> commented <strong> 3 years ago</strong> </div> <div class="markdown-body"> <p>Hello @WeiDaiWD,</p> <p>I think we found the reason why the error is thrown. A huge amount of memory is consumed by doing the operations (and even before when encrypting). And maybe there comes a time where the program cannot allocate anymore. See the below graph:</p> <p><img src="https://user-images.githubusercontent.com/11201041/123144089-96ad8a00-d45b-11eb-9122-511e9e6fe389.png" alt="Memory2_Private" /></p> <p>This is per iteration, so the memory consumption is getting higher and higher / iteration.</p> <p>We played around a bit with Visual Studio Profiler and this is what it showed just at the first iteration...</p> <p><img src="https://user-images.githubusercontent.com/11201041/123144400-ee4bf580-d45b-11eb-8ee5-f3f8649e75bd.png" alt="image" /></p> <p>Is this a normal behaviour of the Library? Or is our implementation faulty? Is it normal that the <code>create_galois_keys</code> function consumes so much memory? And it is unclear how is it called so many times. Maybe the library calls it implicitly...</p> <p>Also when we did a 15 x 15 Matrix and 15 iterations, the amount of memory consumed is very high. That must be also the reason for the huge amount of time it is requiring for each iteration (see below graph)</p> <p><img src="https://user-images.githubusercontent.com/11201041/123144883-7a5e1d00-d45c-11eb-816f-07cb1cf145d9.png" alt="Performance" /></p> </div> </div> <div class="comment"> <div class="user"> <a rel="noreferrer nofollow" target="_blank" href="https://github.com/WeiDaiWD"><img src="https://avatars.githubusercontent.com/u/5634305?v=4" />WeiDaiWD</a> commented <strong> 3 years ago</strong> </div> <div class="markdown-body"> <p>The number of primes in your parameters is <code>20</code>. For each rotation, <code>GaloisKeys</code> requires has a key size <code>(20 - 1) * 2 * 32768 * 20 * 8 B = 190 MB</code>. I guess that you created <code>GaloisKeys</code> in the following method:</p> <pre><code class="language-c++">GaloisKeys galois_keys; KeyGenerator::create_galois_keys(galois_keys);</code></pre> <p>This generates <code>GaloisKeys</code> for power-of-two rotation steps. The size of <code>GaloisKeys</code> will be <code>31 * 190 MB = 5.75 GB</code> which matches with what Visual Studio reported.</p> <p>What I'm confused is why memory requirement increased by <code>10 GB</code> in each iteration. Obviously you didn't create a new <code>GaloisKeys</code> during each iteration. Can you try to use <code>const ClassName &input</code> instead of <code>ClassName input</code> in your function?</p> </div> </div> <div class="comment"> <div class="user"> <a rel="noreferrer nofollow" target="_blank" href="https://github.com/renis1235"><img src="https://avatars.githubusercontent.com/u/11201041?v=4" />renis1235</a> commented <strong> 3 years ago</strong> </div> <div class="markdown-body"> <p>Thank you for the information. The <code>GaloisKeys</code> is created that way, correct.</p> <p>I tried sending everything as a <code>const</code> variable plus with the <code>&</code> operator so that the same object is transmitted and not a copy, but with no success.</p> <p>Another thing that I noticed is that after finishing 7 iterations, the memory allocated stays the same, I know that C++ has no GC, but doesn't SEAL take care of the overhead and dropping of the before used and not anymore needed data? Maybe we could clean the memory after each iteration so that it doesn't get consumed more and more.</p> <p>In the <code>C#</code> version of the library, I have seen a GC managment piece of code, but in C++ still not.</p> <p>Thank you.</p> </div> </div> <div class="comment"> <div class="user"> <a rel="noreferrer nofollow" target="_blank" href="https://github.com/WeiDaiWD"><img src="https://avatars.githubusercontent.com/u/5634305?v=4" />WeiDaiWD</a> commented <strong> 3 years ago</strong> </div> <div class="markdown-body"> <blockquote> <p>I tried sending everything as a const variable plus with the & operator so that the same object is transmitted and not a copy, but with no success.</p> </blockquote> <p>This is weird. From what I see all parameters are input only. It won't matter actually, since the increase of memory consumption happens in loops.</p> <p>SEAL manages a memory pool and all data types rely on <code>seal::Pointer</code> which is allocated/deallocated via memory pool. Together they work like a garbage collector: when a variable goes out of scope, the memory that it points to is marked free in memory pool, but not released. New allocation requests to memory pool will try to reuse free chunks in memory pool first. And when memory pool is out of memory, it allocates more than what it needs from system as a reservoir.</p> <p>I still don't see why you have that issue. Would you try to see at which line memory use increases in the iterations?</p> </div> </div> <div class="comment"> <div class="user"> <a rel="noreferrer nofollow" target="_blank" href="https://github.com/renis1235"><img src="https://avatars.githubusercontent.com/u/11201041?v=4" />renis1235</a> commented <strong> 3 years ago</strong> </div> <div class="markdown-body"> <p>So from our observations, we got the following data:</p> <p>1) 1.5 GB is being allocated from the program after creating the parameters and the keys 2) ~3.5 GB is being allocated from the program after creating the parameters, the keys, and encoding & encryption of the matrix + the vector. 3) After the iterations start, then it starts going crazy...</p> <p>We have to check which line is the "culprit".</p> </div> </div> <div class="comment"> <div class="user"> <a rel="noreferrer nofollow" target="_blank" href="https://github.com/renis1235"><img src="https://avatars.githubusercontent.com/u/11201041?v=4" />renis1235</a> commented <strong> 3 years ago</strong> </div> <div class="markdown-body"> <p>This loop consumes the most memory each time it is run:</p> <pre><code> for (int l = 1; l < size; l++) { Ciphertext temp_rot; evaluator.rotate_vector(ct_new, l, gal_keys, temp_rot); evaluator.multiply(temp_rot, U_diagonals[l], ct_result[l]); }</code></pre> <p>After removing unnecessary code , encoding and decryption we got the following memory consumption:</p> <ul> <li>After creation of the parameters and keys -> 670 MB</li> <li>After encoding & Encryption -> 1.7 GB</li> <li>After the first Iteration -> ~ 3 GB</li> </ul> </div> </div> <div class="comment"> <div class="user"> <a rel="noreferrer nofollow" target="_blank" href="https://github.com/najmabegum"><img src="https://avatars.githubusercontent.com/u/26997246?v=4" />najmabegum</a> commented <strong> 3 years ago</strong> </div> <div class="markdown-body"> <p>Hi, @WeiDaiWD,</p> <p>Just an addition to the data posted above by Renis. The memory shown in the graphs above is the working set size. We are recording it for every iteration to see its behavior.</p> </div> </div> <div class="comment"> <div class="user"> <a rel="noreferrer nofollow" target="_blank" href="https://github.com/WeiDaiWD"><img src="https://avatars.githubusercontent.com/u/5634305?v=4" />WeiDaiWD</a> commented <strong> 3 years ago</strong> </div> <div class="markdown-body"> <p>Sorry for the late response, I've been busy lately. Shall we move the discussion to the other issue?</p> </div> </div> <div class="comment"> <div class="user"> <a rel="noreferrer nofollow" target="_blank" href="https://github.com/najmabegum"><img src="https://avatars.githubusercontent.com/u/26997246?v=4" />najmabegum</a> commented <strong> 3 years ago</strong> </div> <div class="markdown-body"> <p>@WeiDaiWD Thank you for your response. We created a new issue for the memory consumption discussion. You can find it here <a href="https://github.com/microsoft/SEAL/issues/365">High Memory consumption when using SEAL & CKKS Scheme</a></p> </div> </div> <div class="page-bar-simple"> </div> <div class="footer"> <ul class="body"> <li>© <script> document.write(new Date().getFullYear()) </script> Githubissues.</li> <li>Githubissues is a development platform for aggregating issues.</li> </ul> </div> <script src="https://cdn.jsdelivr.net/npm/jquery@3.5.1/dist/jquery.min.js"></script> <script src="/githubissues/assets/js.js"></script> <script src="/githubissues/assets/markdown.js"></script> <script src="https://cdn.jsdelivr.net/gh/highlightjs/cdn-release@11.4.0/build/highlight.min.js"></script> <script src="https://cdn.jsdelivr.net/gh/highlightjs/cdn-release@11.4.0/build/languages/go.min.js"></script> <script> hljs.highlightAll(); </script> </body> </html>

microsoft / SEAL

Receiving different results for the same (encrypted) calculation on different runs (CKKS Scheme) #351

Performing several calculations with the same values results in different outcomes each time