Closed DylanWangWQF closed 3 years ago
Because CKKS generate approximate results, right?
That has nothing to do with the reason why matrix multiplication in CKKS is faster than that in BFV. The reason is CKKS ciphertext multiplication algorithm is faster than BFV.
But can we optimize the performance of rotation?
Rotations in CKKS and BFV have the same performance.
If both matrix, vector are ciphertext, then CKKS is faster because the multiplication of CKKS is faster than BFV. This is algorithmic and we can do little with it.
But if one of operand is plaintext. Thing can be different.
Basically, SEAL-BFV keeps the plaintext (polynomial) in its coefficient form while the SEAL-CKKS uses the ntt form.
As a result, the plaintext-ciphertext multiplication for BFV would be slower than CKKS (because BFV's plaintext needed to be converted to its NTT form which is costly). On the other hand, CKKS just shift this work load to the encoding phase. If you put the "encoding" phase and evaluation together, I think CKKS/BFV would give the same performance.
This is an implementation decision.
@WeiDaiWD @fionser Thank you for your reply! And sorry, I just test these operations by employing BGV in HElib and HEAAN until now. Have not tested BFV. But the comparison of performance should be similar to SEAL? From the test results, it seems that the performance would be quite different if we use the larger parameters.
These are the test code and results: HEAAN test:
//Params: logN = 13; logp = 35; logq = 55.
Context context(/*logN=*/13,/*logQ =*/55);
SecretKey secretKey(/*logN=*/13, /*h=*/64);
Scheme scheme(secretKey, context);
scheme.addLeftRotKeys(secretKey);
scheme.addRightRotKeys(secretKey);
complex<double>* cmsg1 = new complex<double>[256];
complex<double>* cmsg2 = new complex<double>[256];
for (int i = 0; i < 256; i++) {
double dtemp;
conv(dtemp, (i % 2));
cmsg1[i].real(dtemp);
cmsg2[i].real(dtemp);
}
// Encryption
Ciphertext ctxt1;
Ciphertext ctxt2;
auto start= chrono::steady_clock::now();
ctxt1 = scheme.encrypt(cmsg1, /*slots=*/256, /*logp=*/35, /*logq=*/55);
ctxt2 = scheme.encrypt(cmsg1, /*slots=*/256, /*logp=*/35, /*logq=*/55);
auto end = std::chrono::steady_clock::now();
auto diff = end - start;
double timeElapsed = chrono::duration <double, milli> (diff).count()/1000.0;
cout << "Encryption time= " << timeElapsed/2 << " s" << endl;
// Rotation
Ciphertext temp;
start = chrono::steady_clock::now();
temp = scheme.rightRotate(ctxt1, 7 );
end = std::chrono::steady_clock::now();
diff = end - start;
timeElapsed = chrono::duration <double, milli> (diff).count()/1000.0;
cout << "Rotation time= " << timeElapsed << " s" << endl;
// Multiplication
start = chrono::steady_clock::now();
scheme.multAndEqual(ctxt1, ctxt2);
scheme.reScaleByAndEqual(ctxt1, logp);
end = std::chrono::steady_clock::now();
diff = end - start;
timeElapsed = chrono::duration <double, milli> (diff).count()/1000.0;
cout << "Multiplication time= " << timeElapsed << " s" << endl;
BGV test:
// m: Cyclotomic polynomial - defines phi(m);
// p: Plaintext prime modulus
// r: Hensel lifting (default = 1)
// bits: Number of bits of the ciphertext modulus chain (Level)
// c: Number of columns of Key-Switching matrix (default = 2 or 3)
Params param(/*m=*/20480, /*p=*/127, /*r=*/1, /*bits=*/200, /*c=*/2); //slots = 256;
/*ea.size()=slots=256*/
vector<long> cmsg1(ea.size());
vector<long> cmsg2(ea.size());
for (int i = 0; i < ea.size(); i++) {
cmsg1[i] = i % 2;
cmsg2[i] = i % 2;
}
// Encryption
auto start= chrono::steady_clock::now();
Ctxt ctxt1(publicKey);
ea.encrypt(ctxt1, publicKey, cmsg1);
Ctxt ctxt2(publicKey);
ea.encrypt(ctxt2, publicKey, cmsg2);
auto end = std::chrono::steady_clock::now();
auto diff = end - start;
double timeElapsed = chrono::duration <double, milli> (diff).count()/1000.0;
cout << "Encryption time= " << timeElapsed/2 << " s" << endl;
//Rotation
start= chrono::steady_clock::now();
rotate(ctxt1, 7);
end = std::chrono::steady_clock::now();
diff = end - start;
timeElapsed = chrono::duration <double, milli> (diff).count()/1000.0;
cout << "Rotation time= " << timeElapsed << " s" << endl;
//Multiplication
start= chrono::steady_clock::now();
ctxt1.multiplyBy(ctxt2);
end = std::chrono::steady_clock::now();
diff = end - start;
timeElapsed = chrono::duration <double, milli> (diff).count()/1000.0;
cout << "Rotation time= " << timeElapsed << " s" << endl;
HEAAN test results (run many times):
Encryption time= 0.0162935 s (supposed to contain the encode time )
------------------
Rotation time= 0.0448905 s
------------------
Multiplication time= 0.0220362 s
BGV test results:
Encryption time= 0.0269653 s
------------------
Rotation time= 0.374534 s
------------------
Multiplication time= 0.164159 s
------------------
I have no idea why you did not at least align the parameter in HElib (m > 20k) but HEAAN (N = 2^13). Please clarity your question (concerns) a more little bit
@fionser Sorry, I'm not familiar with the parameters and detailed implementations in HEAAN, that's a bad test.
Since I want to get the exact number of slots (such as 256, 1024 and 4096), I choose these parameters for HElib. For /*m=*/20480, /*p=*/127, /*r=*/1, /*bits=*/200, /*c=*/2
, I use the program to find the corresponding parameters m=20480
and p=127
, then I get slots=256
.
Then in HEAAN, we can directly set the exact number of slots.
If both matrix, vector are ciphertext, then CKKS is faster because the multiplication of CKKS is faster than BFV. This is algorithmic and we can do little with it.
But if one of operand is plaintext. Thing can be different.
Basically, SEAL-BFV keeps the plaintext (polynomial) in its coefficient form while the SEAL-BFV uses the ntt form.
As a result, the plaintext-ciphertext multiplication for BFV would be slower than CKKS (because BFV's plaintext needed to be converted to its NTT form which is costly). On the other hand, CKKS just shift this work load to the encoding phase. If you put the "encoding" phase and evaluation together, I think CKKS/BFV would give the same performance.
This is an implementation decision.
Hey, @fionser Is there a typo here? I think "SEAL-BFV keeps the plaintext (polynomial) in its coefficient form while the SEAL-CKKS uses the ntt form" is exactly what you mean.
When I test matrix multiplication based on CKKS, it's much faster than BFV (or BGV). Because CKKS generate approximate results, right? So if we want the exact results, CKKS is not suitable, as described in README.md. But can we optimize the performance of rotation? Since in my matrix multiplication test, a certain number of rotations are necessary.