Closed lidh15 closed 1 year ago
And of course, even if the results are accurate, intel's python implementation still has the advantage that it is born with float multiplication over heu.phe python implementation.
Hello,
Thank you for your question. Your issue with numerical accuracy is a common problem that arises when using floating-point arithmetic in homomorphic encryption. We recommend using the HEU library's big int encoder to achieve any degree of accuracy required.
To achieve the desired accuracy, we suggest using the following code:
kit = phe.setup(phe.SchemaType.IPCL)
encrypt = kit.encryptor()
decrypt = kit.decryptor()
evaluate = kit.evaluator()
scale = 10 ** 11
encode = phe.BigintEncoder(phe.SchemaType.IPCL)
x = (63 + 11) * 5111.2834
x_encoded = encode.encode(int(x * scale))
ct_x = encrypt.encrypt(x_encoded)
ct_x_2 = evaluate.mul(ct_x, int(x))
y = (32768 - 63) * 1.3872
pt_y = encode.encode(int(y * scale))
ct_z = evaluate.add(ct_x_2, pt_y)
z = encode.decode(decrypt.decrypt(ct_z)) / scale
To explain the reasoning behind using the big int encoder, HEU provides various encoders, and while the float encoder is easier to use, it has an accuracy limit. This is because most machine learning applications can afford some accuracy loss but require faster computations. Therefore, we choose to make the float encoder have an accuracy limit due to other considerations. However, we support any degree of accuracy using HEU's big int encoder.
A value of x * int(x) has 11 decimal places, and to avoid truncating the last decimal place digit, we recommend encoding with a scale of 10 11. Encoding with scale means encoding the integer x int(x) scale. Although a float encoder does the scaling process for the user, it has limited accuracy, and forcing it to encode with a scale of 10 11 may result in numerical overflow. You can try it yourself with phe.FloatEncoder(phe.SchemaType.IPCL, 10 ** 11)
.
Hence, when requiring very accurate results, the float encoder is not the best tool. Instead, we suggest using the big int encoder, but we must handle the scaling ourselves.
Thank you for the great example, and please let us know if you have any further questions.
Hello, @lidh15
I think the result of ((63 + 11) * 5111.2834) * int((63 + 11) * 5111.2834) + (32768 - 63) * 1.3872
is 143061371616.5304
, not 143061371616.53043
. The result of IPCL and Python is wrong, but HEU is right.
In python, 74 * 5111.2834
gives you the answer 378234.97160000005
. But the actual result is 378234.9716
. An error in the binary decimal representation causes the final 0000005
, eventually resulting in 143061371616.53043
. On the other hand, if you evaluate the expression on paper using the school method, the final result should be 143061371616.5304
In [1]: 74 * 5111.2834
Out[1]: 378234.97160000005
In [2]: 378234.97160000005 * 378234 + 45368.376
Out[2]: 143061371616.53043
In [3]: 378234.9716 * 378234 + 45368.376
Out[3]: 143061371616.5304
The @da-niao-dan ‘s method gives you the same result as python because he calculates x = (63 + 11) * 5111.2834
on the Python side, and then sends x to HEU to encryption, so the error has been introduced when using Python to calculate x.
In [6]: x = (63 + 11) * 5111.2834
In [7]: print(x)
378234.97160000005
In [8]: scale = 10 ** 11
In [9]: print(scale)
100000000000
In [10]: print(int(x * scale))
37823497160000008
Oh yes, you are right, I over trusted python's float precision.
As heu.phe integrated intel IPCL, I compared intel's implementation with heu.phe. The following test https://github.com/intel/pailliercryptolib_python/blob/ff77e7ee3b3af402f8fc445672974ed0a5a57a6b/bench/bench_ipcl_python.py#L60 computed a ciphertext times a plaintext and then plus another plaintext. It takes a range of numbers into calculation, and let's take the last item in the array as an example, the function is simplified:
for heu.phe, I did:
The results of
((63 + 11) * 5111.2834) * int((63 + 11) * 5111.2834) + (32768 - 63) * 1.3872
is clearly143061371616.53043, which is also returned by intel IPCL, but with heu.phe, the result is 143061371616.5304, the last digit is truncated. And similar accuracy losses are observed in lots of tests.