intel / pailliercryptolib_python

Intel Paillier Cryptosystem Library is an open-source library that provides accelerated performance for a partially homomorphic encryption (HE) scheme, the Paillier cryptosystem, by utilizing Intel® IPP-Crypto technologies on Intel CPUs supporting the AVX512IFMA instructions. The library is written in modern standard C++ and provides the essential API for the Paillier cryptosystem scheme. Intel Paillier Cryptosystem Library - Python is a Python extension package intended for Python-based privacy-preserving machine learning solutions that utilize the partial HE scheme for increased data and model protection.
Apache License 2.0

cipher computation float accuracy lost #29

Closed lidh15 closed 1 year ago

lidh15 commented 1 year ago

The native IPCL has been wrapped for Python by several libraries, and I compared this implementation with secretflow's heu. The benchmark results varied between libraries, and at first I believed this repo was the correct one. However, the heu developers showed some evidence that their results are correct and that this repo is not returning accurate results: https://github.com/secretflow/heu/issues/52

lidh15 commented 1 year ago

In a nutshell, the ciphertext computation results of this repo are exactly equal to those of raw Python, which in fact are not accurate.

fangxiaoran commented 1 year ago

@lidh15 Thanks for your finding! As you pointed out, this is an issue with Python float precision. We will see how to improve IPCL accordingly and let you know ASAP.

fangxiaoran commented 1 year ago

Hi @lidh15, one reason for this issue is the accuracy of floating-point arithmetic in Python. As you mentioned, 74 * 5111.2834 evaluates to 378234.97160000005 in Python, and IPCL cannot fix that.
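The rounding in that product can be reproduced without IPCL. The sketch below compares the binary float product with exact decimal arithmetic from the standard `decimal` module. The variable names are illustrative only, and the float output quoted in the comment is the value reported in this thread.

```python
from decimal import Decimal

# Binary floating point: 5111.2834 has no exact binary representation,
# so the product carries a small rounding error.
float_product = 74 * 5111.2834

# Decimal arithmetic on the decimal literals is exact for this product.
exact_product = Decimal("74") * Decimal("5111.2834")

print(repr(float_product))  # this thread reports 378234.97160000005
print(exact_product)        # 378234.9716
```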

However, with your test data I found that the encryption function itself does sometimes lose accuracy. We automatically scale floating-point numbers to fixed-point numbers, with a limit on the size of the scaled number, so the scale factor is sometimes not big enough. You can try:

x = 378234.9716
ct_x = pk.encrypt(x)
print(sk.decrypt(ct_x))  # 378234.97160000005

This is fixed by PR https://github.com/intel/pailliercryptolib_python/pull/32. The precision can now be set manually to choose a proper scale factor when encrypting. Try again:

x = 378234.9716
ct_x = pk.encrypt(x, precision=10**11)
print(sk.decrypt(ct_x))  # 378234.9716
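How the scale factor affects accuracy can be sketched in plain Python. The `encode` and `decode` helpers below are hypothetical and are not IPCL's actual implementation. They only assume the general fixed-point idea described above: multiply by a scale factor, round to an integer, and divide again when decoding.

```python
def encode(x: float, precision: int) -> int:
    """Scale a float to an integer (hypothetical fixed-point encoder)."""
    return round(x * precision)


def decode(n: int, precision: int) -> float:
    """Undo the scaling (hypothetical fixed-point decoder)."""
    return n / precision


x = 378234.9716

# A scale factor of 100 keeps only two decimal digits.
too_small = decode(encode(x, 100), 100)

# A scale factor of 10**4 keeps all four decimal digits of x.
large_enough = decode(encode(x, 10**4), 10**4)

print(too_small)     # 378234.97
print(large_enough)  # 378234.9716
```

Digits beyond what the scale factor can represent are dropped at encoding time, before any encryption happens, so no later step can recover them.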

Thanks again for your finding and feedback! It's really helpful for us to improve IPCL.

lidh15 commented 1 year ago

Thank you, we will set that argument as needed. But I wonder if a larger scale will lead to a smaller plaintext range? After all, I guess the integers that Paillier handles have an upper bound.

fangxiaoran commented 1 year ago

> But I wonder if a larger scale will lead to a smaller plain text range?

Yes, you are right.

Basically, the bottleneck is on the Python side. We need to convert the integer (after scaling) to a float when decoding, so this integer should not be larger than sys.float_info.max. Computation on ciphertexts also affects this.
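The trade-off between scale factor and plaintext range can be illustrated with the standard library alone. The helpers below are hypothetical and do not come from IPCL. They assume only the bound stated above: the scaled integer must still convert to a Python float.

```python
import sys


def can_decode(scaled: int) -> bool:
    """Return True if the scaled integer still converts to a float."""
    try:
        float(scaled)
        return True
    except OverflowError:
        return False


def largest_decodable_plaintext(precision: int) -> float:
    """Rough upper bound on a plaintext for a given scale factor (illustrative)."""
    return sys.float_info.max / precision


print(can_decode(10**308))  # True: below sys.float_info.max
print(can_decode(10**309))  # False: int too large to convert to float

# A larger scale factor leaves a smaller plaintext range.
print(largest_decodable_plaintext(10**4) > largest_decodable_plaintext(10**11))  # True
```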

lidh15 commented 1 year ago

Thank you. I guess most systems nowadays are 64-bit and that value is large enough. Although overflow is almost impossible, the accuracy is still limited (for example, sys.float_info.max + 1 == sys.float_info.max is True in Python), but maybe it doesn't matter, otherwise it would have been fixed.
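That precision limit can be checked directly. This minimal sketch assumes IEEE 754 double-precision floats, which CPython uses on common platforms. It shows that integers above 2**53 are no longer all exactly representable, well before the overflow bound is reached.

```python
import sys

big = sys.float_info.max
print(big + 1 == big)  # True: adding 1 is lost to rounding

# With a 53-bit significand, 2**53 is where consecutive integers stop
# being exactly representable.
limit = 2 ** sys.float_info.mant_dig
print(float(limit) + 1 == float(limit))      # True: 2**53 + 1 rounds back to 2**53
print(float(limit - 1) + 1 == float(limit))  # True: still exact just below the limit
```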