ibarrond / Pyfhel

PYthon For Homomorphic Encryption Libraries, perform encrypted computations such as sum, mult, scalar product or matrix multiplication in Python, with NumPy compatibility. Uses SEAL/PALISADE as backends, implemented using Cython.
https://pyfhel.readthedocs.io/
Apache License 2.0

How to rescale the ciphertexts after multiplication #163

Closed SerhatBah closed 1 year ago

SerhatBah commented 1 year ago

Hello Alberto,

I really don't want to bother you unnecessarily, but I just can't get any further. I have to hand in this work soon and the only thing left is to compute the MSR (mean squared residue) on encrypted data. I'm pretty sure the problem is that I'm not using the "rescale" function correctly. I'm opening this as a new issue in the hope that it also helps other people with the same problem.

My Code:

import time

import numpy as np
from Pyfhel import Pyfhel, PyCtxt

HE = Pyfhel()  # Creating empty Pyfhel object
ckks_params = {
    'scheme': 'CKKS',  # can also be 'ckks'
    'n': 2 ** 14,  # Polynomial modulus degree. For CKKS, n/2 values can be
    #  encoded in a single ciphertext.
    #  Typ. 2^D for D in [10, 16]
    'scale': 2 ** 30,  # All the encodings will use it for float->fixed point
    #  conversion: x_fix = round(x_float * scale)
    #  You can use this as default scale or use a different
    #  scale on each operation (set in HE.encryptFrac)
    'qi_sizes': [60, 30, 30, 30, 60]  # Number of bits of each prime in the chain.
    # Intermediate values should be  close to log2(scale)
    # for each operation, to have small rounding errors.
}

HE.contextGen(**ckks_params)  # Generate context for the CKKS scheme
HE.keyGen()  # Key Generation: generates a pair of public/secret keys
HE.rotateKeyGen()
HE.relinKeyGen()  # Relinearization key generation

data = np.random.randint(0, 5, size=(10, 5))
num_rows, num_cols = data.shape
rows = np.ones(num_rows, dtype=bool)  # plain bool works across NumPy versions
cols = np.ones(num_cols, dtype=bool)

        # Encrypting sub_data
        # 1. select the sub-matrix and make it a contiguous array in memory
        # 2. flatten the 2D array into 1D
        # 3. encrypt the plaintext vector into a single ciphertext

        sub_data = data[rows][:, cols]
        sub_data = np.ascontiguousarray(sub_data).flatten().astype(np.float64)
        arr_sub_data = HE.encryptFrac(sub_data)
        n_elements = len(rows) * len(cols)

        # Row-wise sum & Encrypting row_means
        enc_rowwise_sum = PyCtxt(copy_ctxt=arr_sub_data)
        for i in range(1, len(cols)):
            enc_rowwise_sum += arr_sub_data << i

        # Column-wise sum & Encrypting col_means
        enc_colwise_sum = PyCtxt(copy_ctxt=arr_sub_data)
        for i in range(1, len(rows)):
            enc_colwise_sum += arr_sub_data << (i * len(cols))

        # Encrypting data_mean
        enc_data_mean = np.sum(arr_sub_data) / len(arr_sub_data)

        # 1. Mean
        enc_row_means = enc_rowwise_sum / len(cols)
        enc_col_means = enc_colwise_sum / len(rows)

        # Encrypting Residues
        enc_residues = arr_sub_data - enc_row_means - enc_col_means + enc_data_mean

        # Encrypting Squared Residues
        enc_squared_residues = enc_residues ** 2

        # Encrypting msr
        enc_msr = np.sum(enc_squared_residues) / len(enc_squared_residues)

        # Encrypting row_msr
        enc_row_msr = enc_squared_residues / len(enc_squared_residues)

        # Encrypting col_msr
        enc_col_msr = enc_squared_residues / len(enc_squared_residues)

        # Decrypting msr
        t_dec0 = time.perf_counter()
        decrypted_msr = enc_squared_residues.decrypt()[:n_elements:len(enc_squared_residues)].round()

        # Decrypting msr_row
        decrypted_msr_row = enc_rowwise_sum.decrypt()[:n_elements:len(cols)].round()

        # Decrypting msr_col
        decrypted_msr_col = enc_colwise_sum.decrypt()[:len(cols)].round()

        return decrypted_msr, decrypted_msr_row, decrypted_msr_col

I have now tried this approach:


        # Encrypting Squared Residues
        enc_squared_residues = enc_residues ** 2

        HE.relinearize(enc_squared_residues)
        HE.rescale_to_next(enc_squared_residues)
        enc_squared_residues = HE.encrypt(enc_squared_residues)
        HE.mod_switch_to_next(enc_squared_residues)

        # Encrypting msr
        enc_msr = np.sum(enc_squared_residues) / len(enc_squared_residues)

        # Encrypting row_msr
        enc_row_msr = enc_squared_residues / len(enc_squared_residues)

        # Encrypting col_msr
        enc_col_msr = enc_squared_residues / len(enc_squared_residues)

And got this error message:

/usr/bin/python3.10 /home/serhat/SeCCA-CKKS/secured_cheng_church_yeast.py 
SeCCA Step 2
Number of the Bicluster:5
Traceback (most recent call last):
  File "/home/serhat/SeCCA-CKKS/secured_cheng_church_yeast.py", line 27, in <module>
    biclustering = secca.run(data)
  File "/home/serhat/SeCCA-CKKS/biclustlib/algorithms/type2.py", line 123, in run
    self._multiple_node_deletion(data, rows, cols, msr_thr, HE, t_enc, t_dec)
  File "/home/serhat/SeCCA-CKKS/biclustlib/algorithms/type2.py", line 170, in _multiple_node_deletion
    msr, row_msr, col_msr = self._calculate_msr(data, rows, cols, HE, t_enc, t_dec)
  File "/home/serhat/SeCCA-CKKS/biclustlib/algorithms/type2.py", line 259, in _calculate_msr
    enc_squared_residues = HE.encrypt(enc_squared_residues)
  File "Pyfhel/Pyfhel.pyx", line 493, in Pyfhel.Pyfhel.Pyfhel.encrypt
TypeError: <Pyfhel ERROR> Plaintext type [<class 'Pyfhel.PyCtxt.PyCtxt'>] not supported for encryption
corrupted double-linked list

Process finished with exit code 1

And I tried this approach:

    # Encrypting Squared Residues
    enc_squared_residues = enc_residues ** 2

    HE.relinearize(enc_squared_residues)
    HE.rescale_to_next(enc_squared_residues)
    enc_squared_residues = HE.encrypt(len(enc_squared_residues))
    HE.mod_switch_to_next(enc_squared_residues)

    # Encrypting msr
    enc_msr = np.sum(enc_squared_residues) / len(enc_squared_residues)

    # Encrypting row_msr
    enc_row_msr = enc_squared_residues / len(enc_squared_residues)

    # Encrypting col_msr
    enc_col_msr = enc_squared_residues / len(enc_squared_residues)

And got this error message (with results):


[...]
Rescaling & Mod Switching.
->  Mean:  <Pyfhel Ciphertext at 0x7f2e252ca840, scheme=ckks, size=2/2, scale_bits=60, mod_level=1>
->  MSE_1:  <Pyfhel Ciphertext at 0x7f2e252cb650, scheme=ckks, size=2/2, scale_bits=60, mod_level=1>
->  enc_msr:  <Pyfhel Ciphertext at 0x7f2e2549aac0, scheme=ckks, size=2/2, scale_bits=60, mod_level=2>
->  enc_row_msr:  <Pyfhel Ciphertext at 0x7f2e25314d10, scheme=ckks, size=2/2, scale_bits=60, mod_level=2>
->  enc_col_msr:  <Pyfhel Ciphertext at 0x7f2e25314cc0, scheme=ckks, size=2/2, scale_bits=60, mod_level=2>
Encryption Time:  5.74204 Seconds
Decryption time:  0.21878 Seconds
Bicluster(rows=[0 1 2 3 4 5 6 7 8 9], cols=[0 1 2 3 4])
Bicluster(rows=[0 1 2 3 4 5 6 7 8 9], cols=[0 1 2 3 4])
Bicluster(rows=[0 1 2 3 4 5 6 7 8 9], cols=[0 1 2 3 4])
Bicluster(rows=[0 1 2 3 4 5 6 7 8 9], cols=[0 1 2 3 4])
Bicluster(rows=[0 1 2 3 4 5 6 7 8 9], cols=[0 1 2 3 4])
Time Performance in Calculating Homomorphically:  6.542958437999914 Seconds
corrupted double-linked list

Process finished with exit code 134 (interrupted by signal 6: SIGABRT)

But I think the len(..) function is wrong.

Can you help me rescale my ciphertexts properly?
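
A note on the rescaling part of the question, as a toy model rather than Pyfhel code. In CKKS, multiplying two ciphertexts multiplies their scales, so squaring a ciphertext encoded at scale 2 ** 30 gives scale 2 ** 60, which is consistent with the scale_bits=60 values in the log above. Rescaling divides the scale by the prime it removes from the chain. The sketch below only tracks that bookkeeping in plain Python. ToyCtxt, toy_multiply and toy_rescale are hypothetical names, and it assumes the three 30-bit entries of qi_sizes are the primes consumed by rescaling.

```python
from dataclasses import dataclass

QI_SIZES = [60, 30, 30, 30, 60]   # from ckks_params above
# Assumption: only the middle primes are consumed by rescaling.
RESCALE_PRIMES = QI_SIZES[1:-1]


@dataclass(frozen=True)
class ToyCtxt:
    scale_bits: int  # log2 of the current scale
    level: int       # how many rescaling primes have been consumed


def toy_multiply(a: ToyCtxt, b: ToyCtxt) -> ToyCtxt:
    """Scales multiply, so their bit sizes add. Levels must match."""
    if a.level != b.level:
        raise ValueError("operands must be at the same level")
    return ToyCtxt(a.scale_bits + b.scale_bits, a.level)


def toy_rescale(c: ToyCtxt) -> ToyCtxt:
    """Drop one prime: the scale shrinks by that prime's bit size."""
    if c.level >= len(RESCALE_PRIMES):
        raise ValueError("no primes left to rescale with")
    return ToyCtxt(c.scale_bits - RESCALE_PRIMES[c.level], c.level + 1)


fresh = ToyCtxt(scale_bits=30, level=0)  # like HE.encryptFrac at scale 2**30
squared = toy_multiply(fresh, fresh)     # scale_bits becomes 60
rescaled = toy_rescale(squared)          # back to 30 bits, one level deeper
```

In the real code the corresponding calls are the ones already used above, HE.relinearize(...) followed by HE.rescale_to_next(...), applied to the PyCtxt itself. As the traceback shows, HE.encrypt does not accept a PyCtxt, so the result should not be passed through HE.encrypt again.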

ibarrond commented 1 year ago

The len function is a bit misleading (I'm thinking of changing it soon): it does not return the number of encrypted data values, but the number of polynomials that make up the ciphertext (a size greater than 2 means the ciphertext can be relinearized).

Try removing all the calls to len and replacing them with constants of known value (e.g., num_rows, num_cols).
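
A minimal sketch of that advice on plaintext data, not on real ciphertexts. It assumes that `ctxt << i` behaves as a cyclic left rotation of the slots, which is simulated here with np.roll; rotate_left is a hypothetical helper. Every divisor and slice step is a known constant (num_rows, num_cols, n_elements) rather than a len(...) of a ciphertext.

```python
import numpy as np

num_rows, num_cols = 10, 5
n_elements = num_rows * num_cols
n_slots = 2 ** 14 // 2  # n/2 slots for n = 2**14, as in ckks_params

rng = np.random.default_rng(42)
data = rng.integers(0, 5, size=(num_rows, num_cols)).astype(np.float64)

# Stand-in for the encrypted vector: flattened data, zero-padded to n_slots.
slots = np.zeros(n_slots)
slots[:n_elements] = data.flatten()


def rotate_left(x, k):
    """Plaintext stand-in for `ctxt << k` (assumed cyclic left rotation)."""
    return np.roll(x, -k)


# Row-wise sum: slot r * num_cols ends up holding the sum of row r.
rowwise_sum = slots.copy()
for i in range(1, num_cols):
    rowwise_sum += rotate_left(slots, i)

# Column-wise sum: slot c (for c < num_cols) ends up holding the sum of column c.
colwise_sum = slots.copy()
for i in range(1, num_rows):
    colwise_sum += rotate_left(slots, i * num_cols)

# Divide by known constants, never by len(ciphertext).
row_means = rowwise_sum[:n_elements:num_cols] / num_cols
col_means = colwise_sum[:num_cols] / num_rows
```

The same constants can then be reused when slicing the decrypted vectors, for example `[:n_elements:num_cols]` for the per-row values.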

SerhatBah commented 1 year ago

Thank you so much for your help. I no longer get any errors in this part.

So now I wanted to run the program on a much larger dataset and I made this change:

from this:


# DATA GENERATION
num_rows, num_cols = 10,5
n_elements = num_rows*num_cols
np.random.seed(42)                           # Fixed seed for reproducibility
data = np.random.randint(0, 5, size=(num_rows, num_cols))   # Generate data

to this:

# load yeast data used in the original Cheng and Church's paper
data = load_yeast_tavazoie().values

# missing value imputation suggested by Cheng and Church
missing = np.where(data < 0.0)
data[missing] = np.random.randint(low=0, high=800, size=len(missing[0]))

And get this error message:

/usr/bin/python3.10 /home/serhat/SeCCA-CKKS/secured_cheng_church_yeast.py 
SeCCA Step 2
Number of the Bicluster:2884
Traceback (most recent call last):
  File "/home/serhat/SeCCA-CKKS/secured_cheng_church_yeast.py", line 24, in <module>
    biclustering = secca.run(data)
  File "/home/serhat/SeCCA-CKKS/biclustlib/algorithms/type2.py", line 123, in run
    self._multiple_node_deletion(data, rows, cols, msr_thr, HE, t_enc, t_dec)
  File "/home/serhat/SeCCA-CKKS/biclustlib/algorithms/type2.py", line 170, in _multiple_node_deletion
    msr, row_msr, col_msr = self._calculate_msr(data, rows, cols, HE, t_enc, t_dec)
  File "/home/serhat/SeCCA-CKKS/biclustlib/algorithms/type2.py", line 230, in _calculate_msr
    arr_sub_data = HE.encryptFrac(sub_data)
  File "Pyfhel/Pyfhel.pyx", line 364, in Pyfhel.Pyfhel.Pyfhel.encryptFrac
  File "Pyfhel/Pyfhel.pyx", line 388, in Pyfhel.Pyfhel.Pyfhel.encryptFrac
ArithmeticError: <Afseal>: Data vector size is bigger than ckks nSlots

Process finished with exit code 1

So it shows the error in this part:

arr_sub_data = HE.encryptFrac(sub_data) and says the Data vector size is bigger than ckks nSlots

I haven't found anything about this in the previous issues, unfortunately. How can I increase the number of slots in CKKS? Or have I implemented something wrong?

Many many thanks

ibarrond commented 1 year ago

You have basically run out of space in your ciphertext. Your only choice is to adapt the algorithm to work with multiple ciphertexts instead of only one.
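
One possible way to do that split is sketched below, under stated assumptions. With n = 2 ** 14 a CKKS ciphertext holds n/2 = 8192 values, so the matrix is cut into groups of whole rows that each fit into one ciphertext. split_rows_into_chunks is a hypothetical helper, and the 2884 x 17 matrix is only a placeholder shape for illustration, not the actual yeast data.

```python
import numpy as np


def split_rows_into_chunks(matrix, n_slots):
    """Split a 2D array into flat float64 vectors of whole rows,
    each short enough to fit into one ciphertext of n_slots slots."""
    num_rows, num_cols = matrix.shape
    rows_per_chunk = n_slots // num_cols
    if rows_per_chunk == 0:
        raise ValueError("a single row does not fit into one ciphertext")
    return [
        matrix[start:start + rows_per_chunk].ravel().astype(np.float64)
        for start in range(0, num_rows, rows_per_chunk)
    ]


n = 2 ** 14
n_slots = n // 2  # 8192 slots per ciphertext

# Placeholder matrix (hypothetical shape, values are just a running index).
matrix = np.arange(2884 * 17, dtype=np.float64).reshape(2884, 17)
chunks = split_rows_into_chunks(matrix, n_slots)

# With a configured Pyfhel object HE, each chunk would then be encrypted
# separately, e.g.: enc_chunks = [HE.encryptFrac(chunk) for chunk in chunks]
```

Keeping whole rows together means the rotation-based row sums still work inside each ciphertext. Column sums and the overall mean then have to be accumulated across ciphertexts, which is the part of the algorithm that needs adapting.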