MarzioMonticelli / python-cryptonet

A Python implementation of CryptoNets: Applying Neural Networks to Encrypted Data with High Throughput and Accuracy
MIT License
29 stars 0 forks

What is the best value of (t) to get the highest accuracy? #5

Open Aljanabi1981 opened 2 years ago

Aljanabi1981 commented 2 years ago

Hello sir, hope you are doing okay. I ran the code but it gives me only around 10% accuracy. I set t = 1099511922689 and got Model Accuracy: 10.07080078125% (825/8192). My question is: what is the proper value of t in order to get higher accuracy?

The work is implemented in Google Colab.

Regards,

lhx0217 commented 2 years ago

Me too! Python gives me "RuntimeWarning: overflow encountered in long_scalars" on the line res += arr[i] * t_coef[i] * t_prod[i]. Maybe there is something wrong with the usage of the CRT?
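That overflow warning is characteristic of numpy's fixed-width integers silently wrapping during CRT recombination; plain Python ints are arbitrary-precision and avoid it. A minimal sketch of CRT reconstruction using only built-ins (the function name and round-trip check are illustrative, not the repo's actual code):

```python
# Hypothetical sketch: CRT reconstruction with arbitrary-precision Python
# ints instead of numpy int64, which overflows silently on large products.
def crt_reconstruct(residues, moduli):
    """Recombine residues r_i (mod m_i) into x (mod prod(m_i))."""
    M = 1
    for m in moduli:
        M *= m
    x = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        # Modular inverse of Mi mod m (Python 3.8+ three-argument pow)
        x += int(r) * Mi * pow(Mi, -1, m)
    return x % M

# Round-trip check with small pairwise-coprime moduli
moduli = [3, 5, 7]
x = 52
residues = [x % m for m in moduli]
assert crt_reconstruct(residues, moduli) == x  # 52 < 3*5*7 = 105
```

The key point is to call int() on each numpy scalar before multiplying, so the accumulation happens in Python's unbounded integers.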

Aljanabi1981 commented 2 years ago

@lhx0217 Well, I couldn't manage to run the program correctly, so I started from scratch. I built the Keras model (98%), then rebuilt it from scratch following the atomic construction of the HE version (PlainNet), and its accuracy is exactly the same as the Keras model. EncodedNet is also coded, with a 3% degradation. Yet when I build EncryptedNet, the accuracy gives me 10%. I think the t value is the problem, and I don't know if there is a tool to give me the best t value so that decryption will not fail. One thing I noticed is that the values must stay between -t/2 and t/2, at least that is what the original paper keeps insisting, and the values of the output in Dense2 exceed that range.

lhx0217 commented 2 years ago

I've generated 2 bigger t values, 100000000049153 and 100000000147457, but it is still only 10%.

Aljanabi1981 commented 2 years ago

@lhx0217 The values are correct: t is a large prime and t-1 is a multiple of 2^n. But as you said, the accuracy is still 10%. I even tried to decrease the depth of the CNN, but it didn't help much. I've seen a paper implementing a close enough idea with the help of Pyfhel and a CNN; you can find it in MDPI open access under the title "BFV-Based Homomorphic Encryption for Privacy-Preserving CNN Models". I will keep trying, and if you try anything or have any ideas, please share. Thanks in advance.
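For BFV batching, the plaintext modulus t must be prime and satisfy t ≡ 1 (mod 2n), where n is the polynomial modulus degree; that is exactly the "t-1 is a multiple of 2^n" condition above. A self-contained sketch of how such a t could be searched for (the Miller–Rabin test and the search helper are illustrative, not the thread's attached script):

```python
import random

# Hypothetical search for a batching-compatible BFV plaintext modulus:
# t must be prime and t ≡ 1 (mod 2*n) for SIMD slots to exist.
def is_probable_prime(n, rounds=20):
    """Miller–Rabin probabilistic primality test."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37):
        if n % p == 0:
            return n == p
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False
    return True

def find_plain_modulus(bits, poly_degree):
    """Smallest probable prime t >= 2**bits with t ≡ 1 (mod 2*poly_degree)."""
    m = 2 * poly_degree
    t = (2 ** bits // m) * m + 1
    while not is_probable_prime(t):
        t += m
    return t

t = find_plain_modulus(40, 4096)
assert t % (2 * 4096) == 1 and is_probable_prime(t)
```

Note that a larger t leaves less noise budget for the same encryption parameters, so "bigger" is not automatically better: t has to be just large enough to hold the network's intermediate values.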

lhx0217 commented 2 years ago

@Aljanabi1981 When t is 1099512004609, the noise budget of the ciphertext from the conv layer is 85, but it goes to 0 after the first fully connected layer. For 100000000147457, the noise budgets are 113, 46, 0, 0. To increase the noise budget, the best approach would be to remove a fully connected layer; I'm trying that now.

lhx0217 commented 2 years ago

@Aljanabi1981 I've tried keeping only 2 fully connected layers (there can't be fewer, because the square_activation is necessary), but it is still wrong. If there were only conv layers, the network might be correct.

Aljanabi1981 commented 2 years ago

@lhx0217 I already tried keeping only 1 FC layer by removing the square activation. That is not correct, but I wanted to see whether it affects the decryption, and it fails; so the FC layers must stay, as you mentioned. But from the noise budgets you posted: to increase the noise budget you need to specify a larger q value, and since we are using the BFV scheme, the library automatically selects the highest q based on the security level. So, how do we choose appropriate parameters? parameters.txt

1- What is the n value you are using? Try to increase it. 2- What is the sec. value you specified? Try increasing it to 192 (to increase the noise budget), but be aware that it adds more computational complexity, meaning more time. Please do post the noise budgets.

Have a look at some of the info I've collected in the attached .txt file, especially under "Params Setting (3-)". Try to modify the parameters and see if the noise accumulation is slower (use the 15-digit prime for t, or bigger if you can). Check if this also helps: https://morfix.io/sandbox

Regards,

lhx0217 commented 2 years ago

@Aljanabi1981
I've got the right result. I selected 1000000000000196609 as t and simplified the network structure; here is the result:

================================
[ -6709547877 -8105650014 -9321869915 63524215405 -172528288037 28796323803 -108006299400 81867210004 -39281783329 1090381266]
Model Accuracy: 80.06591796875% (6559/8192)

The network structure is:

    # 1) Convolutional layer (with square activation)
    model.add(keras.layers.Conv2D(5, kernel_size=params['kernel_size'], strides=params['strides'], input_shape=x_train[0].shape, activation=square_activation))

    # 2) Flatten layer
    model.add(keras.layers.Flatten())

    if params['use_dropout']:
        model.add(keras.layers.Dropout(params['dropout'], input_shape=(100,)))

    # 3) Dense layer with sigmoid / softmax activation
    model.add(keras.layers.Dense(10, activation=params['last_activation']))

    sgd = keras.optimizers.SGD(lr=params['learning_rate'], momentum=params['momentum'], nesterov=params['nesterov'])
    # Specify the optimizer, loss function, and accuracy metric used for training
    model.compile(optimizer=sgd,
                  loss=params['loss'],
                  metrics=['accuracy'])
Aljanabi1981 commented 2 years ago

@lhx0217 Oh, excellent effort! I noticed the number of trainable parameters is 2,510 for this architecture. I am a novice when it comes to Python, and I would like to ask for some help if possible: 1- What method did you use to generate the t value so that t-1 is a multiple of 2^n? 2- Can you help with reconstructing the EncryptedNet module from scratch (JUST THE LAYERS)? Of course it is based on the EncodedNet and PlainNet modules with additional modifications; I have attached the model building from scratch in a .txt file: EncryptedNet.txt 3- When run, the code in the GitHub repo asks for a path; what are the exact procedures to run it? I really appreciate your help, thanks in advance.

lhx0217 commented 2 years ago

cryptonet1.3.zip @Aljanabi1981 Here is the code; I upgraded the Pyfhel version to 3.2.1.

lhx0217 commented 2 years ago

密码学算法.zip ("cryptography algorithms") You can generate any t with Miller_Rabin.py; you just need to change the range. Also, there is a small error in cryptonet1.3: you need to change precision to 3 (it is 5 now).

Aljanabi1981 commented 2 years ago

@lhx0217 I already installed version 3.2.1, but when I run it, it hits an error saying something like "Too large number to convert to C long". However, when I implemented it in Google Colab, it worked without this error, even though the specification of my computer is good enough. In Google Colab it only installs Pyfhel 2.3.1.

lhx0217 commented 2 years ago

@Aljanabi1981 In Millar_Rabin.py, ku1.bigmod(8192,1,i)==1 ensures that i ≡ 1 (mod 8192), so t is correct. I changed some of the code in this GitHub repo, but most of it is the same. If it hits the error "Too large number to convert to C long", you can change t to a smaller prime.
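The congruence check from the script can be reproduced with built-ins alone, without the helper modules. A hypothetical equivalent (function name is illustrative): for polynomial degree n = 4096, batching requires t ≡ 1 (mod 2n) = (mod 8192).

```python
# Hypothetical stand-in for the Miller_Rabin.py congruence check, using
# only built-in arithmetic: for SIMD batching with poly degree n = 4096,
# the plaintext modulus t must satisfy t ≡ 1 (mod 2*n), i.e. (mod 8192).
def is_batching_compatible(t, poly_degree=4096):
    return t % (2 * poly_degree) == 1

assert is_batching_compatible(1000000000000196609)  # the working t above
assert not is_batching_compatible(1000000000000196611)
```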

Aljanabi1981 commented 2 years ago

@lhx0217 Okay, I tried to run Millar_Rabin.py and it gives me "ModuleNotFoundError: No module named '质因数分解'" ("prime factorization"), and likewise for "import 求大整数模幂运算 as ku1" ("big-integer modular exponentiation"). I tried pip install 质因数分解, but the repository does not have such a package (nor 求大整数模幂运算). The question is: is there some repository or GitHub source for them?

Aljanabi1981 commented 2 years ago

@lhx0217 Hello sir, I managed to generate correct values of t based on Miller–Rabin with the proper condition that t-1 is a multiple of 2^n, and I also changed the precision value to 3 as you mentioned. I saw that you disabled the batching process (SIMD) through the CRT in the EncryptedNet code, but it keeps giving me "ValueError: Buffer dtype mismatch, expected 'int64_t' but got 'long'". I checked the input and its dtype is int64 — what am I missing here? Regards.
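One likely cause of that error, offered here as an assumption rather than a confirmed diagnosis: on Windows, C long (and hence some numpy integer views) is 32-bit, so a buffer can fail a Cython int64_t check even when the array prints as plain integers. A minimal sketch of the usual workaround, casting explicitly before handing the array to the Cython-backed encoder:

```python
import numpy as np

# Hypothetical workaround for "Buffer dtype mismatch, expected 'int64_t'
# but got 'long'": numpy's default integer maps to C long, which is only
# 32 bits on Windows. Casting explicitly to int64 (and making the buffer
# contiguous) normally satisfies the Cython typed-memoryview check.
arr = np.array([1, 2, 3])  # default dtype is platform-dependent
arr64 = np.ascontiguousarray(arr, dtype=np.int64)
assert arr64.dtype == np.int64
```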

kcsmta commented 1 year ago

Does the code skip the square layers after the dense layers? I think this is the reason for the low accuracy.