homenc / HElib

HElib is an open-source software library that implements homomorphic encryption. It supports the BGV scheme with bootstrapping and the Approximate Number CKKS scheme. HElib also includes optimizations for efficient homomorphic evaluation, focusing on effective use of ciphertext packing techniques and on the Gentry-Halevi-Smart optimizations.
https://homenc.github.io/HElib

Serializing Parameters #207

Open deevashwer opened 6 years ago

deevashwer commented 6 years ago

Is there an option in HElib to serialize the FHEcontext class after building the parameters? I am working with parameters that take a lot of time to get initialized. Is there another way I can get around this problem?

fionser commented 6 years ago

To write out all the data associated with a context, do the following:

writeContextBase(str, context);
str << context;

deevashwer commented 6 years ago

Thanks @fionser for your answer. I did this to serialize my parameters, but deserializing takes almost as much time as the original computation. Initializing an FHEcontext object takes a long time even when gens and ords are supplied. Reading back the serialized modulus chain afterwards is quick; the FHEcontext constructor itself is the bottleneck.

fionser commented 6 years ago

Use

friend void readContextBase(istream& str, unsigned long& m, unsigned long& p, unsigned long& r,
                  vector<long>& gens, vector<long>& ords);

Do not initialize FHEcontext directly! It will take too much time to find the generators and calculate the orders, which have already been serialized.

So your code should look like this:

unsigned long m, p, r; // readContextBase takes unsigned long& for m, p, r
std::vector<long> gens, ords;
readContextBase(in_stream, m, p, r, gens, ords); // just read the gens and orders from file
FHEcontext context(m, p, r, gens, ords); // not FHEcontext(m, p, r) !!!!
in_stream >> context; // restore the remaining stuff from the file

deevashwer commented 6 years ago

That's exactly what I did, as I mentioned above. I initialized it with gens and ords as well, yet it is taking too much time. Almost as much as the original computation took to generate gens and ords.

deevashwer commented 6 years ago

So, I benchmarked a parameter setting that took 356 seconds to initialize the FHEcontext object the first time (without providing gens and ords). Code: FHEcontext context(m, p, r); Then I serialized the parameters. On the next run it took 335 seconds, even after reading the context base and passing gens and ords to the FHEcontext constructor. Code: FHEcontext context(m, p, r, gens, ords); contextFile >> context; That is almost the same amount of time, which defeats the purpose of serialization.

fionser commented 6 years ago

@deevashwer Could you share your m, p, and L?

deevashwer commented 6 years ago

My parameters are: p=(7593119, 7593127, 7593133), r=2, m=65536, l=8, c=2 (I have instantiated 3 instances of FHEcontext)

The part that takes a lot of time is the initialization of the FHEcontext object, so I assume that only m, p and r are relevant. This is the code that I'm using. (T and P are used interchangeably to denote the plaintext modulus.)

// Try to reuse a previously written context file.
bool contextFlag = false;
fstream contextFile("context.txt", fstream::in);
unsigned long M[FACTORS_T], P[FACTORS_T], R[FACTORS_T], L, C, T, T_i;
// Header: m, r, l, c, number of plaintext factors, bit size of each factor.
contextFile >> M[0];
contextFile >> R[0];
contextFile >> L;
contextFile >> C;
contextFile >> T;
contextFile >> T_i;
// Use the file only if its header matches the current parameters.
if(contextFile.is_open() && M[0] == m && R[0] == r && L == l && C == c && T == FACTORS_T && T_i == BITSIZE_FACTORS_T)
    contextFlag = true;
vector<long> gens[FACTORS_T], ords[FACTORS_T];
if(contextFlag){
    // Read m, p, r, gens and ords for every context.
    for(int i = 0; i < FACTORS_T; i++)
        readContextBase(contextFile, M[i], P[i], R[i], gens[i], ords[i]);
    // Rebuild each context from the stored generators and orders (the slow step).
    #pragma omp parallel for
    for(int i = 0; i < FACTORS_T; i++)
        context[i] = new FHEcontext(M[i], P[i], R[i], gens[i], ords[i]);
    // Restore the remaining data, such as the modulus chain.
    for(int i = 0; i < FACTORS_T; i++)
        contextFile >> *context[i];
    contextFile.close();
}
else{
    // No usable file: write the header, then build the contexts from scratch.
    contextFile.close();
    fstream outFile("context.txt", fstream::out|fstream::trunc);
    outFile << m << endl;
    outFile << r << endl;
    outFile << l << endl;
    outFile << c << endl;
    outFile << FACTORS_T << endl;
    outFile << BITSIZE_FACTORS_T << endl;
    #pragma omp parallel for
    for(int i = 0; i < FACTORS_T; i++){
        context[i] = new FHEcontext(m, t[i], r);
        buildModChain(*context[i], l, c);
    }
    // Write all context bases first, then the full contexts, matching the read order above.
    for(int i = 0; i < FACTORS_T; i++)
        writeContextBase(outFile, *context[i]);
    for(int i = 0; i < FACTORS_T; i++)
        outFile << *context[i] << endl;
    outFile.close();
}

shaih commented 6 years ago

The serialization/deserialization of the context in HElib is mostly an afterthought. The use case that I had in mind was that the context is initialized "at the beginning of time" and stays fixed thereafter. Serialization/deserialization was written to ensure that you can send a context across the wire and get the same thing on the other end; it was not written to be efficient. Most structures are recomputed rather than saved and read back. Specifically, the FHEcontext constructor re-initializes from scratch the underlying PAlgebra, PAlgebraMod, and EncryptedArray, each of which involves computing many interpolation coefficients, etc.

If anyone has the time to put in re-writing it, let me know and I'll be happy to give you some guidance on what needs to be changed and how.
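
To make the save-versus-recompute distinction concrete, here is a minimal sketch of caching one expensive table in a stream and falling back to recomputation when the cached copy is missing or malformed. It uses only the standard library and is not HElib code: `computeTable`, `writeTable`, `readTable`, and `loadOrCompute` are hypothetical names, and the table of powers is only a stand-in for the kind of data the constructor rebuilds.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <vector>

// Hypothetical stand-in for an expensive table: g^0, g^1, ..., g^(n-1) mod m.
// Assumes m is small enough that x * g does not overflow a long.
std::vector<long> computeTable(long g, long m, std::size_t n) {
    assert(m > 0);
    std::vector<long> t;
    t.reserve(n);
    long x = 1 % m;
    for (std::size_t i = 0; i < n; ++i) {
        t.push_back(x);
        x = (x * g) % m;
    }
    return t;
}

// Write the table as "<size> v0 v1 ..." followed by a newline.
void writeTable(std::ostream& os, const std::vector<long>& t) {
    os << t.size();
    for (long v : t) os << ' ' << v;
    os << '\n';
}

// Read a table written by writeTable. Returns false (and leaves t untouched)
// if the input is missing or truncated, so the caller can recompute instead.
bool readTable(std::istream& is, std::vector<long>& t) {
    std::size_t n = 0;
    if (!(is >> n)) return false;
    std::vector<long> tmp(n);
    for (std::size_t i = 0; i < n; ++i) {
        if (!(is >> tmp[i])) return false;
    }
    t.swap(tmp);
    return true;
}

// Prefer the cached table; recompute only when the cache is unusable.
std::vector<long> loadOrCompute(std::istream& cache, long g, long m, std::size_t n) {
    std::vector<long> t;
    if (readTable(cache, t) && t.size() == n) return t;
    return computeTable(g, m, n);
}
```

A real change would also need a constructor path that accepts the restored tables instead of rebuilding them, which this sketch does not attempt.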

deevashwer commented 6 years ago

I am currently dealing with a use case where I am using HElib, without leveraging its slots. I'm dealing with a polynomial plaintext directly, so I used the encryption and decryption methods of publicKey and secretKey directly, to save encoding and decoding time. Is there a way I can initialize parameters quickly, since most of the computation done by FHEcontext is not being used by me?

fionser commented 6 years ago

@shaih As I understand it, PAlgebra.cpp does a lot of table generation, so could we serialize these tables (e.g., those from genMaskTable and genCrtTable)?

shaih commented 6 years ago

Yes, both PAlgebra and PAlgebraMod have many tables, which can be serialized.

In PAlgebra the tables are T, Tidx, zmsIdx, zmsRep, native, and also PhimX. (Probably the most expensive to initialize is the table T, then zmsIdx and PhimX. Once these are set it should not take very long to initialize the others, so maybe it is enough to record only these tables.)

In PAlgebraModDerived the tables are factors, crtCoeffs, maskTable, crtTable, and crtTree. I'm not sure which of them is the most expensive to initialize; we can put some timers in the constructor to see what takes the longest to compute.

The EncryptedArray object also has some tables, but these are much faster to initialize.
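
As a rough illustration of putting timers around individual initialization steps, the sketch below accumulates wall-clock time per label using only the standard library. `ScopedTimer`, `timings`, and `printTimings` are hypothetical names for this example, and the labels are whatever the caller chooses; nothing here is taken from HElib.

```cpp
#include <cassert>
#include <chrono>
#include <map>
#include <ostream>
#include <string>
#include <utility>

// Accumulated seconds per label (a simple global registry for this sketch;
// not thread-safe).
std::map<std::string, double>& timings() {
    static std::map<std::string, double> t;
    return t;
}

// Measures the lifetime of the object and adds it to timings()[label].
class ScopedTimer {
public:
    explicit ScopedTimer(std::string label)
        : label_(std::move(label)),
          start_(std::chrono::steady_clock::now()) {}

    ~ScopedTimer() {
        std::chrono::duration<double> d =
            std::chrono::steady_clock::now() - start_;
        assert(d.count() >= 0.0);  // steady_clock never goes backwards
        timings()[label_] += d.count();
    }

    ScopedTimer(const ScopedTimer&) = delete;
    ScopedTimer& operator=(const ScopedTimer&) = delete;

private:
    std::string label_;
    std::chrono::steady_clock::time_point start_;
};

// Print one "label: seconds" line per timed section.
void printTimings(std::ostream& os) {
    for (const auto& kv : timings()) {
        os << kv.first << ": " << kv.second << " s\n";
    }
}
```

Wrapping each table's construction in its own block with a `ScopedTimer` at the top, for example `{ ScopedTimer t("T"); /* build T */ }`, and calling `printTimings(std::cerr)` afterwards would show which table dominates.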

fionser commented 6 years ago

@shaih, I am working on this. If I can get it into a reasonable form without too many modifications, I will send a PR.

shaih commented 6 years ago

@fionser, I just merged an implementation of "binary I/O", which saves on bandwidth when you serialize/deserialize stuff.

This is mostly orthogonal to the issue of re-computing all these tables, but if you find a good solution for the ASCII I/O then we should also copy it to the binary case.

fionser commented 6 years ago

@shaih It seems I would need to modify too much of the FHEcontext and PAlgebra code to serialize the tables, e.g., adding a new dummy constructor to PAlgebra and restoring the tables from a file.

I might need to rethink how to do that.

By the way, the current binary I/O dumps data in little-endian format, so will it work correctly on a big-endian machine?
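
For context on the endianness question, one common way to keep a little-endian file format portable is to build each byte with shifts rather than copying the integer's in-memory representation. The sketch below shows that technique with the standard library only; `writeLE64`, `readLE64`, and `roundTripLE64` are hypothetical names, and whether HElib's binary I/O works this way is not stated in this thread.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Write v least-significant byte first. The shifts operate on the value,
// not on its memory layout, so the output is the same on any host.
void writeLE64(std::ostream& os, std::uint64_t v) {
    for (int i = 0; i < 8; ++i) {
        os.put(static_cast<char>((v >> (8 * i)) & 0xFF));
    }
}

// Read eight bytes written by writeLE64. Returns false (and leaves v
// untouched) if fewer than eight bytes are available.
bool readLE64(std::istream& is, std::uint64_t& v) {
    std::uint64_t out = 0;
    for (int i = 0; i < 8; ++i) {
        int c = is.get();
        if (c == std::istream::traits_type::eof()) return false;
        out |= static_cast<std::uint64_t>(static_cast<unsigned char>(c)) << (8 * i);
    }
    v = out;
    return true;
}

// Convenience check: encode then decode a value through an in-memory stream.
std::uint64_t roundTripLE64(std::uint64_t v) {
    std::stringstream ss;
    writeLE64(ss, v);
    std::uint64_t out = 0;
    bool ok = readLE64(ss, out);
    assert(ok);  // eight bytes were just written, so the read must succeed
    (void)ok;
    return out;
}
```

By contrast, writing the raw bytes of a `long` with `ostream::write` produces host-order output, which is the case that breaks when files move between little-endian and big-endian machines.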