microsoft / SEAL

Microsoft SEAL is an easy-to-use and powerful homomorphic encryption library.
https://www.microsoft.com/en-us/research/group/cryptography-research/
MIT License

Are GaloisKeys or Ciphertexts PODs? If not, could you make them? #233

Closed ReverseControl closed 3 years ago

ReverseControl commented 4 years ago

Is there any way I can write raw bytes of GaloisKeys or Ciphertexts to memory, and later just get a pointer, typecast it, and get my galoiskeys and ciphertext back intact without having to serialize/de-serialize? The serialization is really killing me.

kimlaine commented 4 years ago

In principle a ciphertext is pretty simple: a dynamic heap-allocated byte array and a couple of fields. It would certainly be possible to merge all of the fields into the same buffer and have one big byte array representing the data. I know this could be convenient in some scenarios, but serialization (without compression, and soon with Zstandard compression) is pretty fast compared to many of the HE operations. What aspect of serialization is most problematic for you at the moment? Maybe there is a way we could fix that instead?

A public key is the same as a ciphertext, and Galois/relinearization keys are collections of ciphertexts, so any change to ciphertext should immediately apply to these classes as well.
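
The "one big byte array" idea mentioned above can be sketched as a fixed-size header followed directly by the coefficient words. This is only an illustration: `FlatHeader`, `pack`, `read_header`, and `read_coeff` are hypothetical names, and the header fields are assumptions about the kind of metadata a ciphertext carries, not SEAL's actual members.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical fixed-size header (NOT SEAL's real layout).
struct FlatHeader {
    std::uint64_t poly_count;       // number of polynomials
    std::uint64_t coeff_mod_count;  // number of RNS primes
    std::uint64_t poly_degree;      // coefficients per polynomial
};

// Copy the header and the coefficient words into one contiguous buffer.
std::vector<unsigned char> pack(const FlatHeader &h,
                                const std::vector<std::uint64_t> &coeffs) {
    const std::size_t data_bytes = coeffs.size() * sizeof(std::uint64_t);
    std::vector<unsigned char> buf(sizeof(FlatHeader) + data_bytes);
    std::memcpy(buf.data(), &h, sizeof(FlatHeader));
    if (data_bytes != 0) {
        std::memcpy(buf.data() + sizeof(FlatHeader), coeffs.data(), data_bytes);
    }
    return buf;
}

// Read the header back; memcpy avoids alignment and aliasing problems.
FlatHeader read_header(const unsigned char *buf) {
    FlatHeader h;
    std::memcpy(&h, buf, sizeof h);
    return h;
}

// Read the i-th coefficient word from the payload that follows the header.
std::uint64_t read_coeff(const unsigned char *buf, std::size_t i) {
    const FlatHeader h = read_header(buf);
    assert(i < h.poly_count * h.coeff_mod_count * h.poly_degree);
    std::uint64_t v;
    std::memcpy(&v, buf + sizeof(FlatHeader) + i * sizeof(std::uint64_t), sizeof v);
    return v;
}
```

A buffer like this can be copied or stored with a single `memcpy`, which is the convenience the thread is asking about. It says nothing about how SEAL's evaluator would consume such a buffer.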

ReverseControl commented 4 years ago

Essentially, I am reusing ciphertexts/GaloisKeys/RelinKeys a lot, but functions go in and out of scope, so I often need to store some ciphertexts in memory, and I am restricted to a C interface. If they were POD types, I could keep a pointer, typecast it, and save a ton of computation and memory accesses: no traversing kilobytes of data to decompress once and then reading it again to do the actual operations. That cost scales with my computations and is totally unnecessary.

My problem with serialization is that it exists at all. That's a bit extreme, but it's the only available way to store Ciphertext/GaloisKeys/RelinKeys objects in memory. I don't need it, and it hinders performance: the overhead scales with the amount of computation I do, so it gets worse as I process more data. I want simple, fixed-size POD types that can be typecast and used directly in computations.

If it were a struct, I would just memcpy the bytes into a block of memory, keep the pointer around, and typecast it every time I need the data in that struct (which costs about two instructions, instead of the thousands needed for decompressing), then access members as normal. Almost no overhead, literally a few instructions.

I must mention, compression does nothing for my use case. Memory is not a bottleneck at all and I have plenty of it.

Incidentally:

    cout << "Is GaloisKeys a POD? " << std::is_pod< GaloisKeys >::value << endl;
    cout << "Is Ciphertext a POD? " << std::is_pod< Ciphertext >::value << endl;
    cout << "Is GaloisKeys standard layout? " << std::is_standard_layout< GaloisKeys >::value << endl;
    cout << "Is Ciphertext standard layout? " << std::is_standard_layout< Ciphertext >::value << endl;
    cout << "Is GaloisKeys trivially copyable? " << std::is_trivially_copyable< GaloisKeys >::value << endl;
    cout << "Is Ciphertext trivially copyable? " << std::is_trivially_copyable< Ciphertext >::value << endl;

Returns:

Is GaloisKeys a POD? 0
Is Ciphertext a POD? 0
Is GaloisKeys standard layout? 1
Is Ciphertext standard layout? 1
Is GaloisKeys trivially copyable? 0
Is Ciphertext trivially copyable? 0

kimlaine commented 4 years ago

I see. Could you just allocate the ciphertexts or key objects from the heap and pass those pointers around?

seal::Ciphertext *my_ct = new seal::Ciphertext();
// use it and pass around wherever.
delete my_ct;

If you want to use SEAL purely from C, that's possible as well with the SEAL_C wrapper library, which is almost complete in terms of functionality. It was not really meant to be used directly (we used it to build the C# wrappers), so you may have to replicate some of the work we did for C# in C, but it's all possible and shouldn't be too much work unless you really need the complete SEAL API available.
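
The "allocate on the heap and pass the pointer around" suggestion maps naturally onto an opaque-handle C interface. The sketch below shows only the pattern. `FakeCiphertext` is a stand-in so the example compiles without SEAL, and `ct_create`, `ct_size`, and `ct_destroy` are hypothetical names, not functions from the SEAL_C wrapper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for seal::Ciphertext so the sketch compiles without SEAL (assumption).
class FakeCiphertext {
public:
    explicit FakeCiphertext(std::size_t n) : data_(n, 0) {}
    std::size_t size() const { return data_.size(); }
private:
    std::vector<std::uint64_t> data_;  // heap payload owned by the object
};

extern "C" {

// C code sees only an opaque handle, never the C++ type (hypothetical names).
typedef void *ct_handle;

// Allocate once on the C++ side; return NULL on failure instead of throwing
// an exception across the C boundary.
ct_handle ct_create(std::size_t coeff_count) {
    try {
        return new FakeCiphertext(coeff_count);
    } catch (...) {
        return nullptr;
    }
}

// Use the object through the handle; nothing is serialized or copied.
std::size_t ct_size(ct_handle h) {
    assert(h != nullptr);
    return static_cast<FakeCiphertext *>(h)->size();
}

// Release the object exactly once when the C side is done with it.
void ct_destroy(ct_handle h) {
    delete static_cast<FakeCiphertext *>(h);
}

} // extern "C"
```

With this shape the C caller stores an ordinary pointer between calls, so a ciphertext outlives any particular function scope without a save/load round trip.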

ReverseControl commented 4 years ago

No.

I need to allocate a block of memory and place the Ciphertext/RelinKeys/GaloisKeys in that block. new will allocate its own memory somewhere, and in C land I don't want to know what happens over time with memory alloc/dealloc through new/delete, nor do I want to find out. Not to mention that the Ciphertext class has its own memory pool manager. I need to place ciphertexts, relin keys, and GaloisKeys in a struct and be able to treat it like a legit POD. I need the typecasting to avoid serializing/deserializing and unnecessary reads, but also to keep track of memory, so that I avoid leaks over time and unnecessary headaches from C/C++ subtleties in memory management.

I want to be able to manage the memory where these objects are placed because I don't know what will happen in C land, and also because over time I can reuse what I allocate without having to alloc/dealloc unnecessarily with new/delete. memcpy is a great simultaneous destructor/constructor; I would just overwrite stuff, no allocations needed. To do so I need a POD.

In short, if I get to allocate the memory with, say, malloc, and then place the Ciphertext/RelinKeys/GaloisKeys objects in that block of memory as PODs in any order I see fit, then yes.
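
For context on why malloc alone does not give this result with a non-POD class, here is a minimal sketch using placement new. `FakeCiphertext` is a stand-in built on `std::vector` (an assumption; the real class draws from SEAL's memory pool, as noted above), and `inside` and `payload_is_in_block` are hypothetical helpers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>
#include <vector>

// Stand-in for a SEAL object: a small handle that owns a separate heap buffer.
struct FakeCiphertext {
    std::vector<std::uint64_t> data;
    explicit FakeCiphertext(std::size_t n) : data(n, 7) {}
};

// True if p points inside [block, block + bytes).
bool inside(const void *p, const void *block, std::size_t bytes) {
    const auto a = reinterpret_cast<std::uintptr_t>(p);
    const auto b = reinterpret_cast<std::uintptr_t>(block);
    return a >= b && a < b + bytes;
}

// Construct the object in caller-owned memory and report whether its
// coefficient payload ended up inside that same block.
bool payload_is_in_block() {
    void *block = std::malloc(sizeof(FakeCiphertext));
    assert(block != nullptr);
    FakeCiphertext *ct = new (block) FakeCiphertext(1024);  // placement new
    const bool result = inside(ct->data.data(), block, sizeof(FakeCiphertext));
    ct->~FakeCiphertext();  // placement new requires an explicit destructor call
    std::free(block);
    return result;
}
```

The caller controls where the object itself lives, but the payload is still allocated separately by the object, so copying the block with memcpy would duplicate a pointer rather than the data.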

kimlaine commented 4 years ago

To answer your question: unfortunately, the data structures in SEAL are really far from POD, and I don't think there is any chance of changing the library enough to make that happen.

SEAL uses its own memory manager to ensure reuse of allocated memory; this works well because the allocations are often of only very specific sizes. I'm still not sure I understand what headaches you would expect from differences in C and C++ memory allocation. Are you writing your own HE implementation in C that would be able to operate on SEAL objects? Otherwise I don't see the problem of allocating memory in C++ and using it again in C++ (in SEAL function calls).
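
The type-trait results reported earlier in the thread (standard layout: 1, trivially copyable: 0) are what one would expect from any class that owns heap memory. A minimal sketch with a hypothetical `OwningBuffer` class, which is not SEAL code:

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// A plain aggregate of integers: safe to memcpy.
struct PlainHeader {
    std::uint64_t size;
    std::uint64_t degree;
};

// Hypothetical class that owns a heap buffer, loosely analogous to an object
// holding a pointer to dynamically sized coefficient data.
class OwningBuffer {
public:
    explicit OwningBuffer(std::size_t n) : data_(new std::uint64_t[n]()), size_(n) {}
    OwningBuffer(const OwningBuffer &) = delete;
    OwningBuffer &operator=(const OwningBuffer &) = delete;
    ~OwningBuffer() { delete[] data_; }  // user-provided destructor
    std::size_t size() const { return size_; }
private:
    std::uint64_t *data_;
    std::size_t size_;
};

// Same access control for all members, no virtual functions: standard layout.
static_assert(std::is_standard_layout<OwningBuffer>::value,
              "member layout is C-compatible");
// The user-provided destructor rules out trivial copyability, so a memcpy'd
// copy would share (and later double-free) the same buffer.
static_assert(!std::is_trivially_copyable<OwningBuffer>::value,
              "not safe to memcpy");
static_assert(std::is_trivially_copyable<PlainHeader>::value,
              "plain data is safe to memcpy");
```

Standard layout only describes how the members are arranged. It is trivial copyability that would make a raw byte copy a valid object, and an owning pointer prevents that.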

ReverseControl commented 4 years ago

The POD would be read only. It would never be used as an output; there would never be any writes to it. I just need to be able to convert a Ciphertext into a POD so that I may use it as input for future calculations.

My suggestion above takes care of the internal memory allocation: it would become unnecessary, because once a POD is created it would never change size, need reallocation, or hold pointers to anything outside the POD itself. Pointers to internal structures would instead be offsets from the beginning of the POD to the right "member".
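
The offset idea can be sketched as a read-only blob whose table stores byte offsets from the start of the blob instead of addresses. Everything here is hypothetical (`build_blob`, `read_word`, and the layout itself); it illustrates the proposal and is not a format SEAL provides.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

using u64 = std::uint64_t;

// Hypothetical blob layout (all fields are 64-bit words):
//   [entry_count][offset_0][len_0] ... [offset_{n-1}][len_{n-1}][payload words...]
// Offsets are in bytes from the start of the blob, so no field holds an address.

static u64 load_u64(const unsigned char *p) {
    u64 v;
    std::memcpy(&v, p, sizeof v);
    return v;
}

static void store_u64(unsigned char *p, u64 v) {
    std::memcpy(p, &v, sizeof v);
}

// Pack several word arrays (e.g. the parts of a key) into one blob.
std::vector<unsigned char> build_blob(const std::vector<std::vector<u64>> &entries) {
    const std::size_t table_bytes = sizeof(u64) * (1 + 2 * entries.size());
    std::size_t total = table_bytes;
    for (const auto &e : entries) total += e.size() * sizeof(u64);

    std::vector<unsigned char> blob(total);
    store_u64(blob.data(), entries.size());
    std::size_t cursor = table_bytes;
    for (std::size_t i = 0; i < entries.size(); ++i) {
        unsigned char *slot = blob.data() + sizeof(u64) * (1 + 2 * i);
        store_u64(slot, cursor);                           // byte offset of entry i
        store_u64(slot + sizeof(u64), entries[i].size());  // length in words
        const std::size_t bytes = entries[i].size() * sizeof(u64);
        if (bytes != 0) std::memcpy(blob.data() + cursor, entries[i].data(), bytes);
        cursor += bytes;
    }
    return blob;
}

// Read word j of entry i from a blob located at any address.
u64 read_word(const unsigned char *blob, std::size_t i, std::size_t j) {
    assert(i < load_u64(blob));
    const unsigned char *slot = blob + sizeof(u64) * (1 + 2 * i);
    const u64 offset = load_u64(slot);
    assert(j < load_u64(slot + sizeof(u64)));
    return load_u64(blob + offset + j * sizeof(u64));
}
```

Because lookups go through offsets, the blob stays valid after being copied to a different address, which is the property a raw pointer member cannot offer. The trade-off is that the blob is fixed at build time and cannot grow in place.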

WeiDaiWD commented 3 years ago

Objects in SEAL such as Ciphertext, PublicKey, RelinKeys, and GaloisKeys do require resizing. The data_ member of a Ciphertext is a pointer to memory occupied by a variable number of polynomials whose sizes can change. Without resizing, memory consumption in SEAL would go up, and many memcpy calls would have to be made instead of one.
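
To give a rough sense of why a fixed-size layout costs memory, the sketch below computes the coefficient storage for a few ciphertext shapes. It assumes one 64-bit word per coefficient, per RNS prime, per polynomial. The parameters (degree 8192, four primes) are example values chosen for illustration, and `ciphertext_data_bytes` is a hypothetical helper, not a SEAL function.

```cpp
#include <cstddef>
#include <cstdint>

// Bytes of coefficient data under the stated assumption:
// polynomials x RNS primes x coefficients x 8 bytes.
constexpr std::size_t ciphertext_data_bytes(std::size_t poly_count,
                                            std::size_t coeff_mod_count,
                                            std::size_t poly_degree) {
    return poly_count * coeff_mod_count * poly_degree * sizeof(std::uint64_t);
}

// Example shapes (illustrative parameters only).
constexpr std::size_t kFresh    = ciphertext_data_bytes(2, 4, 8192);  // 2 polynomials
constexpr std::size_t kAfterMul = ciphertext_data_bytes(3, 4, 8192);  // 3 polynomials
constexpr std::size_t kSwitched = ciphertext_data_bytes(2, 2, 8192);  // fewer primes

static_assert(kFresh == 524288, "512 KiB");
static_assert(kAfterMul == 786432, "768 KiB");
static_assert(kSwitched == 262144, "256 KiB");
```

Under these assumptions, a fixed-size POD would have to reserve the largest shape (768 KiB here) for every ciphertext, even one that needs only 256 KiB after its modulus has been reduced.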