hpsc-lab / SecureArithmetic.jl

Secure arithmetic operations using fully homomorphic encryption
https://hpsc-lab.github.io/SecureArithmetic.jl
MIT License

Problem with GC #45

Closed ArseniyKholod closed 2 months ago

ArseniyKholod commented 3 months ago

I realized that our heavy-weight functions, like circshift, accumulate memory. In these functions, when a C++ object is allocated, it is stored in Julia only as a pointer. These objects are not freed during the application run unless GC.gc() is called explicitly, which leads to memory overflow.

Today at the HPSC meeting I spoke with Valentin. He explained the reason, many thanks to him!

I will try to reproduce his explanation here. I could have misunderstood something, so consider it only a starting point.

1) When Julia asks to allocate a C++ object, a pointer is created on the Julia side and then the C++ object is allocated.

2) In its normal mode, the GC only looks at recently allocated objects, not at old ones.

3) Our FHE objects are massive, so in the memory heap the lightweight pointer quickly counts as an old object ("hidden behind the massive C++ object") and is not deallocated by the GC in normal mode. The GC also cannot deallocate the C++ object itself, because it is not a Julia object: the C++ object stays in memory until its pointer is collected, and only then can CxxWrap.jl finalize it. Moreover, the GC does not see that memory is almost completely filled by C++ objects; it thinks there is still plenty of it, so it does not enter aggressive mode automatically. As a result the pointer is never collected and memory overflow happens.
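
For illustration, a minimal sketch of the pattern (hypothetical code; the loop and variable names are placeholders, and the ~60 MB per ciphertext is what we observe for 128-bit secure setups):

for step in 1:1000
    # each circshift allocates a fresh C++ ciphertext (~60 MB) behind an
    # 8-byte Julia wrapper; incremental collections do not revisit the old
    # wrappers, so the C++ memory keeps accumulating until overflow
    u = circshift(u, 1)
end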

Solutions: 1) The solution I currently use: call GC.gc() explicitly several times to enter aggressive mode, so the GC looks not only at recent objects but also at old ones. Calling the GC yourself is not good Julia practice, though, so it should be avoided.

2) Valentin said that, if it is possible to configure the OpenFHE C++ library with a custom allocator, we could choose g_malloc; with it, Julia's GC would be able to see that C++ objects fill the memory and would enter aggressive mode by itself.

So I will start exploring g_malloc and the possibility of configuring OpenFHE C++ to use it.

ArseniyKholod commented 3 months ago

Unfortunately, I don't see a way to configure OpenFHE to work with the allocator from GLib, so the only way to remove the created C++ objects is to run GC.gc() explicitly for a full scan of the memory heap and, consequently, remove the pointers to these objects from the old region.

The next idea is to forget about the GC and just call reset on all shared_ptrs within the wrapped C++ functions.

sloede commented 2 months ago

Thanks for the detailed analysis, and thank you to @vchuravy for his input!

@ArseniyKholod Did you try this yet:

The next idea is to forget about the GC and just call reset on all shared_ptrs within the wrapped C++ functions.

And if yes, to what avail?

ArseniyKholod commented 2 months ago

And if yes, to what avail?

Simply calling, for example, ciphertext.reset() does not help. The OpenFHE library uses some static objects holding saved copies of the shared_ptrs, and all of those would have to be reset() as well. I haven't found all of them yet; it is a pretty complex class hierarchy, so I'm still investigating. It looks like a lot of manual coding to deallocate every shared_ptr, but if it works, it is probably better than calling the GC in aggressive mode manually, since that is not efficient...

vchuravy commented 2 months ago

For posterity's sake, I was referring to jl_malloc/jl_free and not g_malloc. One could potentially use something like https://github.com/JuliaLang/julia/pull/55390 to directly manipulate the "apparent heap size". Or run Julia with --heap-size-hint=4G to encourage Julia to run full GC earlier.
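
As a quick illustration of the "apparent heap size" point (just a sketch; Base.gc_live_bytes is an internal but existing helper): it reports only the memory Julia itself accounts for, so allocations made by the C++ library through its default allocator do not show up there and do not push the GC towards a full collection.

# Memory the GC accounts for; C++ objects allocated with the library's own
# allocator are invisible here, so the heap looks almost empty to Julia.
Base.gc_live_bytes()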

In its normal mode, the GC only looks at recently allocated objects, not at old ones.

Julia's GC is generational. There is an "old" and a "young" generation. During an "incremental" collection, Julia only visits "young" objects; every so often Julia runs a "full" collection. Those are more expensive, since they require a full heap traversal.
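
In code, the two modes can be triggered explicitly with the standard GC.gc(full::Bool) API (shown here only to illustrate the distinction):

GC.gc(false)   # incremental collection: only the young generation is visited
GC.gc(true)    # full collection: traverses the entire heap (the default)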

ArseniyKholod commented 2 months ago

For posterity's sake

@vchuravy, thank you for correcting my misunderstanding and for the new idea with --heap-size-hint; perhaps this is the easiest way!

ArseniyKholod commented 2 months ago

I found a possible solution. So what is the problem? The GC is generational and does not remove the pointers to the large C++ objects we use, for example an 8-byte pointer to a Ciphertext that weighs around 60 MB in a secure simulation. So not deleting 8 bytes leads to a memory leak of 60 MB.

The idea is to manage all the pointers ourselves. All objects like Ciphertext, CryptoContext, and so on are managed as shared_ptrs, and I would like to manually reset all of them.

TODO:

  1. Wrap reset for each type in openfhe-julia, e.g. for Ciphertext:

     mod.method("reset",
                [](std::shared_ptr<lbcrypto::CiphertextImpl<lbcrypto::DCRTPoly>>& ptr)
                { ptr.reset(); });
  2. We must manually delete all shared_ptrs, including so-called rvalues like a + b, where a and b are ciphertexts. Add a new field rvalue to each structure in SecureArithmetic.jl to detect temporary objects, e.g. SecureVector:

     mutable struct SecureVector{CryptoBackendT <: AbstractCryptoBackend, DataT}
         data::DataT
         length::Int
         capacity::Int
         context::SecureContext{CryptoBackendT}
         rvalue::Bool

         function SecureVector(data, length, capacity, context::SecureContext{CryptoBackendT},
                               rvalue = false) where CryptoBackendT
             new{CryptoBackendT, typeof(data)}(data, length, capacity, context, rvalue)
         end
     end
  3. All operators should check whether they act on lvalues or rvalues. E.g. a + b + c first computes a + b; this intermediate result is an rvalue that is then used in (a + b) + c, so while performing the sum with c, the rvalue a + b must be reset. Redefine all arithmetic operations accordingly, e.g. multiply:

     function multiply(sv::SecureVector{<:OpenFHEBackend}, scalar::Real)
         cc = get_crypto_context(sv)
         ciphertext = OpenFHE.EvalMult(cc[], sv.data[], scalar)
         secure_vector = SecureVector(ciphertext, length(sv), capacity(sv), sv.context, true)
         if sv.rvalue
             OpenFHE.reset(sv.data)
         end

         secure_vector
     end
  4. All operations create rvalues; if the result is assigned to a variable, it should be converted to an lvalue, e.g. a = b + c. Define a new operator === for this (the name can be changed, of course; see also the usage sketch after this list):

     function ===(sv1::SecureVector, sv2::SecureVector)
         OpenFHE.reset(sv1.data)
         sv1 = sv2
         sv1.rvalue = false
     end

     So if a already holds a ciphertext, use a === b + c instead of a = b + c.

  5. All wrapped functions in openfhe-julia that take a shared_ptr as an argument create a new instance of this shared_ptr somehow intrinsically in CxxWrap.jl, which we cannot reset, so rewrap all such functions without shared_ptr arguments:

     mod.method("EvalMult",
                [](lbcrypto::CryptoContextImpl<lbcrypto::DCRTPoly>& cc,
                   lbcrypto::CiphertextImpl<lbcrypto::DCRTPoly>& ciph, double c)
                { return cc.EvalMult(std::make_shared<lbcrypto::CiphertextImpl<lbcrypto::DCRTPoly>>(ciph), c); });

     instead of

     wrapped.method("EvalMult",
                    static_cast<lbcrypto::Ciphertext<lbcrypto::DCRTPoly>
                                (WrappedT::*)(lbcrypto::ConstCiphertext<lbcrypto::DCRTPoly>,
                                              double) const>(&WrappedT::EvalMult));
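
Putting items 2-4 together, user-facing code might look roughly like the following sketch (hypothetical: a, b, c are existing SecureVectors, and + and * are assumed to forward to the wrapped operations above):

x = (a + b) * 1.5    # the intermediate a + b is an rvalue and is reset
                     # inside the multiplication, as in item 3
# If a variable already holds a ciphertext, the proposed operator from item 4
# replaces plain assignment, so that the old data is reset before rebinding:
# a === b + c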

This requires many changes, but the problem is also serious. For example, for the ciphertext operation a = (a + b) * c + d we lose at least 3 pointers to CiphertextImpl objects, with a total weight of around 180 MB for secure simulations. What if we do it in a loop?...

There are two problems with this solution. 1) It requires a large amount of work, but this seems to be necessary: calling GC.gc() is not what the user probably wants, not to mention the performance of a full GC scan... Nevertheless, I should be able to manage all the changes within a few weeks. 2) Using === instead of = is not intuitive, but Julia does not give access to redefining =...

sloede commented 2 months ago

Thanks a lot for the detailed analysis. If I understand correctly, it boils down to the following:

Did I get it right thus far?

Then, my next question is whether manually calling GC.gc() would fix this memory issue as well?

For example, would putting GC.gc() at the end of each loop (or also at intermediate positions) effectively free up the memory again? Or is there another issue that I don't see?

Also, what is the computational cost of a call to GC.gc() compared to, say, an FHE multiplication? Maybe it is not too expensive?

ArseniyKholod commented 2 months ago

Yes, you are right.

Calling GC.gc() resolves this issue. I call it in every loop iteration for OpenFHEBackend:

if u.context.backend isa OpenFHEBackend
    GC.gc()
end

A full GC call is comparable in cost to a single ciphertext-scalar multiplication, but ~10 times faster than a ciphertext-ciphertext multiplication at 128-bit security.

On Rocinante a single GC call takes ~0.08 s, so for 100 simulation steps it adds ~8 s, while the simulation itself takes >~1000 s. So the performance of GC.gc() is not a big issue here for 128-bit security; manual memory releases are just, IMHO, not typical for Julia.

When running simulations with smaller ring dimensions, like 2^12, the run time of GC.gc() becomes significant (~10-20%), but such a setup is a non-production application without a sufficient security level.
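
For reference, the cost of a full collection on a given machine can be checked directly:

@time GC.gc()   # on Rocinante this reports ~0.08 s per full collection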

Note: The GC.gc() call resolves only the described memory leak. To remove the heavy-weight (several GB) pre-computed evaluation keys during a single Julia REPL session (needed if several applications with different CryptoContexts are run in a row without exiting the REPL in between), I created an additional PR: https://github.com/hpsc-lab/openfhe-julia/pull/61

sloede commented 2 months ago

Hm. Then I think it would be better for now to add a (prominent) note to the README and/or docs of OpenFHE.jl and SecureArithmetic.jl to inform users about this issue and that they need to take care of it manually for now.

Alternatively: @vchuravy Is it possible to manually tell the GC that an object is "heavy", such that it is collected as early as possible?

ArseniyKholod commented 2 months ago

Hm. Then I think it would be better for now to add a (prominent) note to the README and/or docs of OpenFHE.jl and SecureArithmetic.jl to inform users about this issue and that they need to take care of it manually for now.

I added explanations to the README docs of how to deal with the mentioned memory issues: https://github.com/hpsc-lab/OpenFHE.jl/pull/64 and https://github.com/hpsc-lab/SecureArithmetic.jl/pull/44

sloede commented 2 months ago

Thanks. This can be closed then?