ITensor / ITensorParallel.jl

Parallel tools for ITensors.jl.
MIT License

Add support for forcing the GC when using Distributed #19

Closed: mtfishman closed this 1 year ago

mtfishman commented 1 year ago

This adds support for forcing Julia's garbage collector to run during certain operations when using the Distributed backend.

@nbaldelli reported that Julia crashes when using the Distributed backend of ITensorParallel.jl with larger MPS bond dimensions. This may be related to Julia not running the garbage collector properly under Distributed parallelization; for example, see https://discourse.julialang.org/t/from-multithreading-to-distributed/101984/6.

By default, it currently triggers the garbage collector within the remote calls that apply the effective Hamiltonian, change the position, etc., whenever less than 6 GB of memory is left on the process. This default can be changed with:

ITensorParallel.set_gc_gb_threshold!(3)

where the value is in GB. Setting it very large (e.g. ITensorParallel.set_gc_gb_threshold!(Inf)) means the garbage collector is always triggered in the operations where the call is hard-coded in this PR, while setting it very small (e.g. ITensorParallel.set_gc_gb_threshold!(0)) means it is never forced and Julia handles all garbage collection with its default behavior.
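To make the threshold semantics concrete, here is a minimal sketch of the kind of check described above. It is not the code in this PR: the names `GC_GB_THRESHOLD`, `set_threshold!`, `free_memory_gb`, and `maybe_gc` are hypothetical, and using `Sys.free_memory()` (system-wide free memory) as the measure of "memory left" is an assumption. Only `Sys.free_memory()` and `GC.gc()` are real Julia Base functions.

```julia
# Hypothetical stand-in for the package's internal threshold (in GB).
# The 6.0 default mirrors the default described in this PR.
const GC_GB_THRESHOLD = Ref{Float64}(6.0)

# Hypothetical analogue of ITensorParallel.set_gc_gb_threshold!.
function set_threshold!(gb::Real)
    GC_GB_THRESHOLD[] = Float64(gb)
    return nothing
end

# Free memory in GB as reported by the operating system (assumed measure).
free_memory_gb() = Sys.free_memory() / 2^30

# Force a full collection only when free memory is below the threshold.
# Returns true if a collection was forced, false otherwise.
function maybe_gc(free_gb::Real=free_memory_gb())
    if free_gb < GC_GB_THRESHOLD[]
        GC.gc()
        return true
    end
    return false
end
```

With this comparison, a threshold of `Inf` forces a collection on every call, because any finite amount of free memory is below it. A threshold of `0` never forces one, because free memory is never negative. That matches the two limiting cases described above.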

@b-kloss @awietek

codecov-commenter commented 1 year ago

Codecov Report

Merging #19 (070c530) into main (79215c4) will increase coverage by 2.45%. Report is 3 commits behind head on main. The diff coverage is 85.93%.

@@            Coverage Diff             @@
##             main      #19      +/-   ##
==========================================
+ Coverage   67.97%   70.42%   +2.45%     
==========================================
  Files           9       10       +1     
  Lines         281      328      +47     
==========================================
+ Hits          191      231      +40     
- Misses         90       97       +7     

Files Changed             Coverage Δ
src/ITensorParallel.jl    100.00% <ø> (ø)
src/mpisumterm.jl         81.57% <0.00%> (ø)
src/force_gc.jl           66.66% <66.66%> (ø)
src/foldssum.jl           58.18% <81.81%> (+10.95%) ↑
src/distributedsum.jl     97.14% <96.87%> (+3.39%) ↑