Closed JaydevSR closed 1 year ago
The current interface seems not friendly to implement other types of replica exchange schemes like pH replica exchange and general Hamiltonian replica exchange simulations.
However I think the temperature data should be stored at the level of the simulator rather than at the level of the system, in line with the other thermostats. In this case that would mean removing references to temperature in these two structs and having a
TemperatureREMD
simulator or similar, which would contain temperature data in its struct along with the thermostat to use. What do you think?
I was thinking about this as well as we will need different "sub" simulators for each replica with different coupling temperature for temp-REMD, so this will be better. We still need to maintain common indices between the simulator replicas and the system replicas so that there is no ambiguity. I will look into simplifying this.
I have added the supporting functions to the ReplicaSystem
type in addition to the suggested edits. Many function for System
and ReplicaSystem
are identical. Should I use a union type for that to simplify the code? Next I am also working of a simulator type TemperatureREMD
and a method of simulate!
for that.
Merging #64 (b82d1c7) into master (7cf63d6) will decrease coverage by
0.28%
. The diff coverage is81.75%
.
@@ Coverage Diff @@
## master #64 +/- ##
==========================================
- Coverage 85.26% 84.98% -0.29%
==========================================
Files 31 31
Lines 3190 3323 +133
==========================================
+ Hits 2720 2824 +104
- Misses 470 499 +29
Impacted Files | Coverage Δ | |
---|---|---|
src/types.jl | 76.83% <73.68%> (-3.87%) |
:arrow_down: |
src/simulators.jl | 96.49% <88.70%> (-2.91%) |
:arrow_down: |
src/loggers.jl | 95.97% <100.00%> (+0.28%) |
:arrow_up: |
src/chain_rules.jl | 46.80% <0.00%> (-5.32%) |
:arrow_down: |
src/zygote.jl | 77.84% <0.00%> (+0.29%) |
:arrow_up: |
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.
Good changes. Yes you can use a union type, possibly we could introduce an abstract type later.
I think the AtomsBase interface requires that the index given to position
and velocity
is the atom index, so the replica index plus atom index used here would be confusing. Possibly use the first replica for now in these functions?
For making the simulation, I first define a simulation type:
struct TemperatureREMD{N, T, S, SD}
temps::StaticVector{N, T}
indices::StaticVector{N, Int}
simulators::SD
end
A constructor is defined separately for this type that I am omitting here right now. Then a new methods for the simulate!
function:
function simulate!(;
sys::ReplicaSystem,
sim::TemperatureREMD,
n_cycles::Int,
cycle_length::Int;
parallel=true
)
if sys.n_replicas != length(sim.indices)
error("Number of replicas in ReplicaSystem and simulators in TemperatureREMD must match.")
end
for cycle=1:n_cycles
for idx in sim.indices
simulate!(sys.replicas[idx], sim.simulators[idx], cycle_length; parallel=parallel) #flag1
end
if cycle != n_cycles
n = rand(1:sys.n_replicas)
m = mod(n, sys.n_replicas) + 1
β_n, β_m = 1/sim.temps[n], 1/sim.temps[m]
V_n, V_m = potential_energy(sys.replicas[n]), potential_energy(sys.replicas[m]) #flag2
Δ = (β_m - β_n)*(V_n - V_m)
if Δ <= 0 || rand() < exp(-Δ)
# exchange coordinates
sys.replicas[n].coords, sys.replicas[m].coords = sys.replicas[m].coords, sys.replicas[n].coords
# scale velocities
sys.replicas[n].velocities *= sqrt(β_n/β_m)
sys.replicas[m].velocities *= sqrt(β_m/β_n)
end
end
end
end
Note: Right now the replica simulation is not in parallel.
The main issue here is that sys.replicas[i]
has type IndividualReplica
for which many of other function will need new methods, but if it had the type System
then all of this will work without any problem and we don't need to make any changes to other functions (#flag1
and #flag2
) or define new methods. But if we make all the replicas their own systems, then there is issue of redundant information in all of them. Any suggestions?
Reference for T-REMD algorithm that I am using: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6484850/
The simulator outline looks good. In order to follow the simulate!(sys, sim, n_steps)
convention it might be better to store an exchange_time
in TemperatureREMD
and calculate n_cycles
and cycle_length
in simulate!
.
You raise an interesting point about types. I think the most flexible approach might just be to have multiple System
s. This opens up the possibility of setting up the systems differently too, which some people might find use for. Longer term each replica would need a different box_size
anyway (barostats change the box size) and I think each replica should have its own neighbour finder since CellListMapNeighborFinder
stores some neighbour info in the struct.
That leaves the atoms
, atoms_data
and interaction fields as duplicated, but since arrays are stored by reference each System
can just point to the same memory for this. You could still have ReplicaSystem
which just holds a list of System
s and does relevant setup in the constructor.
How does that sound?
(Accidentally clicked close sorry)
The simulator outline looks good. In order to follow the simulate!(sys, sim, n_steps)
convention it might be better to store an exchange_time
in TemperatureREMD
and calculate n_cycles
and cycle_length
in simulate!
.
You raise an interesting point about types. I think the most flexible approach might just be to have multiple System
s. This opens up the possibility of setting up the systems differently too, which some people might find use for. Longer term each replica would need a different box_size
anyway (barostats change the box size) and I think each replica should have its own neighbour finder since CellListMapNeighborFinder
stores some neighbour info in the struct.
That leaves the atoms
, atoms_data
and interaction fields as duplicated, but since arrays are stored by reference each System
can just point to the same memory for this. You could still have ReplicaSystem
which just holds a list of System
s and does relevant setup in the constructor.
How does that sound?
For the constructor, in order to support multiple sub simulators and couplings, I am having to write multiple checks:
function TemperatureREMD(;
simulator, exchange_time, temps, dt,
kwargs...)
S = simulator
T = eltype(temps)
N = length(temps)
ET = typeof(exchange_time)
temps = SA[temps...]
indices = SA[collect(1:length(temps))...]
coupling_required = false
if simulator ∈ [VelocityVerlet, Verlet, StormerVerlet]
coupling_required = true
coupling = kwargs[:coupling]
end
if coupling_required && coupling ∈ [AndersenThermostat, BerendsenThermostat]
couplings = [coupling(temps[i], kwargs[:coupling_const]) for i in indices]
elseif coupling_required && coupling ∈ [RescaleThermostat]
couplings = [coupling(temps[i]) for i in indices]
elseif coupling_required
error("Temperature coupling required for TemperatureREMD using $(simulator) simulator.")
end
if simulator==Langevin
simulators = Dict(
[i, simulator(dt=dt, temperature=temps[i], friction=kwargs[:friction],
remove_CM_motion=kwargs[:remove_CM_motion])]
for i in indices
)
elseif simulator ∈ [VelocityVerlet, Verlet]
simulators = Dict(
[i, simulator(dt=dt, temperature=temps[i], coupling=couplings[i],
remove_CM_motion=kwargs[:remove_CM_motion])]
for i in indices
)
elseif simulator == StormerVerlet
simulators = Dict(
[i, simulator(dt=dt, temperature=temps[i], coupling=couplings[i])]
for i in indices
)
else
error("Simulator type $(simulator) not supported.")
end
SD = typeof(simulators)
return TemperatureREMD{N, T, S, SD, ET}(temps, indices, simulators, exchange_time)
end
But when new simulators are added, this function will also need to be changed. This does not seem good.
I agree it doesn't scale well. This could just be left to the user for now, i.e. the constructor takes a list of simulators as an argument and checks that it is the same length as the list of temperatures. The docs would make clear that the simulators should be set up with the appropriate coupling/temperatures.
I was looking a bit into how to parallelize the whole replica simulations and would like to make a few points about that:
parallel=true
versions of other functions use Threads.@threads
macro, which is statically scheduled. So, If I use Threads.@threads
for parallelizing replica simulation, then only one thread will be available for those functions.@sync
and @spawn
for all the parallelization, this will change:
Threads.@threads for i=1:n
do_some_work()
end
to:
@sync for i=1:n
Threads.@spawn do_some_work()
end
Threads.@thread
macro is going to change. And it will be dynamically scheduled by default. See: https://github.com/JuliaLang/julia/blob/v1.8.0-rc1/NEWS.md#multi-threading-changes. So maybe these changes won't be necessary after the release of v1.8?I think target Julia 1.8 behaviour, it may be that we drop support for Julia 1.7 soon anyway depending on advances in the AD landscape. The parallel behaviour here doesn't have to mirror the existing behaviour though since it is more complex.
As you suggest it would be useful to be able to run each simulation on a certain number of threads, e.g. 16 cores could run 4 replicas on 4 cores each. Is this possible using the threads setup? The current parallel
keyword argument may be insufficient here, I have considered changing it to n_threads
across the whole package previously. Would that help?
Also, sorry for that large breaking change recently. Hopefully there will be fewer breaking changes as the right API emerges!
I have added the simulator for TemperatureREMD. Right now I am using the parallel
kwarg but I agree with you that its usage in this case is not ideal.
The current @sync
and Threads.@spawn
method chooses a free thread to perform one cycle of the simulation and say if there are 4 replicas then 4 threads will be used leaving the other threads free for the other parallel calls. But it is not ensured that all the threads are distributed equally among all the replicas (which may be problematic if the number of nested parallel calls is close to the number of free threads due to which workload may not be evenly balanced).
About using using an n_threads
argument, in that case we can make use of the Threads.foreach()
function (see: https://docs.julialang.org/en/v1/base/multi-threading/#Base.Threads.foreach) which seems useful in our case.
Also, I performed some initial simulations (akin to the Leonard Jones gas example in docs) to check the sanity of the simulator and here are some results, for 4 replicas of the system at 100 K, 105 K, 110 K and 115 K:
Still I am a bit skeptical about the replica exchange part and I will check it once again to ensure that I am not doing something wrong.
Yes I think let's switch to n_threads
then, could you do that? It would involve replacing parallel=false
with n_threads=1
and parallel=true
with n_threads=default_n_threads
throughout, where default_n_threads
is defined alongside default_float
as Threads.nthreads()
. Then change the relevant logic in forces
.
With regards to testing I would use a large temperature range, at least 50 K difference per replica, and have an exchange time slow enough to see temperature equilibration. You could use a Langevin integrator or a lower Andersen coupling constant if it is struggling to equilibrate.
@jgreener64 Okay, I will try making these changes. Also thanks for suggestions on test simulations.
It seems that making these changes using just the base library will be a bit more of a trouble than we need to handle. Instead JuliaFolds/FLoops.jl, already supports such a change in number of threads utilized. See: https://juliafolds.github.io/data-parallelism/howto/faq/#set-nthreads-at-run-time. This library will also support much more different flavors of thread based parallelization (via JuliaFolds/FoldsThreads.jl) which can come in handy for load balancing. Should I use this library if adding it as a dependency won't be a problem?
Taking the dependency is fine so yes go ahead and try it out if it provides functionality beyond base Julia. Long term I think we will change parallel
to n_threads
anyway though.
I guess using FLoops.jl may also allow the CPU forces
path https://github.com/JuliaMolSim/Molly.jl/blob/3bed491a9d183492262d5ec323a0f509923e51f8/src/force.jl#L221 to be written more concisely.
Okay then I will experiment with a bit and make a PR if I am able to get it to work as expected.
I ran the simulator on Leonard-Jones gas and it seem to work without any problems. What possible tests can I make that will better highlight the replica exchange at work that this basic simulation? After this I can write some unit tests to improve coverage.
Great. I think a Lennard-Jones simulation is okay for now but it would be good to see the graphs you had previously with a clearer temperature profile.
It also would be useful to log when the exchanges took place, then compare the number of exchanges over a simulation to an expected number. Perhaps the ReplicaSystem
could have a loggers
field, with the default being a new ReplicaExchangeLogger
that records the time step, the indices exchanged and Δ
for each exchange.
On the topic of loggers, I think ReplicaSystem
shouldn't store replica_loggers
in its struct
. replica_loggers
should be used in the constructor as is, then the loggers can be accessed with replica_system.replicas[i].loggers
.
I performed a test simulation on 3 replicas of Leonard-Jones gas on 8 threads, using DistanceNeighborFinder
and a Langevin
simulator.
temp_vals = [120.0u"K", 160.0u"K", 200.0u"K"]
simulator = TemperatureREMD(
dt=0.005u"ps",
temperatures=temp_vals,
simulators=[
Langevin(
dt=0.005u"ps",
temperature=tmp,
friction=0.1u"ps^-1",
)
for tmp in temp_vals],
exchange_time=2.5u"ps",
);
The exchanges were logged with the help of the new ReplicaExchangeLogger
and we have the following exchanges in form of (step, exchanged indices)
after simulating for 20_000
steps:
(9500, (2, 3))
(12500, (1, 2))
(15500, (1, 2))
(17000, (3, 1))
(18000, (2, 3))
(18500, (2, 3))
These are plots for temperatures of each replica and the x-coordinates of the one of the atoms in these replicas:
Temperatures:
Coordinates:
Edit: Changed the plots after fixing the velocity exchange
Benchmarks
System
with 1000 atoms and DistanceNeighborFinder
, LeonardJones
interactions using Langevin
simulator for 5000 steps (2 threads):
77.253948 seconds (582.38 k allocations: 4.966 GiB, 0.87% gc time)
ReplicaSystem
with 4 replicas of above system with same simulator and steps (8 threads):
173.750242 seconds (2.18 M allocations: 20.255 GiB, 2.47% gc time, 0.45% compilation time)
Ratio of times: 2.25
I did profiling on the simulate!
function and the bottleneck seems to be the part where the replicas are being exchanged. Most of the part there is type unstable. I will try to make it type stable and see if there are any improvements.
It seems that I misinterpreted the profiles before. The actual issue with the drop in performance seems to do with task scheduling. Here are the flame graph of the profiles:
In the second image we can see this more clearly, more than 60% of time is spent in the wait()
in the call stack of find_neighbors!
. This as far as I can think should be because of clashing scheduling of @floops
and @spawn
. I think I may be able to fix this by using one of the different executors for the parallel loop in find_neighbors!
. See: https://juliafolds.github.io/FoldsThreads.jl/dev/#FoldsThreads.WorkStealingEx. I will see if this works.
So I experimented a bit to see if there can be an improvement or not. After trying different executors and a pure julia implementation also, the case with 2 threads 1 system and 8 threads 4 replicas, gives no improvement. But for a 1 thread 1 system and 4 threads 4 replicas, the ratio was improved a bit to about 1.6, as my machine supports at most 8 threads so I think the wait time was high in the first case because all threads were getting used. So, maybe it is best to use less than the maximum available threads for the replica simulation to avoid such clashes.
But still julia v1.8 will bring some changes to multithreading, so it will be interesting to see if that gives any improvement for nested threaded loops.
Nice one :tada:
Thinking about next steps, it might be worth considering how to extend this to non-temperature REMD simulators. The system type and the logger are already independent of temperature. Would it be possible to modify the simulate!
function to take in a function that runs the lines
if dimension(sys.energy_units) == u"𝐋^2 * 𝐌 * 𝐍^-1 * 𝐓^-2"
k_b = sys.k * T(Unitful.Na)
else
k_b = sys.k
end
T_n, T_m = sim.temperatures[n], sim.temperatures[m]
β_n, β_m = 1/(k_b*T_n), 1/(k_b*T_m)
neighbors_n = find_neighbors(sys.replicas[n], sys.replicas[n].neighbor_finder; n_threads=n_threads)
neighbors_m = find_neighbors(sys.replicas[m], sys.replicas[m].neighbor_finder; n_threads=n_threads)
V_n, V_m = potential_energy(sys.replicas[n], neighbors_n), potential_energy(sys.replicas[m], neighbors_m)
Δ = (β_m - β_n)*(V_n - V_m)
should_exchange = Δ <= 0 || rand(rng) < exp(-Δ)
return should_exchange, Δ
and determines whether an exchange should take place? Then simulators such as TemperatureREMD
can be a particular case of a generic REMD
simulator that takes this function as one of its arguments, and the simulation code can be reused.
Yay 🎊 .
Your suggestion seem excellent. This will help in generalizing the simulation code. I will go ahead and make these additions. Also, I think I can try implementing the Hamiltonian-REMD next, that would require the replicas to have different potentials. So, I will also need to make some changes to ReplicaSystem
to allow for this.
Includes:
ReplicaWrapper
andReplicaSystem
for handling replica exchange simulations.ReplicaSystem
is equivalent toSystem
but containing multiple replicas at different temperatures.ReplicaWrapper
is a wrapper for a particular replica.