alanderos91 / BioSimulator.jl

A stochastic simulation framework in Julia.
https://alanderos91.github.io/BioSimulator.jl/stable/

Performance issues, memory leak, high CPU usage? #44

Open alext-extracellular opened 1 month ago

alext-extracellular commented 1 month ago

Hi there, I am not sure if I am using the package incorrectly, but when I run the simplest possible simulation script I see very high memory and CPU usage, and it hangs without producing any results. With save_points set to nothing, memory climbed to 5.9 GB and the process was OOM-killed. With save_points set as below, it sat at 533 MB and 12.4% CPU for a while before I killed that run as well. Any tips would be appreciated!

    using BioSimulator, Plots

    # initialize
    network = Network("BDI")

    # species definitions; add components with <=
    network <= Species("X", 5)

    # reaction definitions
    network <= Reaction("birth", 2.0, "X --> X + X")
    network <= Reaction("death", 1.0, "X --> 0")
    network <= Reaction("immigration", 0.5, "0 --> X")

    state, model = parse_model(network)

    result = simulate(state, model, EnhancedDirect(),
                      tfinal = 100.0,
                      save_points = 0:10:100)

    plot(result, summary = :trajectory,
         xlabel = "time", ylabel = "count",
         label = ["X"])
alanderos91 commented 1 month ago
  1. Does this work for you with tfinal=4.0? For reference, this is what I see on my machine:
    result = @time simulate(state, model, EnhancedDirect(),
                            tfinal = 4.0,
                            save_points = 0:0.1:4);
    # 0.000048 seconds (1.19 k allocations: 64.422 KiB)
  2. OOM is not too surprising without save_points. The process is supercritical (birth rate 2.0 exceeds death rate 1.0), so the number of events over any fixed interval grows explosively as X increases, and every one of those events is being saved in the background. See the back-of-the-envelope growth estimate after this list.
  3. The EnhancedDirect() method is going to be quite slow once the number of events over any interval $[t, t+s)$ blows up. Something like HybridSAL() can speed up the simulation if you're willing to sacrifice some accuracy, although it is suspicious that even $\tau$-leaping was slow in your case. For tfinal=40.0 we see sample paths like
    result = @time simulate(state, model, HybridSAL(),
                            tfinal = 40.0,
                            save_points = 0:10:40)
    # 0.000131 seconds (2.32 k allocations: 122.391 KiB)
    # t: 5-element Vector{Float64}:
    #   0.0
    #  10.0
    #  20.0
    #  30.0
    #  40.0
    # x: 5-element Vector{Vector{Int64}}:
    #  [5]
    #  [162074]
    #  [3517801165]
    #  [75666188025181]
    #  [1561806170670646473]

    I suspect we are encountering Int64 overflow after t=40. If you really need to account for huge compartments like that, then it would be possible to modify most of the algorithms to handle BigInt at the cost of some performance.
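
To make point 2 concrete, here is a back-of-the-envelope estimate (plain Julia, not part of the BioSimulator API) using the textbook mean of a linear birth-death-immigration process with the rates from the script above:

    # mean of a linear birth-death-immigration process:
    #   dm/dt = (λ - μ) m + ν,  m(0) = x₀
    #   m(t)  = x₀ exp((λ - μ) t) + ν / (λ - μ) * (exp((λ - μ) t) - 1)
    λ, μ, ν, x₀ = 2.0, 1.0, 0.5, 5.0   # birth, death, immigration rates; initial count
    mean_count(t) = x₀ * exp((λ - μ) * t) + ν / (λ - μ) * (exp((λ - μ) * t) - 1)

    mean_count(40.0)    # ≈ 1.3e18, already approaching typemax(Int64) ≈ 9.2e18
    mean_count(100.0)   # ≈ 1.5e44, far beyond what Int64 (or memory) can hold

So at the requested tfinal = 100.0 the expected count alone is astronomically large, which is why saving every event exhausts memory.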

In either case, BioSimulator should handle overflow behavior better than this.
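
For reference, a minimal sketch (base Julia only, not something BioSimulator does today) of what silent Int64 wraparound looks like, and what checked or arbitrary-precision arithmetic would give instead:

    using Base.Checked: checked_add

    x = typemax(Int64)   # 9223372036854775807
    x + 1                # silently wraps around to -9223372036854775808
    checked_add(x, 1)    # throws OverflowError instead of wrapping
    big(x) + 1           # BigInt keeps growing, at the cost of speed and memory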

alext-extracellular commented 1 month ago

Thank you very much, I hadn't considered how large the population would get. I was able to run some shorter simulations just fine. I noticed that once the counts get too large, Julia hangs, possibly indefinitely, though I'm not sure. Is this due to the Int64 overflow? My real simulations likely won't need anything that large, however. Thanks for the tips; I'm happy to close this for now, unless you think the overflow needs to be addressed.