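This optimizes the sim_tree function for large N. The proposed implementation stores nodes in a Set, whereas the original uses an array, where deleting a random element is slow because it takes time linear in the number of elements. In the benchmarks below with n = 100_000, the median run time drops from 2.943 s to 283.760 ms, while the memory estimate rises slightly from 136.45 MiB to 149.99 MiB and a larger share of the time is spent in GC. A few minor cleanups, such as removing unused variables, are also included in this change.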
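To illustrate the difference in deletion cost described above, here is a minimal sketch. It is not the code from sim_tree; the helper names remove_from_vector! and remove_from_set! are hypothetical and exist only for this example.

```julia
# Hypothetical helpers for illustration; not taken from sim_tree itself.

# Remove `x` from a Vector: search for it, then shift every later element
# down by one slot. Both steps are linear in the length of the vector.
function remove_from_vector!(nodes::Vector{Int}, x::Int)
    i = findfirst(==(x), nodes)
    i === nothing || deleteat!(nodes, i)
    return nodes
end

# Remove `x` from a Set: a hash lookup, expected constant time.
function remove_from_set!(nodes::Set{Int}, x::Int)
    delete!(nodes, x)
    return nodes
end

v = collect(1:10)
s = Set(1:10)
remove_from_vector!(v, 4)
remove_from_set!(s, 4)
```

Repeating the array version once per node gives roughly quadratic total work, which is consistent with the gap in the timings for n = 100_000.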
main:
julia> @benchmark sim_tree(n = 100_000)
BenchmarkTools.Trial: 2 samples with 1 evaluation.
Range (min … max): 2.936 s … 2.950 s ┊ GC (min … max): 1.83% … 2.44%
Time (median): 2.943 s ┊ GC (median): 2.13%
Time (mean ± σ): 2.943 s ± 9.674 ms ┊ GC (mean ± σ): 2.13% ± 0.43%
█ █
█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█ ▁
2.94 s Histogram: frequency by time 2.95 s <
Memory estimate: 136.45 MiB, allocs estimate: 3792429.
this pull request:
julia> @benchmark sim_tree(n = 100_000)
BenchmarkTools.Trial: 18 samples with 1 evaluation.
Range (min … max): 235.876 ms … 380.271 ms ┊ GC (min … max): 19.88% … 49.79%
Time (median): 283.760 ms ┊ GC (median): 33.86%
Time (mean ± σ): 288.197 ms ± 45.947 ms ┊ GC (mean ± σ): 34.51% ± 10.18%
█ ▃ ▃ ▃
█▇▇▁▁▁▁▁▇█▁▁▁▁▇▁▁▁▁▁▁▁▁▁▁▁▇▁▁█▁▁█▁▁▁▁▁▁▁▁▁▁▁▁▇▇▇▁▁▁▁▁▁▁▁▁▁▁▁▇ ▁
236 ms Histogram: frequency by time 380 ms <
Memory estimate: 149.99 MiB, allocs estimate: 4299296.