Status: Closed (henry2004y closed this 5 months ago)
After #128, GC performance improved quite a bit:
D:\Research\MHD-AEPIC>julia -t 1 trace_cleanup.jl
[ Info: Number of threads: 1
[ Info: Number of particles: 100000
[ Info: Tracing trajectories...
7.650034 seconds (1.04 M allocations: 999.937 MiB, 4.10% gc time, 7.33% compilation time)
7.661742 seconds (546.69 k allocations: 974.943 MiB, 6.56% gc time, 2.32% compilation time)
>julia -t 2 trace_cleanup.jl
[ Info: Number of threads: 2
[ Info: Number of particles: 100000
[ Info: Tracing trajectories...
7.297970 seconds (1.04 M allocations: 1003.599 MiB, 3.94% gc time, 6.74% compilation time)
4.980314 seconds (546.75 k allocations: 973.270 MiB, 4.75% gc time, 3.98% compilation time)
>julia -t 4 trace_cleanup.jl
[ Info: Number of threads: 4
[ Info: Number of particles: 100000
[ Info: Tracing trajectories...
7.424457 seconds (1.04 M allocations: 1003.394 MiB, 2.67% gc time, 6.58% compilation time)
3.157897 seconds (544.65 k allocations: 813.760 MiB, 4.77% gc time, 9.59% compilation time)
When I tried to run the Boris pusher with multithreading, I got no speedup (it was actually a slowdown), mostly due to a drastic increase in GC time. Here is a demo using ChunkSplitters.jl:
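For reference, a minimal sketch of a chunked threading loop of this kind. This is not the original demo: it uses only Base (`Iterators.partition` and `Threads.@spawn`) in place of ChunkSplitters.jl, and `trace_one` and `trace_all` are hypothetical stand-ins for the real per-particle tracer and driver.

```julia
using Base.Threads: @spawn, nthreads

# Hypothetical stand-in for tracing one particle; returns a scalar result.
trace_one(i::Int) = sum(sin(i * k * 1e-3) for k in 1:1_000)

# Split the particle indices into contiguous chunks (one per task), trace each
# chunk on its own task, and write results into a preallocated vector.
function trace_all(nparticles::Int; ntasks::Int = nthreads())
    results = Vector{Float64}(undef, nparticles)
    chunksize = max(1, cld(nparticles, ntasks))
    tasks = map(Iterators.partition(1:nparticles, chunksize)) do idx
        @spawn for i in idx
            results[i] = trace_one(i)   # each index is written by exactly one task
        end
    end
    foreach(wait, tasks)
    return results
end

results = trace_all(10_000)
```

Wrapping the call in `@time trace_all(10_000)` and launching with `julia -t 1`, `-t 2`, and so on gives the same kind of comparison as in this thread. If GC time grows with the thread count, the per-particle work is probably allocating on the heap.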
When running with 1 thread:
When running with 2 threads:
Maybe reducing the allocations inside trace_trajectory would help?
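To illustrate what reducing allocations could look like, here is a hedged sketch of a Boris velocity update written on plain tuples, so the inner loop does not touch the heap. It is not the package's actual `trace_trajectory`; `cross3`, `boris_velocity`, and `trace!` are hypothetical names, and uniform fields are assumed for simplicity.

```julia
# Cross product on plain tuples; tuples are stack-allocated, so no heap traffic.
cross3(a, b) = (a[2]*b[3] - a[3]*b[2], a[3]*b[1] - a[1]*b[3], a[1]*b[2] - a[2]*b[1])

# One Boris velocity update for charge-to-mass ratio qm and time step dt.
function boris_velocity(v::NTuple{3,Float64}, E::NTuple{3,Float64},
                        B::NTuple{3,Float64}, qm::Float64, dt::Float64)
    h = 0.5 * qm * dt
    vminus = v .+ h .* E                   # first half electric kick
    t = h .* B                             # rotation vector
    s = (2 / (1 + sum(abs2, t))) .* t
    vprime = vminus .+ cross3(vminus, t)
    vplus = vminus .+ cross3(vprime, s)    # magnetic rotation
    return vplus .+ h .* E                 # second half electric kick
end

# Advance the velocity and record the speed at each step in a caller-provided
# buffer, so the loop itself allocates nothing.
function trace!(speeds::Vector{Float64}, v0, E, B, qm, dt)
    v = v0
    for n in eachindex(speeds)
        v = boris_velocity(v, E, B, qm, dt)
        speeds[n] = sqrt(sum(abs2, v))
    end
    return speeds
end

speeds = trace!(zeros(1_000), (1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), 1.0, 0.1)
```

With the output buffer preallocated by the caller and the state kept in tuples, each thread's loop should produce little or no garbage, which is what keeps GC from serializing the threads.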