heltonmc / LightPropagation.jl

Modeling light transport in turbid media
MIT License
9 stars 3 forks source link

Excessive allocations for derivatives layered media in spatial domain #21

Open heltonmc opened 2 years ago

heltonmc commented 2 years ago

The number of allocations is dependent on the variable you take the derivative with respect to. It is ok with respect to z so the flux calculations are not affected by this however for gradient calculations this is majorly affected... This could be a regression from when we switched over to tuples but unsure

julia> @benchmark fluence_DA_Nlay_cylinder_CW(0.1, (0.1, 0.1), (10.0, 10.0), 1.0, (1.0, 1.0), (1.0, 9.0), 10.0, 0.0, 1000)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  101.500 μs … 322.333 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     105.875 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   105.726 μs ±   5.836 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▂▁▇▇▆      ▅▅▄█▇▄▂▁▁▁▁▂▃▃▂▂▂▃▂▁                               ▂
  ██████▇▆▇▇███████████████████████▆▇▆▆▇▅▅▆▅▇▆▇▅▇▆▅▆▅▅▃▆▃▁▆▅▅▄▅ █
  102 μs        Histogram: log(frequency) by time        120 μs <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark ForwardDiff.derivative(dz -> fluence_DA_Nlay_cylinder_CW(0.1, (0.1, 0.1), (10.0, 10.0), 1.0, (1.0, 1.0), (1.0, 9.0), 10.0, dz, 1000), 0.0)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  103.500 μs … 193.750 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     104.209 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   104.654 μs ±   2.356 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

     ▃█▇▇▆▅▂                ▃▂▂▁                                ▂
  █▄▁███████▇▆▆▄▄▃▃▆▆▆▆▄▅▄▁██████▇▄▃▆▆▅▄▆▆▇█▇▆▆▅▄▄▄▄▃▁▃▃▄▃▄▃▄▅▅ █
  104 μs        Histogram: log(frequency) by time        111 μs <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark ForwardDiff.derivative(dz -> fluence_DA_Nlay_cylinder_CW(0.1, (dz, 0.1), (10.0, 10.0), 1.0, (1.0, 1.0), (1.0, 9.0), 10.0, 0.0, 1000), 0.1)
BenchmarkTools.Trial: 7685 samples with 1 evaluation.
 Range (min … max):  593.875 μs …   3.413 ms  ┊ GC (min … max): 0.00% … 79.90%
 Time  (median):     607.083 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   649.008 μs ± 304.502 μs  ┊ GC (mean ± σ):  5.96% ± 10.01%

  █▃                                                            ▁
  ██▇▆▄▃▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▅▅█ █
  594 μs        Histogram: log(frequency) by time       3.02 ms <

 Memory estimate: 765.67 KiB, allocs estimate: 29002.
heltonmc commented 2 years ago

The problem seems to come within the kernel function but not green...

julia> @benchmark ForwardDiff.derivative(dz->LightPropagation._green_Nlaycylin_top(1.0, (0.1, 0.1), (0.4, 0.4), dz, 0.1, (1.2, 1.2), (1.0, 1.0), (1.0, 1.0), 2), 0.0)
BenchmarkTools.Trial: 10000 samples with 981 evaluations.
 Range (min … max):  61.714 ns … 99.388 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     62.011 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   62.243 ns ±  1.413 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▁▃▅█▆▄▆▄▁                             ▁▂                    ▂
  █████████▄▃▃▄▄▁▃▄▁▃▃▃▃▄▃▄▃▄▁▁▁▄▅▄▄▃▃▅▇██▇▇█▅▃▃▄▁▃▃▁▄▁▁▄▄▄▅▅ █
  61.7 ns      Histogram: log(frequency) by time      66.3 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark ForwardDiff.derivative(dz->LightPropagation._green_Nlaycylin_top(1.0, (dz, 0.1), (0.4, 0.4), 0.0, 0.1, (1.2, 1.2), (1.0, 1.0), (1.0, 1.0), 2), 0.1)
BenchmarkTools.Trial: 10000 samples with 984 evaluations.
 Range (min … max):  58.604 ns … 154.726 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     58.943 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   59.409 ns ±   2.693 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▇█▃         ▁▃                                               ▁
  ███▇▆▅▆▆▆▇▆▆███▆▄▆▇▇▇▆▆▅▅▄▁▄▄▁▁▄▃▄▄▅▄▄▅▅▅▅▅▆▅▄▅▅▆▆▅▄▄▄▅▅▅▄▆▅ █
  58.6 ns       Histogram: log(frequency) by time      72.7 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark ForwardDiff.derivative(dz->LightPropagation._kernel_fluence_DA_Nlay_cylinder(1.0, (0.5, 0.5), (0.1, 0.1), 20.0, (0.2, 0.2), dz, 0.1, (1.0, 9.0), (1.0, 1.0), J0_ROOTS[1:1000], LightPropagation._green_Nlaycylin_top, 2), 0.0)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  110.875 μs …  1.196 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     113.875 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   115.214 μs ± 13.444 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

   ▂▂▄    ▅█      ▁▃                                            
  ▂███▄▂▂▅███▄▃▂▂▆██▆▃▃▂▃▂▃▃▂▂▂▂▂▂▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▂
  111 μs          Histogram: frequency by time          129 μs <

 Memory estimate: 7.94 KiB, allocs estimate: 1.

julia> @benchmark ForwardDiff.derivative(dz->LightPropagation._kernel_fluence_DA_Nlay_cylinder(1.0, (0.5, 0.5), (dz, 0.1), 20.0, (0.2, 0.2), 0.0, 0.1, (1.0, 9.0), (1.0, 1.0), J0_ROOTS[1:1000], LightPropagation._green_Nlaycylin_top, 2), 0.1)
BenchmarkTools.Trial: 7622 samples with 1 evaluation.
 Range (min … max):  596.375 μs …   3.752 ms  ┊ GC (min … max): 0.00% … 82.18%
 Time  (median):     609.312 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   654.571 μs ± 330.757 μs  ┊ GC (mean ± σ):  6.39% ± 10.12%

  █▃                                                            ▁
  ██▇▅▄▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▅▅█ █
  596 μs        Histogram: log(frequency) by time       3.24 ms <

 Memory estimate: 773.62 KiB, allocs estimate: 29004.
heltonmc commented 2 years ago

Culprit function attached... probably due to the preallocation of phi with eltype of mua https://github.com/heltonmc/LightPropagation.jl/blob/e0e6723042af3205bd4a68de84f7fce10f49c233/src/forwardmodels/Diffusion%20Approximation/DAcylinder_layered.jl#L299-L312

probably no reason to preallocate phi_tmp here

Edit: It is not from the preallocation though that does create a type instability

heltonmc commented 2 years ago

Specifically its this line

 ϕ_tmp = green(besselroots[ind] * apzb, μa, D, z, z0, zb, l, n_med, N)