Closed Lightup1 closed 2 years ago
This came up during a performance test:
```julia
using QuantumOptics, BenchmarkTools, LinearAlgebra, MKL
using FFTW
# FFTW.set_provider!("mkl")
# FFTW.set_provider!("fftw")
FFTW.set_num_threads(6)
##
b1 = PositionBasis(-1, 1, 2^14)
b2 = MomentumBasis(b1)
##
Tpx_test = QuantumOptics.transform(b2, b1)
ppsi = Ket(b2, rand(ComplexF64, length(b2)))
psi = Ket(b1, rand(ComplexF64, length(b2)))
@benchmark QuantumOpticsBase.mul!($ppsi, $Tpx_test, $psi)
##
p1 = plan_fft(rand(ComplexF64, 2^14))
data1 = rand(ComplexF64, 2^14)
data2 = rand(ComplexF64, 2^14)
@benchmark mul!($data2, $p1, $data1)
```
1 thread:
```
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  81.200 μs …  1.018 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     87.300 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   92.110 μs ± 28.820 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

 Histogram: log(frequency) by time, 81.2 μs to 159 μs

 Memory estimate: 0 bytes, allocs estimate: 0.
```
6 threads:
```
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  57.700 μs … 769.900 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     75.000 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   74.873 μs ±  10.677 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

 Histogram: frequency by time, 57.7 μs to 96.6 μs

 Memory estimate: 0 bytes, allocs estimate: 0.
```
The odd thing is that the plain vector FFT (median 35.6 μs) is much faster than the Ket FFT (medians of 87.3 μs and 75.0 μs above):
```
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  23.600 μs … 800.300 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     35.600 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   36.250 μs ±   8.544 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

 Histogram: frequency by time, 23.6 μs to 45.2 μs

 Memory estimate: 0 bytes, allocs estimate: 0.
```
After checking the code, I think this may be caused by the scaling operation. I'll close the issue.
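For anyone who wants to check the scaling hypothesis in isolation, here is a rough sketch that compares a bare planned FFT with the same FFT wrapped in elementwise multiplications. It assumes the Ket transform applies a per-element factor before and after the FFT; the names `pre`, `post`, `scaled_fft!` and `time_per_call` are made up for this illustration and are not QuantumOptics API, and the factors are random placeholders rather than the ones the library computes.

```julia
using FFTW, LinearAlgebra

N = 2^14
x = rand(ComplexF64, N)
y = similar(x)

plan = plan_fft(x)           # out-of-place plan, as in the benchmark above
plan_inplace = plan_fft!(y)  # in-place plan for the scaled variant

# Hypothetical elementwise factors (placeholders, not the library's values).
pre = cis.(2π .* rand(N))
post = cis.(2π .* rand(N)) ./ sqrt(N)

# Scale, transform in place, scale again: two extra passes over the data.
function scaled_fft!(y, plan_inplace, x, pre, post)
    y .= pre .* x
    plan_inplace * y         # an in-place plan overwrites y
    y .*= post
    return y
end

# Crude timing helper so the sketch needs no extra packages.
function time_per_call(f, n)
    f()                      # warm-up call (compilation)
    t = @elapsed for _ in 1:n
        f()
    end
    return t / n
end

t_plain = time_per_call(() -> mul!(y, plan, x), 1000)
t_scaled = time_per_call(() -> scaled_fft!(y, plan_inplace, x, pre, post), 1000)
println("plain FFT:  ", round(t_plain * 1e6, digits = 1), " μs per call")
println("scaled FFT: ", round(t_scaled * 1e6, digits = 1), " μs per call")
```

If the scaled variant accounts for most of the gap, the extra time comes from the two additional passes over the array rather than from the FFT itself. Use `@benchmark` instead of the helper for more reliable numbers.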