Closed by brettviren 4 years ago
Profiling with Google perftools:
$ CPUPROFILE=ductor.prof LD_PRELOAD=~/.nix-profile/lib/libprofiler.so wire-cell -c cfg/uboone/simsp/main-simple-quiet.jsonnet
$ google-pprof --gif `which wire-cell` ductor.prof > ductor.gif
This run uses two tracks of about 5 m each (10022 depos) and takes 215 s. The resulting call graph is written to ductor.gif.
With --lines, the entry point to the hot spot is found to be the complex multiplication that computes the spectrum inside the FFT-based convolution:
https://github.com/WireCell/wire-cell-gen/blob/master/src/ImpactZipper.cxx#L87
Since this is pretty much the main thing the Ductor does, it's not surprising it's the culprit.
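For readers who have not opened the linked file, here is a minimal sketch of the kind of step being discussed: an element-wise complex multiply of two spectra, accumulated into a running total. This is not the actual ImpactZipper code. The names complex_t, spectrum_t and accumulate_product are hypothetical, and the accumulation into a total is an assumption made for illustration.

```cpp
#include <algorithm>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical type names, chosen for this sketch only.
using complex_t = std::complex<float>;
using spectrum_t = std::vector<complex_t>;

// Hypothetical helper: multiply two frequency-domain spectra bin by bin
// and add the product into a running total. A multiplication in the
// frequency domain corresponds to a convolution in the time domain.
void accumulate_product(spectrum_t& total,
                        const spectrum_t& charge,
                        const spectrum_t& response)
{
    // Only touch the bins that all three spectra have in common.
    const std::size_t n =
        std::min({total.size(), charge.size(), response.size()});
    for (std::size_t i = 0; i < n; ++i) {
        // One complex multiply and one complex add per bin. In a loop
        // over many impacts this line is executed a very large number
        // of times, which is why it dominates a profile.
        total[i] += charge[i] * response[i];
    }
}
```

The work per call scales with the number of frequency bins, and the call is repeated for every impact, so the total cost grows with the product of the two.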
Tried a hack to replace std::vector with std::valarray for this line. I allocated the valarrays outside the loop and, for each impact, copied from the vector, did the arithmetic, and copied back. Result: 240 s.
Same amount of time spent inside complex multiply but now the addition valarray operator adds some time. Well, no easy fix, I guess.
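A rough sketch of what that valarray hack amounts to, under stated assumptions: the scratch valarrays are allocated once, and each call copies in from the vectors, does the arithmetic as whole-array operations, and copies the result back. ValarrayScratch and accumulate_product_valarray are hypothetical names, not code from the repository, and the multiply-accumulate form is assumed.

```cpp
#include <algorithm>
#include <complex>
#include <cstddef>
#include <valarray>
#include <vector>

using complex_t = std::complex<float>;  // hypothetical alias

// Hypothetical scratch space, allocated once outside the impact loop.
struct ValarrayScratch {
    std::valarray<complex_t> total, charge, response;
    explicit ValarrayScratch(std::size_t n)
        : total(n), charge(n), response(n) {}
};

// Hypothetical per-impact step: copy from the vectors, do the arithmetic
// with valarray operators, copy the total back. Returns false and does
// nothing if any vector is shorter than the scratch arrays.
bool accumulate_product_valarray(ValarrayScratch& s,
                                 std::vector<complex_t>& total,
                                 const std::vector<complex_t>& charge,
                                 const std::vector<complex_t>& response)
{
    const std::size_t n = s.total.size();
    if (total.size() < n || charge.size() < n || response.size() < n) {
        return false;
    }
    std::copy_n(total.begin(), n, std::begin(s.total));
    std::copy_n(charge.begin(), n, std::begin(s.charge));
    std::copy_n(response.begin(), n, std::begin(s.response));

    // The same complex multiplies as the plain loop, now expressed as
    // whole-array operations.
    s.total += s.charge * s.response;

    std::copy(std::begin(s.total), std::end(s.total), total.begin());
    return true;
}
```

The number of complex multiplies is unchanged, and the copies in and out are extra work, which is consistent with the hack not being faster.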
This is obsolete since we have the "transform" version.
The very first test of a muon across protoDUNE-SP takes 15 minutes, mostly inside the ductor. Maybe this is just life, or maybe it can be sped up.
Possible remedies