Quickbrick performance - Githubissues

dkirkby commented 8 years ago

Thanks to the redshift data challenge, we now have a first version of quickbrick with validated outputs. This is a good time to benchmark the performance for different template classes and see where we stand:

Is the simulation fast enough to meet our current needs?
What are the bottlenecks, if any?

dkirkby commented 8 years ago

For a quick first look, time the commands used to generate the 100-object single-class DC files (taken from the log files in datachallenge/zdc1/samples/training/XXX):

Class	t(100)
QSO	14.7s
ELG	18.8s
LRG	136.4s
STAR	11.8s
MIX	40.1s

These are simply unix time results running from commit b86a8fad on my laptop. When only generating 100 objects, the fixed overhead from reading the template files might not negligible, but we can already see that LRGs are significantly slower to simulate than the other classes.

moustakas commented 8 years ago

My guess is this is almost all due to the velocity dispersion broadening, followed by the interpolation onto the desired output wavelength array, which is slightly slower for the LRGs because the "basis" templates have smaller pixels than the other spectral types.

The other place the templates code is slower than it could be in principle is the fact that I have to loop on each template in order to determine if the object passes the nominal color cuts. Some speedup could possibly happen by vectorizing in some way this part of the code.

dkirkby commented 8 years ago

Running the LRG case through the profiler, about 70% of the time is spent in gauss_blur_matrix from desisim.pixelsplines so I think that confirms your guess. I see that calls to this same function are commented out for the ELG case.

dkirkby commented 8 years ago

@moustakas It looks like you could significantly speed up the LRG simulation if you used the same velocity dispersion for all galaxies, so that the blur matrix can be pre-computed once in the constructor instead of repeatedly for each simulated galaxy. The default parameters have log10(vdisp) = 3.0 +/- 0.1, so this boils down to the question of whether that 0.1 smearing is worth a ~9x slowdown.

On the topic of adding velocity dispersion, it looks like you are applying the blur matrix using

flux *= blur

but this is equivalent to

flux = flux * blur

while I think you actually want:

flux = blur * flux

moustakas commented 8 years ago

Thanks for pointing out that I wasn't applying the blurring matrix correctly, which I obviously hadn't realized. I'll fix this in my branch.

As for the prior on the velocity dispersion (and note that the default log10(sigma) mean and sigma values I picked are 2.3+/-0.1, which gives a typical value of 200 km/s and a peak-to-peak range of ~100-350 km/s), I think it's pretty important that we include this variation. The velocity dispersions of the LRGs we're going to observe will certainly vary (although I should look at some actual SDSS-IV/eBOSS data on this question). For now I'll include a nominal value as you propose (better than nothing!) while we figure out a speedier way to do this. (Although if it takes a little longer to make the templates it's not a terrible thing...)

sbailey commented 8 years ago

Let's keep the slower version that has variations in the velocity dispersion. The template generation speed isn't our limiting factor. I'd rather prioritize realism over speed right now.

dkirkby commented 8 years ago

I agree that we don't need to address this yet. The main point of this issue is to identify and keep an eye on the "tall poles", in anticipation of the day when we will care about how fast the simulations run.

moustakas commented 6 years ago

I'd like to close this long-open issue. Performance is reasonably fast when generating templates to scale (via desitarget/bin/select_mock_targets), plus we have the --no-spectra option for generating even larger datasets quickly.

If anyone disagrees please feel free to reopen.

desihub / desisim

Quickbrick performance #58