subbundle extractions for >2x speedup

sbailey commented 7 years ago

This PR speeds up spectral extractions by ~2.5x by implementing divide-and-conquer in the spectral direction in addition to the previously implemented wavelength direction. e.g. to exact a 25 fiber bundle with 3 sub-bundles, it would

extract 0-8, keep 0-7
extract 7-16, keep 8-15
extract 15-24, keep 16-24

i.e. subbundles extract an extra fiber on each side which is discarded, except for the subbundles at the edge of the bundle which don't need the extra fiber on the edge side.

subbundle-speed

Comparing the sub-extractions to a full extraction, there are differences at the 0.02 sigma level. e.g. the following plot shows noise-weighted difference between 25 fibers x 200 wavelengths extracted in a single shot (flux1) vs. 6 subregions of 12-13 fibers x 50 wavelengths instead (flux2). The right hand plot is (flux1-flux2)/sqrt(1/ivar1 + 1/ivar2). divide-and-conquer-residuals

Although 0.02 sigma is tiny, the difference can grow dramatically once the extraction wavelength step size gets small compared to the PSF size, i.e. <0.7 Angstrom. That is also the range where numerical instabilities introduce ringing in the extractions such that the noise model becomes invalid, i.e. subbundle extractions don't create any new constraints on the extraction wavelength grid.

sbailey commented 7 years ago

Further notes: this implements subbundle extraction in the specter toolkit. After this is merged I'll make an update to desi_extract_spectra to actually use that option with desispec. If really needed I could make a companion desispec pull request, but those are always a pain. In the meantime, this can be tested on cori with:

Setup desi code plus specter subbundle branch by whatever your favorite method is, e.g.

source /project/projectdirs/desi/software/desi_environment.sh master
module unload specter
git clone https://github.com/desihub/specter
cd specter
git checkout subbundle
export PYTHONPATH=$(pwd)/py:$PYTHONPATH

Then within ipython

from specter.extract import ex2d
from specter.psf import load_psf
import desispec.io

psf = load_psf('/project/projectdirs/desi/spectro/redux/oak1/calib2d/psf/20160714/psfnight-r0.fits')
img = desispec.io.read_image('/project/projectdirs/desi/spectro/sim/oak/20160714/pix-r0-00000006.fits')

ww = np.arange(7000, 7200, 1)
%time flux1, ivar1, Rdata1 = ex2d(img.pix, img.ivar, psf, specmin=0, nspec=25, wavelengths=ww, nsubbundles=1) 
%time flux5, ivar5, Rdata5 = ex2d(img.pix, img.ivar, psf, specmin=0, nspec=25, wavelengths=ww, nsubbundles=5)

weighted_flux_diff = (flux1 - flux5) / np.sqrt(1/ivar1 + 1/ivar5)
print( np.min(weighted_flux_diff), np.max(weighted_flux_diff) )

On cori07 I only got a 2x speed improvement not 2.5x on my laptop, but that was also competing with other users and has a different processor. This example with an oak PSF also has a maximum weighted flux diff of O(0.1 sigma) rather than the O(0.02 sigma) I found with toy examples from a desimodel PSF. That's a bit bigger than I wish but I don't think is disastrous. Independent opinions are welcome.

If @julienguy wants to do comparison extractions for full frames before merging, then adding the option to desi_extract_spectra probably would be the best route instead of spending hours within ipython. Let me know.

sbailey commented 7 years ago

Julien and I looked at this together; he requested an additional test where a bright object (e.g. stdstar) is right on the border of a subbundle. I added this case to the notebook, showing that the biases are still minimal as long as you have the 1-fiber padding that gets discarded per bundle.

desihub / specter

subbundle extractions for >2x speedup #51