thomasorb / orcs

ORCS is an analysis engine for SITELLE spectral cubes.
GNU General Public License v3.0
9 stars 6 forks source link

fit_lines_in_region() takes too much time #9

Closed thomasorb closed 6 years ago

thomasorb commented 6 years ago

SN3 cube, 5 lines (Halpha + NII + SII), 2x2 binning on a 268x359 region i.e. 24000 pixels. It took ~2h with 4 kernels = 1.2s / pixel with a sincgauss model which is ~10 times slower than the old code (4e6 pixels in ~12 hours with 16 kernels i.e. <0.2s / pixel with a sinc model)

aa1

thomasorb commented 6 years ago

Fit time is returned by the fitting function and is obtained with the keyword fit_time. I have compared fitting times with a basic script

axis, spectrum, fit = cube.fit_lines_in_spectrum(919, 893, 3,
                                                 ('[NII]6548', 'Halpha', '[NII]6583',
                                                 '[SII]6716', '[SII]6731'),
                                                 fmodel='sincgauss',
                                                 pos_def='1',
                                                 pos_cov=-500,
                                                 fwhm_def='fixed',
                                                 sigma_def='1',
                                                 sigma_cov=10,
                                                 nofilter=True,
                                                 snr_guess='auto')
print fit['fit_time']

With 5 emission lines, shared velocity, shared broadening, no filter, fmodel=sincgauss and snr_guess='auto' it takes 1.0s to fit.

You must know that whit snr_guess set to 'auto', the fit is done two times, one time with a default SNR and the second time with a SNR computed from the standard deviation of the residual of the first fit.

When we change some of the parameters we get the following fitting time:

In conclusion, there is no difference with the old algorithm but the sincgauss model takes much more time to compute. I will check it it can be optimized but I'm not sure since it comes from the complexity of the Dawson function involved in the calculation: see https://doi.org/10.1093/mnras/stw2315.

thomasorb commented 6 years ago

I've timed the calculation of the simplest form of the sincgauss function and the complete one defined in orb.utils.spectrum.sincgauss1d with respect to the sinc function as it is defined in orb.utils.spectrum.sinc1d :

def sincgauss1d_test(x, h, a, dx, fwhm, sigma):

    width = np.fabs(fwhm) / orb.constants.FWHM_SINC_COEFF
    width /= np.pi

    a = sigma / math.sqrt(2) / width
    b = (x - dx) / math.sqrt(2) / sigma
    dawson1 = special.dawsn(1j * a + b) * np.exp(2.* 1j * a *b)
    dawson2 = special.dawsn(1j * a - b) * np.exp(-2. * 1j * a *b)
    dawson3 = 2. * special.dawsn(1j * a)

    return ((dawson1 + dawson2)/dawson3).real
import timeit
timer = timeit.Timer(stmt='orb.utils.spectrum.sincgauss1d(x, 0, 1, 500, 2, 2)',
                     setup='import numpy as np ;import orb.utils.spectrum ; x = np.arange(1000)')
print timer.timeit(500)
timer = timeit.Timer(stmt='sincgauss1d_test(x, 0, 1, 500, 2, 2)',
                     setup='import numpy as np ;import orb.utils.spectrum ; x = np.arange(1000)')
print timer.timeit(500)
timer = timeit.Timer(stmt='orb.utils.spectrum.sinc1d(x, 0, 1, 500, 2)',
                     setup='import numpy as np ;import orb.utils.spectrum ; x = np.arange(1000)')
print timer.timeit(500)

The simple script above returned