GBTAmmoniaSurvey / GAS

observing scripts and files related to the GBT Ammonia Survey (GAS, PI: Jaime E Pineda & Rachel Friesen)
MIT License
6 stars 13 forks source link

Baseline fitter is not working... problem with numpy? #105

Closed jpinedaf closed 8 years ago

jpinedaf commented 8 years ago

I tried to run the imager for L1455, but it fails at working with the first file already. This is something that has not changed from previous implementations, so I don't understand what changed... except the version of numpy that we are using (1.10.4)

import GAS
In [1]: import GAS
In [2]: GAS.run_grid_regions.grid_L1455()
You will image the GBT Ammonia Survey data for L1455
Now processing /lustre/pipeline/scratch/GAS/L1455/L1455_NH3_11/Perseus_map_L1455-S_scan_134_148_window0_feed1_pol1_sess20.fits
This is file 1 of 70
|=========>-------------------------------------------------------------------------------------------------------------------------------------|  94 /1.4k (  6.67%) ETA    25s---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-2-78723037947e> in <module>()
----> 1 GAS.run_grid_regions.grid_L1455()

/lustre/pipeline/scratch/GAS/python/GAS/GAS/run_grid_regions.pyc in grid_L1455(release)
    195         dirname=region_name+'_NH3_11',
    196         startChannel = startChannel, endChannel = endChannel,
--> 197         Sessions=mySessions, file_extension=file_extension)
    198 
    199     hd_temp=fits.getheader(region_name+'_NH3_11'+file_extension+'.fits')

/lustre/pipeline/scratch/GAS/python/GAS/GAS/gridregion.pyc in griddata(pixPerBeam, templateHeader, gridFunction, rootdir, region, dirname, startChannel, endChannel, doBaseline, baselineRegion, Sessions, file_extension)
    214             if doBaseline:
    215                 specData = baselineSpectrum(specData,order=1,
--> 216                                             baselineIndex=baselineIndex)
    217 
    218             # This part takes the TOPOCENTRIC frequency that is at

/lustre/pipeline/scratch/GAS/python/GAS/GAS/gridregion.pyc in baselineSpectrum(spectrum, order, baselineIndex)
     16 def baselineSpectrum(spectrum,order=1,baselineIndex=()):
     17     x=np.arange(len(spectrum))
---> 18     coeffs = legendre.legfit(x[baselineIndex],spectrum[baselineIndex],order)
     19 #    pdb.set_trace()
     20     spectrum -= legendre.legval(x,coeffs)

/lustre/pipeline/scratch/GAS/python/anaconda/lib/python2.7/site-packages/numpy/polynomial/legendre.pyc in legfit(x, y, deg, rcond, full, w)
   1554 
   1555     # Solve the least squares problem.
-> 1556     c, resids, rank, s = la.lstsq(lhs.T/scl, rhs.T, rcond)
   1557     c = (c.T/scl).T
   1558 

/lustre/pipeline/scratch/GAS/python/anaconda/lib/python2.7/site-packages/numpy/linalg/linalg.pyc in lstsq(a, b, rcond)
   1917         work = zeros((lwork,), t)
   1918         results = lapack_routine(m, n, n_rhs, a, m, bstar, ldb, s, rcond,
-> 1919                                  0, work, lwork, iwork, 0)
   1920     if results['info'] > 0:
   1921         raise LinAlgError('SVD did not converge in Linear Least Squares')

ValueError: On entry to DGELSD parameter number 6 had an illegal value
jpinedaf commented 8 years ago

any ideas @keflavich @low-sky @rfriesen ?

jpinedaf commented 8 years ago

I also see the problem in numpy 1.9.3

keflavich commented 8 years ago

No immediate idea; there's nothing obvious in the numpy release notes unless it's this: gh-3712: fix LAPACK error handling in lapack_litemodule which came in numpy 1.7.2.

Are you running this on your machine or the GBT machines? I'm wondering if it has something to do with how numpy was compiled. The error suggests that the python code is using the wrong call specification (too many arguments or the wrong arguments) for a C function.

jpinedaf commented 8 years ago

I'm running it on the GBT machines... but we are using the conda distribution

jpinedaf commented 8 years ago

OK, I managed to track down the issue to a dump that is filled with NaNs... at this point I'm thinking of adding a check to make sure we have data like this (https://github.com/GBTAmmoniaSurvey/GAS/blob/master/GAS/gridregion.py#L214),

if doBaseline & np.sum(np.isfinite(specData)):
                specData = baselineSpectrum(specData,order=1,
                                            baselineIndex=baselineIndex)

Although this is not elegant and efficient... suggestions for a better implementation? I'll check this to confirm if solves the issue

keflavich commented 8 years ago

Use np.all(np.isfinite(...)) or np.any(np.isnan(...).

You should probably raise an Exception or a loud warning if all-nan scans are encountered: those are bad and not expected, right?

jpinedaf commented 8 years ago

The pipeline is running again. However, there is no warning implemented, since we don't have such a counter in the loop. At this moment we require that no NaNs should be present in the data, although I'm not sure if it is possible to have a single bad channel in the spectrum.