astropy / specutils

An Astropy coordinated package for astronomical spectroscopy. Maintainers: @rosteen @keflavich @eteq
http://specutils.readthedocs.io/en/latest/
170 stars 127 forks source link

SpectrumCollection.read(format='iraf') not reading IRAF multispec non-linear WCS correctly? #1196

Open kgozman6159 opened 6 days ago

kgozman6159 commented 6 days ago

Hi! I'm having a problem getting specutils to read in a multispec (non-linear) spectrum generated from IRAF. This is a FITS file that has 128 spectra, all with different non-linear wavelength solutions and MULTISPEC wcs. Specifically, it doesn't seem to return the correct wavelength arrays for each spectrum.

The relevant part of the FITS header for this file is this:

WCSDIM  =                    2                                                  
LTM1_1  =                   1.                                                  
LTM2_2  =                   1.                                                  
WAT0_001= 'system=multispec'                                                    
WAT1_001= 'wtype=multispec label=Wavelength units=angstroms'                    
WAT2_001= 'wtype=multispec spec1 = "1 1 2 6830.1450104837 0.98385178277893 2048'
TRIM    = 'Oct 18 23:19 Trim data section is [1:1024,1:1028]'                   
OVERSCAN= 'Oct 18 23:19 Overscan section is [1025:1152,1029:1156] with function'
ZEROCOR = 'Oct 18 23:19 Zero level correction image is m2bias1x2$bbiasc1.fits'  
DARKCOR = 'Oct 18 23:19 Dark count correction image is m2dark1x2$bdarkc1.fits w'
CCDSEC  = '[1:1024,1:1028]'                                                     
CCDMEANT=           1413760784                                                  
CCDPROC = 'Oct 18 23:19 CCD processing done'                                    
BIASRON =           0.03267032                                                  
DRKRON  =            0.1671258                                                  
EGAIN1  =                 0.68                                                  
ENOISE1 =                  2.7                                                  
EGAIN2  =                 0.64                                                  
ENOISE2 =                  2.3                                                  
EGAIN3  =                 0.66                                                  
ENOISE3 =                  2.4                                                  
EGAIN4  =                 0.67                                                  
ENOISE4 =                  2.6                                                  
CRREMOVE= 'Done    '                                                            
CTYPE1  = 'MULTISPE'                                                            
CTYPE2  = 'MULTISPE'                                                            
CDELT1  =                   1.                                                  
CDELT2  =                   1.                                                  
CD1_1   =                   1.                                                  
CD2_2   =                   1.                                                  
WAT2_002= ' 0. 65.06 71.06 1. 0. 2 5 5.84652757644654 1904.20617675782 7770.214'
WAT2_003= '84537882 935.937441361387 -0.71291683107836 -1.33149900358178 -0.002'
WAT2_004= '06181886509034" spec2 = "2 2 2 6828.1026092208 0.98392751525287 2048'
WAT2_005= ' 0. 74.77 80.77 1. 0. 2 5 7.94382667541505 1906.15625 7770.201972390'
WAT2_006= '11 935.939084962204 -0.710611335979038 -1.34489317746376 0.008385815'
WAT2_007= '89787007" spec3 = "3 3 2 6826.3700464052 0.98407934527917 2048 0. 84'
'   

Note that this goes on for many more lines in the form of WAT2_XXX, I just haven't copied them here for readability.

I'm taking the file with the above header and running it through the following code snippet:

from astropy import units as u 
import pprint
from specutils import Spectrum1D
with fits.open('<my_file_name>.fits') as hed:
    hed[0].header['BUNIT'] = 'du/pixel'
    test_multi = SpectrumCollection.read(hed, format='iraf')

print(np.array(test_multi.wavelength[0]))

This returns [6830.97577634 6831.95528375 6832.93481091 ... 8842.30021791 8843.27149672 8844.24274753]

This result is different from both trying to read in this spectrum using another code and also cross-checking with IRAF itself. The actual wavelength grid for this spectrum should be [6830.14501048 6831.12483153 6832.10467235 ... 8842.14641611 8843.11802698 8844.08960983]

which also matches what I would expect from the header of the FITS file.

I'm running this in a Jupyter notebook on a 2020 Macbook Pro running Sonoma 14.7.1 (23H222). I'm using python 3.11.7, specutils 1.19.0, and astropy 6.0.0.

I believe this is related to #708, I'm not sure if the feature of reading multispec non-linear 2D WCS has been fully implemented or not or why I'm getting the wrong wavelength array (as compared to IRAF itself).

Any suggestions or thoughts would be appreciated!

rosteen commented 5 days ago

Thanks for the report, I probably won't have time to dig into this for a week or two, but it's on my radar. I'll see if anyone else has some time to look into it (@dhomeier, I think you wrote the relevant loader?)

dhomeier commented 1 day ago

Hmm, that turned out a bit trickier – I was initially thinking this could rather be related to #1127, but that's dealing with WCSDIM=3 cases and a WCS type that is indeed not fully implemented yet, since the functionality does not exist in WCSLIB (but looking at the present case, trying to resort to astropy.wcs may in fact have been the cardinal error there).

The difference actually arises simply from the non-integer values in the pmin, pmax from WAT2_002= ' 0. 65.06 71.06 1. 0. 2 5 5.84652757644654 1904.20617675782 (last 2 numbers following the ftype=2 and order=5 for Legendre series to 4th degree). These are truncated to int in _read_non_linear_iraf_wcs, adopted from the earlier single-spec implementation from #192 and its predecessor in #18. For that first version non-integer values would indeed not have been that straightforward to implement, and may seem disingenuous as they supposedly represent pixel values (also all example and test spectra have integer-valued limits here). However Legendre1D accepts non-integer inputs to domain, and with the float values exactly reproduces the wavelengths from readmultispec (which is using the same mathematical model). So the fix is quite straightforward and trivial, just may need some discussion if this is the correct treatment. But if IRAF itself gives those same numbers, there should not be much doubt. https://iraf.readthedocs.io/en/latest/tasks/noao/onedspec/specwcs.html is also using all integer-valued examples, but does not otherwise specify allowed inputs.

Thanks for reporting this!

dhomeier commented 18 hours ago

disingenuous as they supposedly represent pixel values (also all example and test spectra have integer-valued limits here).

@rosteen as it happens the 2 non-linear_fits_solution test spectra actually do have fractional domain limits, so if we adopt the above reading from IRAF, comparing them to the integer-domain functions, as test_iraf_non_linear_chebyshev and test_iraf_non_linear_legendre do, is incorrect. Should just those tests be modified, or a second set of files or modified headers be tested with integer pmin, pmax in WAT2_002?