WCS information for SIP files doesn't survive write/read round trip to FITS file.

rmjarvis commented 9 years ago

PyAst doesn't seem to be writing the information in SIP files correctly when writing them in FITS format. I believe it used to work correctly in a previous version of starlink-pyast, but I can't be 100% sure. I didn't try to go back to previous versions to try them out, so maybe I'm wrong and this was never working.

Anyway, here is a script that shows the problem:

import starlink.Atl as Atl
import starlink.Ast as Ast
import astropy.io.fits as pyfits
import numpy

# File from http://fits.gsfc.nasa.gov/registry/sip.html
hdu = pyfits.open('sipsample.fits')[0]
fc = Ast.FitsChan(Atl.PyFITSAdapter(hdu))
wcs = fc.read()

# A random test position.  The "true" RA, Dec values are taken from ds9.
x = 242
y = 75
true_ra = (13 + 30/60. + 1.474154/3600. - 24.) * numpy.pi / 12.
true_dec = (47 + 12/60. + 51.794474/3600.) * numpy.pi / 180.

ra1, dec1 = wcs.tran( numpy.array([ [x], [y] ]))
print 'Initial read of sipsample.fits:'
print 'error in ra = ',(ra1-true_ra) * 180.*3600./numpy.pi, 'arcsec'
print 'error in dec = ',(dec1-true_dec) * 180.*3600./numpy.pi, 'arcsec'

# Now cycle through writing and reading to a file

hdu2 = pyfits.PrimaryHDU()
fc2 = Ast.FitsChan(None, Atl.PyFITSAdapter(hdu2, clear=False), "Encoding=FITS-WCS")
success = fc2.write(wcs)
print 'success = ',success
fc2.writefits()
hdu2.writeto('test_sip.fits', clobber=True)

hdu3 = pyfits.open('test_sip.fits')[0]
fc3 = Ast.FitsChan(Atl.PyFITSAdapter(hdu3))
wcs3 = fc3.read()

ra3, dec3 = wcs3.tran( numpy.array([ [x], [y] ]))
print 'After write/read round trip through fits file:'
print 'error in ra = ',(ra3-true_ra) * 180.*3600./numpy.pi, 'arcsec'
print 'error in dec = ',(dec3-true_dec) * 180.*3600./numpy.pi, 'arcsec'

The output I get from this is:

Initial read of sipsample.fits:
error in ra =  [ -1.94521707e-06] arcsec
error in dec =  [  5.42684051e-07] arcsec
success =  1
After write/read round trip through fits file:
error in ra =  [-3.15051913] arcsec
error in dec =  [ 0.62446887] arcsec

So the initial read works fine, but then writing to a file and reading it back in leads to an error of several arcsec. The problem is that none of the SIP information is getting output. It writes it as a simple TAN WCS, rather than TAN-SIP.

Am I doing something wrong with the I/O commands? Or is this a bug in PyAst?

Maybe just the return value from fc2.write(wcs) is wrong? It's possible that older versions reported this as 0, in which case the GalSim test suite would have skipped this check, since it would have been reported as not working. But now it claims success, so we check it.

For what it's worth, I had noticed TPV and ZPX types being incorrectly reported as successfully written, and I had a workaround to explicitly countermand the success report in those cases. So maybe SIP is now also just wrongly reporting success here.

timj commented 9 years ago

I fully expect SIP, TPV and ZPX to be working and I worry that you feel you have to work around issues with all of these formats. Thanks for the report. @dsberry will look into it.

dsberry commented 9 years ago

I think the error is in the return value from fc2.write(wcs). AST can read SIP headers but has never supported the writing of SIP headers. In principle, fc2.write(wcs) is supposed to return zero if the pixel to intermediate world coord transformation is non-linear by more than 0.1 of a pixel over the area of the image. The sample SIP file has a pixel size of around 1.2 arc-seconds, and so an error of 3 arc-seconds corresponds to a couple of pixels, and so fc2.write(wcs) should definitely return zero. I'll look into it. Do you have a need for AST to write SIP headers, or is it just that you want the test to fail correctly?

rmjarvis commented 9 years ago

I don't have a specific need for AST to write FITS headers correctly using the standard FITS formats. I think most of the time, it is sufficient that AST can write its own proprietary format correctly. In GalSim, I just have it try the FITS-WCS encoding first, since if that works, it's probably a bit more useful for some users.

Most of the time in GalSim if a user is dealing with SIP (or other PV-type) WCS's, they will be reading them from an existing data file, and that works just fine with AST. And if they do generate a new image that uses one of these, they can probably either just copy over an existing header into the output if they want the output to be FITS-standard or use AST (directly or via GalSim) to read it back in, so the proprietary format will be acceptable.

If some user down the line wants AST to write correct SIP headers in the normal FITS format, I'll let them post that feature request. :)

dsberry commented 9 years ago

I think I've fixed the issue in AST that caused fc2.write(wcs) incorrectly to write out a header. I've uploaded a new version of pyast (3.9.0) to pypi which includes the new version of AST (8.0.5).

rmjarvis commented 9 years ago

Hm. When I run the script I posted above, it still tells me that success = 1. Should this have changed? Does it output success = 0 when you run it?

dsberry commented 9 years ago

It did output success=0 for me, but it seems that was due to a foul up on my part regarding pyast versions on my local machine. Sorry. Having sorted myself out, I now see success=1. This is caused by the fact that the "fc2" fitschan does not contain values for NAXIS1 and NAXIS2 when you do "fc2.write(wcs)", so the write() method does not know how big the pixel grid is. In such cases it uses a default of 100 pixels on each axis when determining the extent of the non-linearity of the WCS. In this particular case, the non-linearity over 100 pixels does not exceed the limit, and so the write() method considers the WCS to be linear.

There are various possible solutions: 1) Modify your script to put the NAXIS values into fc2 before doing "fc2.write()":

fc2['NAXIS1'] = fc['NAXIS1']
fc2['NAXIS2'] = fc['NAXIS2']
success = fc2.write(wcs)

2) Modify your script to re-use "fc" (which will still contain values for NAXIS1/2) instead of creating a new empty FitsChan "fc2" (this has the possible benefit that all the non-WCS headers are propagated to your output fits file):

fc.Encoding = "FITS_WCS"
success = fc.write(wcs)
print('success = ',success)
fc.writefits()

3) Modify AST to use a larger default axis length (1000 pixels may be a more realistic default). 4) Modify AST so that the size of the pixel grid is stashed away somewhere in the FrameSet by the read() method, so that it can be recovered by the write() method and used as a better default to replace the current default of 100 pixels. This is not a guaranteed solution though, since the pixel grid may have been resized outside AST thus rendering the stashed values inaccurate.

rmjarvis commented 9 years ago

In GalSim, at the point where we will want to be writing this, we do know the size of the image to be written, so I can use your first solution. I tried it in our test suite, and it seems to work. Thanks.

However, the TPV test still fails if I don't explicitly override the success=1 report from PyAst. So if you want to look into that as well, here is a corresponding script for that file that doesn't work:

import starlink.Atl as Atl
import starlink.Ast as Ast
import astropy.io.fits as pyfits
import numpy

# http://fits.gsfc.nasa.gov/registry/tpvwcs/tpv.fits
hdu = pyfits.open('tpv.fits')[0]
fc = Ast.FitsChan(Atl.PyFITSAdapter(hdu))
wcs = fc.read()

# A random test position.  The "true" RA, Dec values are taken from ds9.
x = 418
y = 78
true_ra = (3 + 30/60. + 9.340034/3600.) * numpy.pi / 12.
true_dec = -(28 + 43/60. + 50.811107/3600.) * numpy.pi / 180.

ra1, dec1 = wcs.tran( numpy.array([ [x], [y] ]))
print 'Initial read of tpv.fits:'
print 'error in ra = ',(ra1-true_ra) * 180.*3600./numpy.pi, 'arcsec'
print 'error in dec = ',(dec1-true_dec) * 180.*3600./numpy.pi, 'arcsec'

# Now cycle through writing and reading to a file

hdu2 = pyfits.PrimaryHDU()
fc2 = Ast.FitsChan(None, Atl.PyFITSAdapter(hdu2, clear=False), "Encoding=FITS-WCS")
print 'NAXIS = ',hdu.header['NAXIS']
print 'NAXIS1 = ',hdu.header['NAXIS1']
print 'NAXIS2 = ',hdu.header['NAXIS2']
print 'CRPIX1 = ',hdu.header['CRPIX1']
print 'CRPIX2 = ',hdu.header['CRPIX2']
print 'CRVAL1 = ',hdu.header['CRVAL1']
print 'CRVAL2 = ',hdu.header['CRVAL2']
for key in ['NAXIS', 'NAXIS1', 'NAXIS2', 'CRPIX1', 'CRPIX2']:
    fc2[key] = hdu.header[key]
success = fc2.write(wcs)
print 'success = ',success
if True:
    success = fc.write(wcs)
    print 'With original fc: success = ',success
if not success:
    fc2 = Ast.FitsChan(None, Atl.PyFITSAdapter(hdu2, clear=False))
    success = fc2.write(wcs)
    print 'Native encoding: success = ',success
fc2.writefits()
hdu2.writeto('test_tpv.fits', clobber=True)

hdu3 = pyfits.open('test_tpv.fits')[0]
fc3 = Ast.FitsChan(Atl.PyFITSAdapter(hdu3))
wcs3 = fc3.read()

ra3, dec3 = wcs3.tran( numpy.array([ [x], [y] ]))
print 'After write/read round trip through fits file:'
print 'error in ra = ',(ra3-true_ra) * 180.*3600./numpy.pi, 'arcsec'
print 'error in dec = ',(dec3-true_dec) * 180.*3600./numpy.pi, 'arcsec'

My output from this script is:

Initial read of tpv.fits:
error in ra =  [ -5.84623099e-06] arcsec
error in dec =  [  3.62277900e-07] arcsec
NAXIS =  2
NAXIS1 =  512
NAXIS2 =  512
CRPIX1 =  4370.373388
CRPIX2 =  4282.913443
CRVAL1 =  52.8826978013
CRVAL2 =  -28.4436999964
success =  1
With original fc: success =  1
After write/read round trip through fits file:
error in ra =  [-292570.91532092] arcsec
error in dec =  [ 292570.91164516] arcsec

The errors are quite a lot larger than the SIP errors we had been seeing, so it seems like the problem might be something else. I even put in the CRPIX values as well, since I thought they might be relevant. But it still failed. Similarly, using the original fc didn't work either.

If you don't want to tackle this, I'm fine with leaving in the explicit override for TPV that I've had there to force it to use your native encoding rather than FITS-WCS. Here is the code snippet I use to do that:

if 'TPN' in str(self._wcsinfo): success = False

Thanks.

rmjarvis commented 9 years ago

I investigated a bit more, and it looks like the ra,dec values for the TPV file come out reversed. Is there any reason that wcs3.tran( numpy.array([ [x], [y] ])) should be outputting (dec, ra) rather than (ra, dec)?

dsberry commented 9 years ago

I think the issue here is that the WCS in tpv.fits has transposed axes. That is, the diagonal terms in the CD matrix are close to zero, and the off-diagonal elements predominate. In other words, even though CTYPE1 is "RA---TPV", in fact the first pixel axis is more closely aligned with Dec than RA. FITS WCS paper I implies this is bad form - from section 2.1.3 "Therefore, it is good form to transpose the header parameters along with the image so that the on-diagonal terms in the transformation matrix predominate." This is what AST is doing. If you look in the headers of test_tpv.fits you will see that CTYPE1 is "DEC--TPV". There are a couple of possible solutions:

1) Check which axis is which rather than assuming axis 1 is RA. There are several ways you could so this, but maybe the simplest is to use the LonAxis (or LatAxis) attribute:

wcs3 = fc3.read()
if wcs3.getframe(Ast.CURRENT).LonAxis == 1:
   ra3, dec3 = wcs3.tran( numpy.array([ [x], [y] ]))
else:
   dec3, ra3 = wcs3.tran( numpy.array([ [x], [y] ]))

2) Force the FitsChan to use a specific axis order when you write the FrameSet out, by setting the FitsAxisOrder attribute of the FitsChan. For instance:

fc2 = Ast.FitsChan(None, Atl.PyFITSAdapter(hdu2, clear=False),
"Encoding=FITS-WCS,FitsAxisOrder=<copy>")

rmjarvis commented 9 years ago

I can do option 1 I guess. But it seems like the tran function should not require this kind of machination to get a well-defined output. I would consider this a bug in the pyast python layer. Users shouldn't have to know or care how the axes are stored internally.

Perhaps at the very least, you could add a kwarg ra_dec=True that would force the output of tran to be (ra,dec), although I would think making this how it always works would be a better UI.

dsberry commented 9 years ago

Don't forget an AST Frame represents an arbitrary coordinate system, which may not have any RA or Dec axes at all (e.g. the axes of a data value versus time plot for instance). AST is not just about 2-dimensional images of the sky, or even about FITS-WCS - it's a generic system for handling N-dimensional coordinate systems of many different types. So a "ra_dec=True" flag wouldn't really work since there may not be any ra or dec axes (or even any other form of celestial coords such as glon, glat, etc). It would be difficult (and I think unnecessarily restrictive) to mandate a fixed order for all possible combinations of all possible axis types (likewise, FITS-WCS does not mandate any particular order). However, if you want to write your code assuming that the WCS FrameSet has certain axes which are in a certain order, you can use the findframe method. For instance:

#  Read a FrameSet from the FITS header.
wcs=fc.read()

# This app requires axes which represent FK5 (RA,Dec). Use findframe to search the FrameSet for 
# axes which can be converted to (ra,dec). If such axes cannot be determined from the contents of 
# the FrameSet, then a null object is returned by findframe. Otherwise, the returned object is a 
# FrameSet with two Frames - the base Frame is the same as the base Frame of the FITS WCS 
# FrameSet (i.e. it will be pixel coords), and the current Frame is the requested FK5 SkyFrame.
radecWCS = wcs.findframe( Ast.SkyFrame( "system=FK5") )
if not radecWCS:
   print( "The supplied WCS does not define a pair of celestial axes" )

#  Now go ahead and use radecWCS, safe in the knowledge that axis 1 represents FK5 RA, 
#  and axis 2 represents FK5 Dec. This works even if the original FITS file has galactic longitude
#  and latitude axes, or FK4 (RA,Dec) axes, or (Dec,RA) axes, etc, etc.
else:
   x = 418
   y = 78
   ra1, dec1 = radecWCS.tran( numpy.array([ [x], [y] ]))

rmjarvis commented 9 years ago

OK, that's fair. I'll add this findframe line to our code, since that's always what we'll want.

rmjarvis commented 9 years ago

Should that line cause a loss of precision? When I use it in the above script, the errors are ~0.02 arcsec, rather than <1.e-5 arcsec that was was getting before. This is on the original read, not the second one that you said should only be accurate to < 0.1 pixel.

hdu = pyfits.open('tpv.fits')[0]
fc = Ast.FitsChan(Atl.PyFITSAdapter(hdu))
wcs = fc.read()
wcs = wcs.findframe( starlink.Ast.SkyFrame( "system=FK5") )

which leads to

Initial read of tpv.fits:
error in ra =  [ 0.01899386] arcsec
error in dec =  [-0.01494937] arcsec

Since the original WCS was already using the FK5 system, I wouldn't have thought this would do anything to reduce the precision of the calculation.

rmjarvis commented 9 years ago

OK, for the purposes of our use in GalSim, I think I've settled on using the following code snippet when we use pyast to read in a WCS from a FITS file:

wcs = fc.read()
if wcs is None:
    raise RuntimeError("Failed to read WCS information from fits file")
if wcs.getframe(starlink.Ast.CURRENT).LonAxis != 1:
    wcs = wcs.findframe( starlink.Ast.SkyFrame( "system=FK5") )
    if wcs is None:
        raise RuntimeError("The WCS read in does not define a pair of celestial axes" )

Then it will usually use the initial read result, which seems to be the most accurate if it works. But if that doesn't have RA, Dec for its axes (which I think the LonAxis test checks?), then we need to convert to something that does. Then afterward, I can just count on tran working the way that I had expected it to work.

Does this seem right to you?

rmjarvis commented 9 years ago

Oh, and I should probably add that I've decided to leave in the overrides for the success=1 results for SIP, TPV, and ZPX. There are enough situations where registration errors of 0.1 pixels will be too large, so it seems safer to just stick with the pyast proprietary format rather than the lower precision fits encoding in these cases. (e.g. Weak lensing measurements with LSST require a registration accuracy of ~5mas, which is more like 0.02 pixels.)

dsberry commented 9 years ago

Regarding the larger errors when using findframe, I think it is due to the tpv.fits file using ICRS rather than FK5. The tpv.fits header includes:

RADECSYS= 'ICRS ' / Astrometric system

So the DS9 values (which BTW are produced by AST since DS9 uses AST) are ICRS but the values produced by the findframe FrameSet are FK5, which accounts for the errors. Changing the template supplied to findframe from FK5 to ICRS seems to fix the problem.

Would fixing this issue tempt you to move from using lonaxis to using findframe?

rmjarvis commented 9 years ago

I'd rather just test whether the frame is some valid system -- ICRS or FK5 -- rather than force it into one or the other. Is there a better way to test whether the existing Frame is already a celestial system besides using LonAxis?

timj commented 9 years ago

Check if it's domain is "SKY" ?

rmjarvis commented 9 years ago

Nope. It's 'SKY' for the one that has the axes flipped as (Dec, RA).

timj commented 9 years ago

I've had a similar debate to this with respect to NDF sections in the Starlink software. I thought there was an open issue for it on the Starlink github repo but I can't find it now.

I still think I agree with @rmjarvis that a 2-d SKY frame is a single entity that maps RA/Dec or Glong/Glat or whatever to something else. A user always expects RA to come first in the axis specification regardless of how that axis relates to the underlying pixel coordinates in the image. It's worse for NDF sections because the user does not have the ability to programmatically check lonaxis/lataxis. I think that compound frames, e.g. SKY-SPECTRUM or whatever, are a distraction because people aren't expecting SKY-SPECTRUM to act the same as SPECTRUM-SKY but they are expecting within SKY That longitude comes first.

dsberry commented 9 years ago

@rmjarvis The LonAxis test is as good as any other (if the frame is not a skyframe, an exception will be raised saying that he LonAxis attribute is undefined). But beware that you then can't assume anything about the specific coordinate system represented by your WCS FrameSet - it will represent whatever was in the FITS file, whether that be FK5, ICRS, galactic, ecliptic, or whatever.

Checking that Domain is "SKY" does identify celestial systems, since (ra,dec) and (dec,ra) are both celestial systems, as are (glon,glat), (glat,glon), (elon,elat), etc.

dsberry commented 9 years ago

@timj A human putting together an NDF section specification is in the same situation as a bit of software - the first question to be asked is "what do the axes of this data file represent? RA/Dec/Glon/wavelength/MJD/focal planeX/Data value or what?". The answer to this question determines if you can use the file or not. But having answered that question, you will also now know the order of the axes. If I give you an image containing M57, and you want to display the centre of the galaxy, you get the (RA,Dec) of the centre from some catalogue, but before you can display it you need to check that the image I have given you has (RA,Dec) axes. If it instead has (glon,glat) axes for instance then you need to either convert the image to (RA,Dec) or convert your coordinates to galactic. So you run NDFTRACE or whatever to query the properties of the axes. But in so doing you also discover the order of the axes, and hence know how to construct the section specification.

The notion that the axes of a Frame may be permuted into any convenient order is deeply embedded in the AST model, and has been in use a long time in a range of starlink software. Software that uses AST to do all it's WCS handling hardly ever needs to know the axis order - AST takes care of it. The places where you do need to know the axis order is when interacting with some external system, whether it be human or software. It's not unreasonable to expect there to be some interface layer that arbitrates between AST and the external system. After all, it's not only axis order that needs to be arbitrated, but also all the other properties of the coordinate system, like the specific coordinate system in use (FK5/ICRS, wavelength/frequency, UTC/TAI, etc). If you are going to mandate a specific axis order, then it would seem logical to mandate also all the other properties of the coordinate system, and say for instance that all sky positions must be given as ICRS (RA,Dec) - thus excluding ICRS (Dec,RA), FK5 (RA,Dec), (GLat,Glon), (ELon,ELat) etc. This is contrary to the way AST does things - AST aims to give you the freedom to express your coordinate system just how you want to.

dsberry commented 9 years ago

@timj If you had an image in which Dec was parallel to pixel axis 1 and RA was parallel to pixel axis 2 (i.e. north to the right and east upwards), would you still want the WCS axes specified in (Ra,Dec) order? I can see it would be advantageous in some circumstances, but on the other hand there could be other circumstances where people who would say that "axis 1" should be the Dec axis in such cases in order to give a clue that the image is rotated. FITS-WCS paper I follows this later approach in its recommendation that the CD or PC matrix should have its largest terms on the diagonal (i.e. each pixel axis should be associated with the WCS axis that is most nearly parallel to it).

dsberry commented 9 years ago

@rmjarvis There is another alternative to LonAxis that may appeal to you. If you use the findframe approach, but do not specify a system, then the system (i.e. FK5, ICRS, etc) will be left unchanged. So if you do:

wcs = wcs.findframe( Ast.SkyFrame() )

then the only change that will be made to "wcs" is to flip the axis order to match that of a newly created SkyFrame (i.e. RA,Dec). All other properties of the coordinate system should remain unchanged.

rmjarvis commented 9 years ago

wcs = wcs.findframe( Ast.SkyFrame() )

This works. I'll use that. Thanks!

If you had an image in which Dec was parallel to pixel axis 1 and RA was parallel to pixel axis 2 (i.e. north to the right and east upwards), would you still want the WCS axes specified in (Ra,Dec) order?

Yes. Always. I don't see any problem with having a CD or PC matrix that is close to (0,1,1,0) or (0,1,-1,0) or (1,0,0,-1) or (0.6,0.8,-0.8,0.6) or any other rotation or flip. Alt-Az telescopes (LSST for instance) will frequently change between being near theta = 0 and theta = 90. Would you advocate changing the FITS WCS convention in the image files every time the rotator passes through 45 degrees?

At the very least I always want my wcs software to return (ra,dec) when I ask it for the celestial position at some point in the image. But at least I now have a programmatic way to get PyAst to do that, so I'm fine with the SkyFrame() solution.

rmjarvis commented 9 years ago

Just to follow up on this. As far as I'm concerned, I'm happy with the resolution here, so this issue can probably be closed. Thanks for the help with that.

I do think the user interface of the tran command could be improved somewhat to make it more obvious how to get (ra,dec) out of it reliably. Like a kwarg ra_dec that raises an exception if the frame isn't about RA and Dec perhaps, but in the normal case makes sure to return (RA,Dec) in that order. I think that could be a bit more intuitive than the current situation where the user needs to know what order the Frame stores its axes.

Or if you don't like that, it could be helpful to add something to the doc string to let people know about the option of using findframe(Ast.SkyFrame()) to get a version of the wcs that does return (RA,Dec) in that order. I'm sure I'm not the only user for whom this is desired behavior. Most people probably just don't realize that their code is fragile to the possibility of having a wcs with axes in the opposite order.

But if you don't want to do any of these, feel free to just close this issue and be done with it. These are just suggestions from the peanut gallery. :)

dsberry commented 9 years ago

Thanks for the suggestions. I'll probably go with your second option of adding further explanations to the docs. The issue with the first as I mentioned before is that it sort of assumes AST will only ever be used with plain (ra,dec) wcs. The input WCS could in principle represent almost anything - a (time,focal planeX,focal planeY) 3-D cube, or a 1-D spectrum for instance - so the principle is "when importing WCS into AST from some external source, always allow for the possibility that the WCS axes may not match the requirements of your application, in type or order".

dsberry commented 9 years ago

And my problem with the kwarg ra_dec that raises an exception is that it's a special case. In principle, if such an option existed, you should also have similar options for all other possible combination of axes. In fact I suppose the findframe approach is a generalised equivalent to what you suggest.

dsberry commented 9 years ago

I've added an example using findframe to the the pyast examples page at http://timj.github.io/starlink-pyast/node4.html (look for "Ensure that WCS Axes are in a Specified Order"). I've also added comments to the docs for the tran and findframe methods linked from http://timj.github.io/starlink-pyast/pyast.html.

rmjarvis commented 9 years ago

Good. Thanks for this. One slight typo I noticed:

print( "Axis 3: {0}".format( newwcs.Label_2 ) )

Should be Axis 2.

dsberry commented 9 years ago

Fixed. Thanks for pointing it out.

Starlink / starlink-pyast

WCS information for SIP files doesn't survive write/read round trip to FITS file. #8