Open demitri opened 7 years ago
As you can see I am actually doing the TZERO myself, cfitsio is not doing it.
There is no natural way to do deal with this given that the column is in fact declared as signed 8 byte. This is a giant kludge propagated in the cfitsio code base. It can work there because the user can always cast the type as needed.
However fitsio always assumes that you want to read into the type that is declared in the file; how else can we sensibly interpret the file without user override? Also fitsio internally reserves the right to read the bytes directly for big efficiency gains (hence doing tzero myself), so remapping the type won't work.
A dumb kludge would be for the user to tell fitsio to override the data type, but this would require twice the memory, since we would have to make a new version of the data and copy into it.
I completely agree; the whole thing is hacky.
While the column is declared as a 64-bit long signed integer (TFORM "K"), the FITS spec says that IF the declared type is a signed 64-bit integer AND TZERO = 9223372036854775808 (2^^63), then the column's physical data type is an unsigned 64-bit integer. See Table 11 in the FITS standard. It's a hack, but that's what the specification is. The cfitsio library doesn't help in that it doesn't support it.
However fitsio always assumes that you want to read into the type that is declared in the file; how else can we sensibly interpret the file without user override?
So, given the above, the column is in fact declared (in a completely hacky way) to be an unsigned 64-bit integer. No assumptions or second-guessing need to be made; it's what the file is unambiguously declaring. (A hack, yes, but unambiguous.)
I would request that these circumstances be recognized and the correct data type be applied. From the table referenced above, this applies to signed bytes, and unsigned 16-, 32-, and 64-bit integers. The alternative is that fitsio crashes (as it did for me), and the user having to manually set the data type for each column like this, which isn't really preferable.
In the meantime, is there a way I can read this data with the current version of fitsio by manually specifying the data type? Astropy really doesn't handle reading row by row for a 12GB HDU.
Another quick note - while cfitsio does allow you to specify the data type to read a column into, it will crash with a numeric overflow error reading such a columns internally before it gets to that point.
I see.
Yes, I agree the data type can be inferred for these types.
No there is no way to do this in fitsio currently.
I'm reading a single row from the latest SDSS DR14 spAll file (
spAll-v5_10_0.fits
) and am getting this error:I ran into this problem in Nightlight as well. The problem can be reduced to reading this cell in the table:
These are the header values for the relevant column:
The physical value should be: 11258999466550599680. The combination of the "K" TFORM value and the TZERO value is how the FITS format allows unsigned longs. The cfitsio library does not currently support unsigned long data types and throws a numeric overflow error. The physical value is calculated as:
value in table
* TSCAL + 9223372036854775808 = 11258999466550599680I've received no detail on when this might be fixed in the underlying library (i.e. not soon). The patch (given to me by Bill Pence) is to set TZERO for the column to 0, read the data from the file via cfitsio as stored in the file, then add "9223372036854775808" to the values yourself.
For your wrapper, you'd probably want to look for those TSCAL and TZERO values and change the mapped C type to "unsigned long" (but only then so as not to waste memory). Note also that this applies to several unsigned int types (see the FITS specification).
This will be an issue for many SDSS DR14 files, which (I'm sure you know) is due next month.