krakrjak / fits-parse

Parse FITS files for Astronomy data analysis.
BSD 2-Clause "Simplified" License
2 stars 1 forks source link

Support for 64 bit numbers #8

Open seanhess opened 2 months ago

seanhess commented 2 months ago

Hey @krakrjak,

Right now we parse numbers as either Int or Float, but these don't carry enough precision to parse the number of digits available in the fits spec. They both only support about 10 significant digits.

I need a way to support Int64 and Double (64 bit float)

Value is currently defined as:

data Value
    = Integer Int
    | Float Float
    | String Text
    | Logic LogicalConstant
    deriving (Show, Eq)

Can we refactor that to this?

data Value
    = Integer Int64
    | Float Double
    | String Text
    | Logic LogicalConstant
    deriving (Show, Eq)

Any thoughts on other approaches?

krakrjak commented 2 months ago

Hi @seanhess,

Yeah, I see the problem. Let me look at this a little closer. I've been at ICFP this last week and will be traveling around Italy until Wednesday September 18th. So don't be surprised if my responses are delayed a bit.

IIRC internally we do take into account all the different types of bit formats supported using the BitPixFormat.

{-| The 'BitPixFormat' is the nitty gritty of how the 'Axis' data is layed [sic]
    out in the file. The standard recognizes six formats: unsigned 8 bit
    integer, two's complement binary integers at 16, 32, and 64 bits along
    with 32 and 64 bit IEEE floating point formats.
-}
data BitPixFormat =
      EightBitInt       -- ^ BITPIX = 8; unsigned binary integer of 8 bits
    | SixteenBitInt     -- ^ BITPIX = 16; two's complement binary integer of 16 bits
    | ThirtyTwoBitInt   -- ^ BITPIX = 32; two's complement binary integer of 32 bits
    | SixtyFourBitInt   -- ^ BITPIX = 64; two's complement binary integer of 64 bits
    | ThirtyTwoBitFloat -- ^ BITPIX = -32; IEEE single precision floating point of 32 bits
    | SixtyFourBitFloat -- ^ BITPIX = -64; IEEE double precision floating point of 64 bits
    deriving (Eq)

Along with the Pix type.

- | Following `BitPixFormat` we have a tag for integer and floating point
     values. We box them up to ease parsing.
-}
data Pix = PB Int | PI16 Int | PI32 Int | PI64 Int | PF Double | PD Double

Does PI64 need to be Int64? The Pix Int types aren't specifically sized and may be getting truncated (even PI32 probably doesn't cover all of Int32 see Data.Int ).

Which is used in the binary data parsing via getPix.

getPix :: BitPixFormat -> Get Pix
getPix EightBitInt       = PB . fromIntegral <$> getInt8
getPix SixteenBitInt     = PI16 . fromIntegral <$> getInt16be
getPix ThirtyTwoBitInt   = PI32 . fromIntegral <$> getInt32be
getPix SixtyFourBitInt   = PI64 . fromIntegral <$> getInt64be
getPix ThirtyTwoBitFloat = PF . realToFrac <$> getFloatbe
getPix SixtyFourBitFloat = PD . realToFrac <$> getDoubleb

Remember the Value data type is only for Keywords in an HDU not the data in the FITS data cube payload. Do we need to support these same formats in Keywords as well? Do you have a reference to this or an example file to add to the repo that needs this addition? I'm certainly not against growing the Keyword payload parsing as needed.

seanhess commented 2 months ago

Right, sorry, I should've been more clear. I'm only talking about headers. I noticed while writing code to embed headers for multiple NSO Fits files into a single Asdf file. Our Level 1 data uses 64 bit values for these.

So, you're right, we can parse data as 64 bit, but not headers.

Take a look at these WCS headers from NSO's data:

WCSNAME = 'Helioprojective-cartesian'
WCSNAMEA= 'Equatorial equinox J2000'
CRPIX1  =   -85.86772915846299 / [pix]
CRPIX2  =                477.0 / [pix]
CRPIX3  =    17.52015992216496 / [pix]

Those CRPIX values have 16 significant digits, so we would lose precision if I tried to read the header and encode it as 64 bit later.

I need to be able to parse/store Headers as 64 bit values. Maybe it makes sense to "upgrade" the defaults for Value: Integer and String to 64 bit?

krakrjak commented 1 month ago

Are we still at a stage that we need to solve this in the headers? Do we need to do more plumbing to ensure 64 bit Double and Integer values are properly supported? This seems like a good deviation from the standard to adopt as it's less surprising than dropping accuracy in header values.

We likely need to document this extension support in the README as well.

And additional question. These header entries are

CRPIX1  =   -85.86772915846299 / [pix]
CRPIX2  =                477.0 / [pix]
CRPIX3  =    17.52015992216496 / [pix]

Are those the exact header strings we need to parse? So a 64 Bit floating point number / and [pix] ?

seanhess commented 1 month ago

Sorry I forgot about this, I’ll make a PR next week.

On Tue, Oct 1, 2024 at 4:51 PM Zac Slade @.***> wrote:

Are we still at a stage that we need to solve this in the headers? Do we need to do more plumbing to ensure 64 bit Double and Integer values are properly supported? This seems like a good deviation from the standard to adopt as it's less surprising than dropping accuracy in header values.

We likely need to document this extension support in the README as well.

— Reply to this email directly, view it on GitHub https://github.com/krakrjak/fits-parse/issues/8#issuecomment-2387220627, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAAD66JDNEJYJGBZON5LHY3ZZMROLAVCNFSM6AAAAABN2GFTHCVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDGOBXGIZDANRSG4 . You are receiving this because you were mentioned.Message ID: @.***>

seanhess commented 3 weeks ago

PR sent!