astrogo / fitsio

fitsio is a pure-Go package to read and write `FITS` files
BSD 3-Clause "New" or "Revised" License
54 stars 22 forks source link

support writing of image cubes #50

Closed brandondube closed 4 years ago

brandondube commented 4 years ago

this fix begins to allow fitsio to properly write image cubes, but does not fully complete that work (I think). I don't know whether NAXIS > 3 would violate the standard, but this expansion allows it to be arbitrary without special casing 1 and 2 dims.

My application is streaming images from a wavefront sensor that runs at 500Hz to disk for archive parallel to the control loop. I double buffer the images and roll them up into bundles of up to 10k, which produces files that look like 0-9999.fits, 10000-19999.fits, and so forth. These will be ~800MB a piece. I expect each to be a cube of 10000xNxM. This chunking allows some level of "streaming" and avoids cfitsio's 2GB limit for readers in matlab and python.

it would be pretty optimal if *fitsio.imageHDU allowed multiple calls to write, but the buffer size internal to fitsio is set on write, so there will be something like data corruption from multiple calls to write. I would expect that this PR would include additional commits that make those changes. I am not yet familiar enough with fitsio's guts to finish making them.

brandondube commented 4 years ago

o/

This PR isn't really done - I opened it early since fitsio seems relatively maintenance mode and I figured there would be some delay where I can make the rest of the changes in parallel. The third commit, 20c75ecb4011c956524f4cbd3b8e8e0c6545ff59 actually rolls in several of your comments ;)

I'll finish the changes and add some tests. I'm doing this at work, and the files it's making at the moment are "valid" in that they more or less meet the spec (astropy can open them). But the actual data is garbage, which I don't understand yet.

GSFC's FITS primer says you can have 1 to 999 axes; I'll add that check to the loop.

codecov-commenter commented 4 years ago

Codecov Report

Merging #50 into master will increase coverage by 0.05%. The diff coverage is 55.55%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #50      +/-   ##
==========================================
+ Coverage   68.69%   68.75%   +0.05%     
==========================================
  Files          14       14              
  Lines        2923     2922       -1     
==========================================
+ Hits         2008     2009       +1     
+ Misses        769      768       -1     
+ Partials      146      145       -1     
Impacted Files Coverage Δ
phdu.go 70.21% <ø> (+16.11%) :arrow_up:
image.go 47.41% <20.00%> (-1.31%) :arrow_down:
header.go 68.26% <100.00%> (-0.28%) :arrow_down:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 82a46ea...e00cf48. Read the comment docs.

brandondube commented 4 years ago

o/

I added the NAXIS>999 check as well as a round tripping test. If you actually write a file to disk, you can use astropy (wraps cfitsio) to check:

In [5]: fits.getdata('foo.fits')                                                
Out[5]: 
array([[[ 0,  1],
        [ 2,  3],
        [ 4,  5],
        [ 6,  7]],

       [[ 8,  9],
        [10, 11],
        [12, 13],
        [14, 15]],

       [[16, 17],
        [18, 19],
        [20, 21],
        [22, 23]],

       [[24, 25],
        [26, 27],
        [28, 29],
        [30, 31]],

       [[32, 33],
        [34, 35],
        [36, 37],
        [38, 39]],

       [[40, 41],
        [42, 43],
        [44, 45],
        [46, 47]]], dtype=uint16)

In [6]: fits.getdata('foo.fits').ravel()                                        
Out[6]: 
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
       34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47],
      dtype=uint16)

This is 2 x 4 x 6 instead of the sequence 2x3x6, but you can see the data comes out correctly. The header is:

In [7]: fits.getheader('foo.fits')                                              
Out[7]: 
SIMPLE  =                    T / primary HDU                                    
BITPIX  =                   16 / number of bits per data pixel                  
NAXIS   =                    3 / number of data axes                            
NAXIS1  =                    2 / length of data axis 1                          
NAXIS2  =                    4 / length of data axis 2                          
NAXIS3  =                    6 / length of data axis 3                          
BZERO   =                32768 /                                                
BSCALE  =             1.000000 / 

I did not add a test for panic on naxis > 999.

sbinet commented 4 years ago

hi Brandon,

could you let me know (with a simple PTAL or something) when that PR is ready? thanks a lot (and thanks for the interest in astrogo/fitsio)

brandondube commented 4 years ago

oh, sorry -- the previous comment was meant as PTAL :)

It is unrelated to this PR, but I tried to refactor fitsio.imageHDU with a bytes.Buffer to allow the images to be appended -- here. If you could take a look there, too, I'd appreciate it. The fitsio code is a lot simpler, but it's also twice as slow and moves around twice as much memory and I don't know why.

sbinet commented 4 years ago

needs https://github.com/astrogo/license/pull/4