Closed: boada closed this issue 7 years ago
After a little more investigating, it seems to have something to do with the image headers. When I break the 4-CCD file into 4 single-CCD files but write the original primary HDU to each file, I still get the error.
In [47]: hdulist.info()
Filename: image_ccd1.fits
No.    Name      Type         Cards   Dimensions     Format
0      PRIMARY   PrimaryHDU     179   ()
1      im1       ImageHDU       179   (2112, 2048)   int32
In [48]: ic1 = ImageFileCollection('.', keywords=keys) # only keep track of keys
...:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-48-85b0c52a8b28> in <module>()
----> 1 ic1 = ImageFileCollection('.', keywords=keys) # only keep track of keys
/home/boada/.local/lib/python3.5/site-packages/ccdproc/image_collection.py in __init__(self, location, keywords, info_file)
101
102 if keywords:
--> 103 self.keywords = keywords
104
105 @property
/home/boada/.local/lib/python3.5/site-packages/ccdproc/image_collection.py in keywords(self, keywords)
198 # Reorder the keywords to match the initial ordering.
199 new_keys.sort(key=keywords.index)
--> 200 self._summary_info = self._fits_summary(header_keywords=new_keys)
201
202 @property
/home/boada/.local/lib/python3.5/site-packages/ccdproc/image_collection.py in _fits_summary(self, header_keywords)
431 continue
432
--> 433 summary_table = Table(summary_dict, masked=True)
434
435 for column in summary_table.colnames:
/home/boada/.local/lib/python3.5/site-packages/astropy/table/table.py in __init__(self, data, masked, names, dtype, meta, copy, rows, copy_indices, **kwargs)
369
370 # Finally do the real initialization
--> 371 init_func(data, names, dtype, n_cols, copy)
372
373 # Whatever happens above, the masked property should be set to a boolean
/home/boada/.local/lib/python3.5/site-packages/astropy/table/table.py in _init_from_dict(self, data, names, dtype, n_cols, copy)
668
669 data_list = [data[name] for name in names]
--> 670 self._init_from_list(data_list, names, dtype, n_cols, copy)
671
672 def _init_from_table(self, data, names, dtype, n_cols, copy):
/home/boada/.local/lib/python3.5/site-packages/astropy/table/table.py in _init_from_list(self, data, names, dtype, n_cols, copy)
633 cols.append(col)
634
--> 635 self._init_from_cols(cols)
636
637 def _init_from_ndarray(self, data, names, dtype, n_cols, copy):
/home/boada/.local/lib/python3.5/site-packages/astropy/table/table.py in _init_from_cols(self, cols)
698 if len(lengths) != 1:
699 raise ValueError('Inconsistent data column lengths: {0}'
--> 700 .format(lengths))
701
702 # Set the table masking
ValueError: Inconsistent data column lengths: {8, 4}
Here's the primary HDU with personal info removed.
SIMPLE = T / File conforms to FITS standard
BITPIX = 8 / Bits per pixel (not used)
NAXIS = 0 / PHU contains no image matrix
EXTEND = T / File contains extensions
NEXTEND = 4 / Number of extensions
FILENAME= 'image.fits' / Original host filename
OBJECT = 'xxxxx' / Observation title
OBSTYPE = 'object ' / Observation type
RADECSYS= 'FK5 ' / Default coordinate system
RADECEQ = 2000. / Default equinox
OBJEPOCH= 2000 / [yr] Epoch of target coordinates
TIMESYS = 'UTC ' / Time system
OBSERVAT= 'KPNO ' / Observatory
TELESCOP= 'KPNO 4.0 meter telescope' / Telescope
TELRADEC= 'FK5 ' / Telescope coordinate system
TELEQUIN= 2000 / Equinox of tel coords
INSTRUME= 'NEWFIRM ' / Mosaic detector
MOSSIZE = '[1:4096,1:4096]' / Mosaic detector size
NDETS = 4 / Number of detectors in mosaic
Thank you for the report, that seems like a severe issue with multi-extension fits files that we should resolve.
But it's really hard to debug this issue without knowing the astropy/numpy version (especially because the error happens in astropy!) and without having these files.
Would it be possible to share some of your files or "similar" files that throw the same error?
Or, as an alternative, could you please add a debug statement (for example print(summary_dict)) at line 432 of image_collection.py to show what triggers this error?
Note that GitHub supports <details> text </details> to hide long tracebacks or code; alternatively, just put it in a gist.
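The debug step suggested above could be as small as reporting the length of each column before the table is built. A minimal sketch, assuming `summary_dict` maps each keyword name to a list with one value per file; the helper name `report_column_lengths` and the sample data are hypothetical and not part of ccdproc:

```python
from collections import Counter


def report_column_lengths(summary_dict):
    """Return {column name: length} for columns whose length differs
    from the most common column length."""
    lengths = {name: len(values) for name, values in summary_dict.items()}
    if not lengths:
        return {}
    # Treat the most frequent length as the expected number of rows.
    expected, _ = Counter(lengths.values()).most_common(1)[0]
    return {name: n for name, n in lengths.items() if n != expected}


# Made-up data: four files, but one keyword collected eight values.
summary_dict = {
    "file": ["a.fits", "b.fits", "c.fits", "d.fits"],
    "object": ["x", "x", "x", "x"],
    "suspect_key": ["value", 1] * 4,
}
suspects = report_column_lengths(summary_dict)
print(suspects)
```

Any column named in the result is a candidate for the keyword that makes the table construction fail.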
In [4]: astropy.__version__
Out[4]: '1.2.1'
In [5]: numpy.__version__
Out[5]: '1.11.2'
I'm not really sure how to go about sharing some files. Maybe some of the non-science data, like a dark frame. I will see if I can put something together.
@boada -- thanks for the additional information. One option (if it isn't too much of a hassle) would be to strip out the pointing information and replace the real data with random numbers. Another would be to email a file directly to me with the understanding that its content would be kept confidential.
I'll see if I can generate an error just by throwing some multi-extension FITS files at ImageFileCollection.
Just to be clear about the desired behavior: if there are multiple extensions in each file, would you like the generator to loop over both the extensions and the files, or would you like to be able to extract a specific extension?
I'm trying to reduce some spectroscopic data and just hit this issue. To answer the question above, I could see using both an all-extension-loop functionality and a single extension specification, but right now I'd like to specify the extension.
Happy to provide an example FITS file if that's helpful.
@vrooje -- would you want the same extension in all of the files, or would it vary from file to file? Implementing either should be fairly straightforward (I think).
I think I'd always be running this in batches where all the files were from the same instrument, so the same extension should work... the instrument does have separate blue and red channels but a) I've been running them separately and b) I think the file structure is still the same, just different dimensions.
I would like to work on this issue. So, just to be clear one last time, @crawfordsm @mwcraig , if the user specifies an extension, you should extract that extension from all files in a given directory?
@vrooje I'm trying to work on solving this issue. Could you provide that sample FITS file?
@janga1997
So, just to be clear one last time, @crawfordsm @mwcraig , if the user specifies an extension, you should extract that extension from all files in a given directory?
Yes, I would imagine adding a keyword like extension to the list of arguments, and returning the hdu/header/data/etc. for that extension. The default right now is to return the first extension, I believe.
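As a rough illustration of that idea, the sketch below reads the header and data of one chosen extension from every FITS file in a directory, using `astropy.io.fits` directly. The function name `iter_extension` and its `extension` argument are hypothetical; this is not the ccdproc API, only a sketch of the behavior being discussed.

```python
from pathlib import Path

from astropy.io import fits


def iter_extension(directory, extension=0):
    """Yield (filename, header, data) for one extension of each FITS file.

    `extension` may be an integer index or an extension name.
    """
    for path in sorted(Path(directory).glob("*.fits")):
        with fits.open(str(path)) as hdul:
            hdu = hdul[extension]
            # Copy so the values stay usable after the file is closed.
            header = hdu.header.copy()
            data = None if hdu.data is None else hdu.data.copy()
        yield path.name, header, data
```

Something like `for name, header, data in iter_extension('.', extension=1): ...` would then visit the same extension in every file, which matches the batch use case described earlier in the thread.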
b207_os_bs_ff_cr.fits.zip r207_os_bs_ff_cr.fits.zip
Here's a Kast observation of a standard star, one file for the blue channel and one file for the red channel. I wouldn't necessarily include these in the same collection, just making sure you have plenty of data to play with. Hope these uploaded okay...
@boada @vrooje -- I think the underlying issue with these files is not simply that they are multi-extension FITS files (though support for those is still needed).
In the files from @vrooje, the keyword OBSTYPE shows up twice in the primary header, once with value 'OBJECT' and again with value 1. Based on @boada's comment https://github.com/astropy/ccdproc/issues/423#issuecomment-261351015 I'm guessing the same issue occurred there, too.
How should cases like this be handled? Rather than continuing the discussion in this issue, I'm opening up a separate one: #464
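To illustrate why a repeated keyword could produce the `{8, 4}` column lengths in the traceback, here is a simplified model. It assumes the summary is built by appending one value per matching header card to a per-keyword list; that is an assumption about the mechanism, not ccdproc's actual code. Headers are modeled as plain lists of `(keyword, value)` pairs, and both function names are hypothetical:

```python
from collections import Counter, defaultdict


def find_duplicate_keywords(cards):
    """Return keywords that appear more than once in a list of
    (keyword, value) cards."""
    counts = Counter(keyword for keyword, _ in cards)
    return sorted(k for k, n in counts.items() if n > 1)


def naive_summary(headers, keywords):
    """Append one value per matching card, so a duplicated keyword
    lengthens its column."""
    summary = defaultdict(list)
    for cards in headers:
        for keyword, value in cards:
            if keyword in keywords:
                summary[keyword].append(value)
    return dict(summary)


# Made-up header mimicking the reported case: OBSTYPE appears twice.
cards = [("OBJECT", "xxxxx"), ("OBSTYPE", "OBJECT"), ("OBSTYPE", 1)]
headers = [cards] * 4  # four files sharing the same header layout

summary = naive_summary(headers, {"OBJECT", "OBSTYPE"})
lengths = {len(values) for values in summary.values()}
print(find_duplicate_keywords(cards), lengths)
```

In this model the OBJECT column ends up with four entries and the OBSTYPE column with eight, which is the kind of mismatch that a table constructor rejects.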
Closing issue, but if the fixes do not address your needs, please re-open the issue.
I have a bunch of imaging data where each of the instrument's CCDs has been written to a different extension in the same file:
Running ImageFileCollection on a directory which contains a single image gives:
However, if I break the FITS file into files that each contain only a single CCD's data, then everything seems to work as it should, but I seem to lose any info about what is contained in the files.
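The splitting step described above might look like the following sketch, which writes one file per image extension and carries a copy of the original primary header into each. It uses `astropy.io.fits`; the function name `split_extensions` and the `_ccdN` output naming are assumptions chosen for illustration, not the code actually used in this report.

```python
from pathlib import Path

from astropy.io import fits


def split_extensions(path, out_dir):
    """Write one file per image extension, each with a copy of the
    primary header. Returns the list of paths written."""
    path, out_dir = Path(path), Path(out_dir)
    written = []
    with fits.open(str(path)) as hdul:
        primary_header = hdul[0].header
        for index, hdu in enumerate(hdul[1:], start=1):
            # Only plain image extensions with pixel data are handled here.
            if not isinstance(hdu, fits.ImageHDU) or hdu.data is None:
                continue
            primary = fits.PrimaryHDU(header=primary_header.copy())
            image = fits.ImageHDU(data=hdu.data, header=hdu.header.copy())
            out_path = out_dir / f"{path.stem}_ccd{index}.fits"
            fits.HDUList([primary, image]).writeto(str(out_path), overwrite=True)
            written.append(out_path)
    return written
```

Keeping the primary header in every output file preserves the observation metadata, at the cost of also carrying over anything problematic in that header.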