esheldon / fitsio

A python package for FITS input/output wrapping cfitsio
GNU General Public License v2.0
134 stars 58 forks source link

Regression from 1.1.2 to 1.1.3: "keyword not found in header" #325

Closed dstndstn closed 3 years ago

dstndstn commented 3 years ago

Hi,

I'm trying to read a Pan-STARRS DR2 stacked image and while 1.1.2 works, 1.1.3 and 1.1.4 fail with:

  File "/Users/dstn/legacypipe/py/legacypipe/image.py", line 1105, in read_image_header
    self._image_header = self.read_image_fits()[self.hdu].read_header()
  File "/usr/local/lib/python3.8/site-packages/fitsio/hdu/base.py", line 343, in read_header
    return FITSHDR(self.read_header_list())
  File "/usr/local/lib/python3.8/site-packages/fitsio/hdu/base.py", line 358, in read_header_list
    return self._FITS.read_header(self._ext+1)
OSError: FITSIO status = 202: keyword not found in header

A small subimage reproduces the error: http://broiler.astrometry.net/~dstn/temp/sub.fits ie,

> python3 -c "import fitsio; fitsio.read_header('/tmp/sub.fits')"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/local/lib/python3.8/site-packages/fitsio/fitslib.py", line 238, in read_header
    return FITSHDR(_fits.read_header(hdunum))
OSError: FITSIO status = 202: keyword not found in header

These images don't have an EXTNAME card, but that doesn't seem to be the issue.

fitsverify doesn't complain about the image.

It does have HIERARCH cards.

dstndstn commented 3 years ago

I'm digging into this in gdb... the error seems to be while reading the "HIERARCH FPA.ZP" header card.... it's maybe doing something weird thinking that the "." is a wildcard?

Read keyname FPA.ZP

Thread 1 "python" hit Hardware watchpoint 6: status

Old value = 0
New value = 202
0x00007fffe201381b in ffgcrd (fptr=0xd252c0, name=0x7fffffffd9b0 "FPAKZP", card=0x7fffffffd6f0 "HIERARCH FPA.ZP =          25. / Magnitude zero point", status=0x7fffffffd858) at getkey.c:638
638     return(*status = KEY_NO_EXIST);  /* couldn't find the keyword */
(gdb) where
#0  0x00007fffe201381b in ffgcrd (fptr=0xd252c0, name=0x7fffffffd9b0 "FPAKZP", card=0x7fffffffd6f0 "HIERARCH FPA.ZP =          25. / Magnitude zero point", status=0x7fffffffd858) at getkey.c:638
#1  0x00007fffe201316b in ffgkey (fptr=0xd252c0, keyname=0x7fffffffd9b0 "FPAKZP", keyval=0x7fffffffd790 "", comm=0x7fffffffd910 "", status=0x7fffffffd858) at getkey.c:437
#2  0x00007fffe2014b49 in ffgkyd (fptr=0xd252c0, keyname=0x7fffffffd9b0 "FPAKZP", value=0x7fffffffd8a0, comm=0x7fffffffd910 "", status=0x7fffffffd858) at getkey.c:1272
#3  0x00007fffe2013075 in ffgky (fptr=0xd252c0, datatype=82, keyname=0x7fffffffd9b0 "FPAKZP", value=0x7fffffffd8a0, comm=0x7fffffffd910 "", status=0x7fffffffd858) at getkey.c:397
#4  0x00007fffe1fa41ae in PyFITSObject_read_header (self=0x7ffff78e8550, args=0x7ffff787d4c0) at fitsio/fitsio_pywrap.c:4388
#5  0x0000000000505a1d in ?? ()
#6  0x000000000056acb6 in _PyEval_EvalFrameDefault ()
#7  0x0000000000568d9a in _PyEval_EvalCodeWithName ()
#8  0x00000000005f5b33 in _PyFunction_Vectorcall ()
#9  0x000000000056fb87 in _PyEval_EvalFrameDefault ()
#10 0x0000000000568d9a in _PyEval_EvalCodeWithName ()
#11 0x000000000068cdc7 in PyEval_EvalCode ()
#12 0x000000000067e161 in ?? ()
#13 0x000000000067e1df in ?? ()
#14 0x000000000067e32f in PyRun_StringFlags ()
#15 0x000000000067e38f in PyRun_SimpleStringFlags ()
#16 0x00000000006b6f1c in Py_RunMain ()
#17 0x00000000006b71ed in Py_BytesMain ()
#18 0x00007ffff7df40b3 in __libc_start_main (main=0x4ef190 <main>, argc=3, argv=0x7fffffffe238, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffe228)
    at ../csu/libc-start.c:308
#19 0x00000000005f96de in _start ()
dstndstn commented 3 years ago

Here's a smaller example that shows the same error: http://broiler.astrometry.net/~dstn/temp/tiny.fits

SIMPLE  =                    T / file does conform to FITS standard
BITPIX  =                   16 / number of bits per data pixel
NAXIS   =                    0 / number of data axes
EXTEND  =                    T / FITS dataset may contain extensions
COMMENT   FITS (Flexible Image Transport System) format is defined in 'Astronomy
COMMENT   and Astrophysics', volume 376, page 359; bibcode: 2001A&A...376..359H
HIERARCH FPA.ZP =          25.
HIERARCH CHIP.XSIZE =        0
END

Just having the FPA.ZP card is not enough to trigger the problem; it seems to come when the CHIP.XSIZE card follows. In the original file, there are a string of "HIERARCH FPA." cards followed by a set of "HIERARCH CHIP." cards. Perhaps the "." is acting as a wildcard, and since it appears in a different spot in the string, it not longer matches?! I haven't yet understood the code well enough to know what the wildcard handling is trying to do.

dstndstn commented 3 years ago

(anyway, this seems like a CFITSIO issue)

esheldon commented 3 years ago

I think a period is not allowed in keyword names. From https://fits.gsfc.nasa.gov/standard40/fits_standard40aa-le.pdf

key

dstndstn commented 3 years ago

That's interesting.... I will have to tell my Pan-STARRS1 colleagues...

I'm continuing to dig a bit: I'm looking in gdb, and while processing the FPA.ZP card, I'm seeing:

4362                convert_keyword_to_allowed_ascii(keyname);
(gdb) p keyname
$10 = "FPA.ZP\000\000\002\000\000\000\000\000\000\000h\332\377\377\377\177\000\000\360\227n\367\377\177\000\000\060\250O\fj=g\340!TZ\000\000\000\000\000\035\263\371\341\377\177\000\000\340\354\r\342\377\177\000\000\260\336m\367\377\177\000\000\001\000"
(gdb) n
4363                add_string_to_dict(dict,"name",keyname);
(gdb) p keyname
$11 = "FPAKZP\000\000\002\000\000\000\000\000\000\000h\332\377\377\377\177\000\000\360\227n\367\377\177\000\000\060\250O\fj=g\340!TZ\000\000\000\000\000\035\263\371\341\377\177\000\000\340\354\r\342\377\177\000\000\260\336m\367\377\177\000\000\001\000"

ie, convert_keyword_to_allowed_ascii has replaced the "." by a "K".

and then it goes on to look up the value, and fails because the key name has been changed:

4388                        fits_read_key(self->fits, TDOUBLE, keyname, &dval, comment, &status);
(gdb) p keyname
$13 = "FPAKZP\000\000\002\000\000\000\000\000\000\000h\332\377\377\377\177\000\000\360\227n\367\377\177\000\000\060\250O\fj=g\340!TZ\000\000\000\000\000\035\263\371\341\377\177\000\000\340\354\r\342\377\177\000\000\260\336m\367\377\177\000\000\001\000"
(gdb) n
4389                        add_double_to_dict(dict,"value",dval);
(gdb) p status
$14 = 202
dstndstn commented 3 years ago

Oh, hilarious, it's "K" because it's character 4 of "JUNK___":

https://github.com/esheldon/fitsio/blob/master/fitsio/fitsio_pywrap.c#L80

esheldon commented 3 years ago

cfitsio is trying to cope but it has a bug in its coping mechanism

dstndstn commented 3 years ago

Since these are HIERARCH header cards, I think the part of the standard you refer to does not apply. (That's referring to the first 8 characters, which are HIERARCH in this case. Readers that don't understand HIERARCH should then treat the line similar to COMMENT or HISTORY.) The FITS extension document describing HIERARCH is (like all the extension docs I've read) very light on requirements for what the HIERARCH keyword can actually contain. If I'm being lawyerlike, I think it says they're "ASCII tokens" aside from "=".

Either way, fitsio shouldn't fail to read files like this :)

Can I propose a solution:

Here, https://github.com/esheldon/fitsio/blob/master/fitsio/fitsio_pywrap.c#L4352 what if we only call convert_keyword_to_allowed_ascii for non-HIERARCH headers?

I can put in a PR if that sounds reasonable.

Thanks! -dustin

dstndstn commented 3 years ago

I created #326 for this. "python -m fitsio.test" passes, and it reads my example image correctly. But maybe you don't want to be quite so lax as that...

esheldon commented 3 years ago

I didn't see any mention of HIERARCH in the standard at all

dstndstn commented 3 years ago

It's one of the "registered extensions", https://fits.gsfc.nasa.gov/registry/hierarch_keyword.html

esheldon commented 3 years ago

The pdf https://fits.gsfc.nasa.gov/registry/hierarch/hierarch.pdf makes it clear that there should be no "." in these names.

So this is a malformed header. Did you identify what it was between 1.1.2 and 1.1.3 that causes it to fail?

esheldon commented 3 years ago

Ahh, it is because we introduced the replacement of junk characters in header names :)

esheldon commented 3 years ago

That was a serious issue because not doing so caused seg faults.

Maybe its a book keeping issue: the character gets replaced but then this is not understood somewhere down stream.

dstndstn commented 3 years ago

Yes, that is exactly what happens, the keyword gets altered during header parsing, and then later during header parsing it looks up the value for that keyword using the original name, which then isn't found.

As for whether these headers are legal: in the HIERARCH documentation I guess you're referring to this paragraph,

Under the ESO implementation of this convention, each token string that precedes the equals sign must only contain characters that are legal in formal FITS keywords, i.e., the uppercase letters A through Z, the digits 0 through 9, and the hyphen and underscore characters. The tokens may, however, be longer than the 8 character limit of formal FITS keywords.

which hints that dots aren't allowed, but not in very strong language. That is definitely a theme in the FITS extension descriptions -- in my view, not nearly enough MUST and SHALL statements!

Regardless, since "once FITS, always FITS", files like this must still remain parseable -- I think your option are to ignore the HIERARCH cards (because if you didn't subscribe to the HIERARCH convention, these would be legal), or parse 'em one way or another.

Do you have a test case for the situation that was causing segfaults that required the convert_keyword_to_allowed_ascii call? Because my #326 PR passes all the tests, but I certainly wouldn't want to cause a regression on that segfault issue!

Perhaps I could add a slightly less strict version of convert_keyword_to_allowed_ascii that also allows dots, for use on HIERARCH cards?

Thanks for your time on this, I truly appreciate all the work you put in on fitsio. I think it is far and away the best FITS library for python.

esheldon commented 3 years ago

Is there a way we can modify the code to do the lookup based on the "fixed" keyword name?

esheldon commented 3 years ago

And thank for your kind words, I'm glad its useful to others!

beckermr commented 3 years ago

That paragraph says "must only" and so I think is extremely clear that periods are not allowed.

We could parse periods only as 'p' and then round trip then properly or something.

beckermr commented 3 years ago

Well it's supposed to be a p wrapped in triple underscores but typing is hard today.

dstndstn commented 3 years ago

On the other hand, the paragraph starts "Under the ESO implementation of this convention" :) which makes it not a very strong statement, IMO.

The typical FITS stance would be to be permissive in parsing. I am happy to put in some work to try to make this work. My trivial approach in #326 works for my use case, but Erin said that conversion logic was there to avoid a segfault, so it would be very helpful to have an instance that triggers that segfault case. My feeling would be that in an ideal world, you would allow anything in a HIERARCH that doesn't cause problems elsewhere.

Erin also makes a good point that another approach would be to do the book-keeping so that the later header value lookup is done with the modified keyword key. I can try to implement that if you don't think being more permissive in the HIERARCH header keywords is acceptable. I would claim, however, that the current behavior of replacing any dots with "JUNK___" (by index) is quite surprising to users (it certainly was to me! Why did FPA.ZP turn into FPAKZP!?!). Always using underscores would I think be more understandable.

beckermr commented 3 years ago

Always using underscores would I think be more understandable.

Could create key collisions against other keywords. You'd need something unique like ___fitsio___ or w/e.

esheldon commented 3 years ago

I didn't write convert_keyword_to_allowed_ascii so I have no preferences regarding that code.

esheldon commented 3 years ago

Or at least I don't remember writing it :)

esheldon commented 3 years ago

OK I did write the ascii cleanup for headers: https://github.com/esheldon/fitsio/pull/302

(I'm happy to replace JUNK____... with anything people want.)

The motivation I think was to prevent a seg fault, and it must have worked for that case. But it is clear from the code logic that this will fail when the keyword has a numerical value because it is followed up by a call to fits_read_key and that uses the original keyword name, and would either seg fault anyway if we use the original keyword name, or will not find it if we use the converted name.

We could do a check for when it was converted and in that case return it as a string rather than try to convert to a number

esheldon commented 3 years ago

By the way, I was not able to make a test case for the seg fault because python refused to accept the characters embedded in the code; we could try to do this somehow externally...

dstndstn commented 3 years ago

Since it's a header-parsing issue, can you add an example failing (tiny) image to the repo?

On Tue, Sep 7, 2021 at 1:15 PM Erin Sheldon @.***> wrote:

By the way, I was not able to make a test case for the seg fault because python refused to accept the characters embedded in the code; we could try to do this somehow externally...

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/esheldon/fitsio/issues/325#issuecomment-914482230, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAIEH7MLRTPA2CVVJUZWB7DUAZCERANCNFSM5C2DE5DA . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

beckermr commented 3 years ago

A small bit of test data would be fine IMHO. We could also base64 encode the data or something like that.

esheldon commented 3 years ago

I'm happy to implement specific suggestions

beckermr commented 3 years ago

I can try some stuff if someone sends me an example file with the bad data or points me to one. I don't have a specific suggestion but am more than willing to spend some time on it. :)

esheldon commented 3 years ago

@dstndstn had an example here, maybe we could just pull out a header manually https://github.com/esheldon/fitsio/issues/295

dstndstn commented 3 years ago

I'm sorry, I didn't realize that I had reported (or seconded, at least) that segfault case :) Here is a tiny version of that file, http://broiler.astrometry.net/~dstn/temp/bad-tiny.fits (11 kB). I have not yet tried to rewind fitsio to confirm that before the fix for #295 went in that it still segfaults.

beckermr commented 3 years ago

Thank you! I will look at this.

dstndstn commented 3 years ago

PS, I confirm that this file segfaults with 1.1.2:

> python -c "import fitsio; print(fitsio.__version__); fitsio.read_header('/tmp/bad-tiny.fits')"
1.1.2
Segmentation fault: 11
beckermr commented 3 years ago

We have an 8.4k image in the repo now, so maybe this 11k one is not so bad, assuming we can redistribute it etc.

beckermr commented 3 years ago

This file segfaults on master too BTW.

dstndstn commented 3 years ago

Here's an even tinier one: https://portal.nersc.gov/project/cosmo/temp/dstn/bad-teeny.fits I thought that #302 was intended to fix that segfault. (I confirm that it still segfaults master.)

beckermr commented 3 years ago

I mean the bad-tiny.fits is still segfaulting despite #302.

beckermr commented 3 years ago

@dstndstn can you try the code on this branch?

https://github.com/esheldon/fitsio/pull/330

esheldon commented 3 years ago

these fixes are released in 1.1.5