Closed weaverba137 closed 1 year ago
These were intended to be T (boolean), not 'True ' (string). Please point to example files that use 'True ' so that we can track down where that comes from.
Here are a few examples:
fuji/healpix/special/dark/326/32637/coadd-special-dark-32637.fits
fuji/healpix/special/dark/326/32637/spectra-special-dark-32637.fits
guadalupe/healpix/main/bright/263/26371/coadd-main-bright-26371.fits
guadalupe/healpix/main/bright/263/26371/spectra-main-bright-26371.fits
these are just a few that were flagged by validating the data model. Since these files were chosen at random, it must be the case that most files in SPECPROD/healpix
have this issue, such as redrock files.
The zcatalog files, zall-pix-SPECPROD
and zpix-SURVEY-PROGRAM
also have this issue.
I'm not sure I'm the correct person to generally diagnose this issue, although I'll certainly loop through the desitarget
files specifically to check that the issue isn't being passed downstream from the target files.
Two immediate thoughts:
from a quick look, I d say that those are not propagated from desitarget, and it looks like it s a matter of converting to string in the process.
this HPXNEST keyword is set here: https://github.com/desihub/desispec/blob/6f8caa05e0a39a768a93f8267f0b77d467080988/py/desispec/scripts/healpix_redshifts.py#L81-L87
and here in the desispec.scripts.tile_redshifts.write_redshift_script() function, where a headeropt
string is built:
https://github.com/desihub/desispec/blob/4885ad27a24a6cb08d1d3c6f3ec463302770fe89/py/desispec/scripts/tile_redshifts.py#L337-L339
that headeropt
string is then is fed to desi_group_spectra
, which interprets values here:
https://github.com/desihub/desispec/blob/4885ad27a24a6cb08d1d3c6f3ec463302770fe89/py/desispec/scripts/group_spectra.py#L155-L165
so, maybe add another try/except
case for booleans?
Thanks, Anand. I agree that this looks like the culprit:
if extra_header is not None:
for key, value in extra_header.items():
headeropt += f' {key}={value}'
my understanding is that you ve to switch to string, as this is then fed to build a command line.
but one could catch it back in group_spectra.py
with an additional:
try:
spectra.meta[key] = bool(value)
I would not recommend that code, since almost everything that is not explicitly False
, None
, ''
, []
, etc. is considered True
. In other words, bool(value)
will return True
instead of throwing an exception for almost any value.
I see, thanks for pointing that (it s quite counter-intuitive!).
Maybe something like value.strip() == 'True'
? you would know best..
That has how Python has always worked, so counter-intuitive or not, this is extremely important to be aware of.
I think the idea that args.header
is just a string is bad. You need to go back to the actual FITS header keywords as derived from astropy
or fitsio
, then examine their type and set the type of the keywords appropriately. Or just copy the keywords from astropy
or fitsio
rather than going through an intermediate stage of converting them to a string.
@sbailey, this issue may be already on the way to being fixed as part of DR1 patching.
I think the idea that args.header is just a string is bad. You need to go back to the actual FITS header keywords as derived from astropy or fitsio, then examine their type and set the type of the keywords appropriately. Or just copy the keywords from astropy or fitsio rather than going through an intermediate stage of converting them to a string.
For context, these are being parsed as strings because they are new keywords coming in via a command line option desi_group_spectra --header KEY1=VAL1 [KEY2=VAL2] ...
so there is no prior information about what the type is. They are not being propagated from an input file.
I'll submit a PR soon to fix this.
Ah, OK, thanx.
Fixed by PR #2110 .
In files grouped by healpixel,
HPXNEST
is supposed to denote that end-users should use the nested healpixel scheme. We have recently discovered that some FITS files encode the value as'True '
(some number of spaces is standard), as opposed toT
which is a proper FITS boolean. Which type is this keyword meant to have, and assuming it is meant to be a true boolean, where in the pipeline does it get converted to a string?There is some danger that if end-users have code that interpret the
HPXNEST
keyword as a pure boolean, then a value of'False '
would be interpreted asTrue
, because strings with non-zero length are interpreted asTrue
in Python.