Closed: weaverba137 closed this pull request 6 months ago.
Thanks for the update and test reorganization. Testing it with

```
desi_assemble_fibermap -n 20240115 -e 219391 -o $SCRATCH/fibermap.fits
```

crashes with
```
Traceback (most recent call last):
  File "/global/common/software/desi/users/sjbailey/desispec/bin/desi_assemble_fibermap", line 16, in <module>
    sys.exit(assemble_fibermap.main(args))
  File "/global/common/software/desi/users/sjbailey/desispec/py/desispec/scripts/assemble_fibermap.py", line 53, in main
    fibermap = assemble_fibermap(args.night, args.expid, badamps=args.badamps,
  File "/global/common/software/desi/users/sjbailey/desispec/py/desispec/io/fibermap.py", line 1183, in assemble_fibermap
    fibermap_hdu = annotate_fibermap(fibermap_hdu, survey=survey)
  File "/global/common/software/desi/users/sjbailey/desispec/py/desispec/io/fibermap.py", line 1223, in annotate_fibermap
    fibermap_columns = fibermap_columns[survey]
UnboundLocalError: local variable 'fibermap_columns' referenced before assignment
```
In this case, `fibermap_columns` is a module-level global variable, but the function hasn't declared `global fibermap_columns`. However, if it did declare that global, I think the statement `fibermap_columns = fibermap_columns[survey]` would alter the structure of that global variable, i.e. changing the state of the module and thus breaking it for subsequent calls. I think this should be something like:
```python
global fibermap_columns
try:
    fmcols = fibermap_columns[survey]
except KeyError:
    fmcols = _set_fibermap_columns()[survey]
# ... only use fmcols in the function, not the global fibermap_columns
```
Alternatively, consider adding a function `get_fibermap_columns(survey)` that would take care of the `_set_fibermap_columns()[survey]` logic and return a copy of `fibermap_columns[survey]`. That copy could be safely modified (like adding `extra_columns`) without impacting the original global variable.
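A self-contained sketch of that suggestion (the rows and `_set_fibermap_columns()` here are toy stand-ins, not the real desispec definitions):

```python
import copy

# Toy stand-in for the module-level global described above.
fibermap_columns = None  # lazily initialized, as in the failing code path

def _set_fibermap_columns():
    """Stand-in for the real initializer of the module-level global."""
    global fibermap_columns
    fibermap_columns = {
        'main': [('TARGETID', 'i8', None, 'Unique DESI target ID')],
    }
    return fibermap_columns

def get_fibermap_columns(survey):
    """Return a *copy* of the column definitions for `survey`.

    Initializes the global if needed; because the result is a copy,
    callers can append extra columns without mutating module state.
    """
    if fibermap_columns is None:
        _set_fibermap_columns()
    return copy.deepcopy(fibermap_columns[survey])

cols = get_fibermap_columns('main')
cols.append(('EXTRA', 'f4', None, 'caller-specific column'))
assert len(fibermap_columns['main']) == 1  # the global remains untouched
```

The key design point is that the copy isolates callers from the module state, so per-call additions never leak into subsequent calls.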
I see that `desispec/test/test_io_fibermap.py` -> `test_annotate_fibermap` (which would have caught this) exists but is not yet implemented. Related: although the full process of assembling a fibermap requires a non-trivial amount of input data, this is implemented as a NERSC-only test in `desispec/test/test_fibermap.py`, and that test does fail at NERSC for the above reason.
I'll comment about the questions posted in the PR when I can run it and poke around a bit more at how it works in practice.
I've taken a slightly different path, passing in all the columns as a required parameter to the `annotate_fibermap` function rather than relying on globals. You may have better luck testing now. I'll also try testing at NERSC.
While testing I found an anomaly for these columns:

```
ERROR:fibermap.py:1245:annotate_fibermap: Unexpected data type, i4 != i8, for column BRICKID!
DEBUG:fibermap.py:1268:annotate_fibermap: Column BRICKID is not supposed to have units, skipping.
INFO:fibermap.py:1276:annotate_fibermap: Setting comment for column BRICKID.
ERROR:fibermap.py:1245:annotate_fibermap: Unexpected data type, i4 != i8, for column BRICK_OBJID!
```
I think this just means that the top-level `fibermap_columns` list is out of date. Is there any reason not to just update `fibermap_columns`? Also note that desidatamodel has `i4` for `BRICKID` and `BRICK_OBJID`.
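For context on what the mismatch means: `i4` and `i8` are shorthand for 4-byte and 8-byte signed integers, so narrowing a column loses range. A quick stdlib-only illustration:

```python
# 'i4' and 'i8' denote 4-byte and 8-byte signed integers.
# The maximum representable values show what is at stake if a
# column were silently narrowed from i8 to i4:
i4_max = 2**31 - 1   # 2147483647
i8_max = 2**63 - 1   # 9223372036854775807
```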
Yes, please update `fibermap_columns` (and the datamodel if needed) to match the output files of iron in `/global/cfs/cdirs/desi/spectro/redux/iron`:
If any of those files have the same column with different dtypes, that is unexpected and likely a dataflow error and not purposeful. If you find any cases of that, please open a new ticket with examples.
OK, sounds good.
For the downstream files, are there intermediate files that should be examined first, or are those assembled directly from the preproc/night/expid/fibermap file?
To keep them all together, I added the intermediate files to my previous post. The flow is fibermap -> preproc -> frame -> sframe+cframe -> spectra -> coadd.
From scanning those files, the following columns are missing from `fibermap_columns`. Included in the list is the first file in which they appear, for cross-referencing with the datamodel. Please add these to `fibermap_columns` so that it is a superset of any columns that might exist.
```
PSF_TO_FIBER_SPECFLUX       cframe-r0-00089031.fits.gz          [FIBERMAP]
NIGHT                       spectra-0-1000-thru20210517.fits.gz [FIBERMAP]
EXPID                       spectra-0-1000-thru20210517.fits.gz [FIBERMAP]
MJD                         spectra-0-1000-thru20210517.fits.gz [FIBERMAP]
TILEID                      spectra-0-1000-thru20210517.fits.gz [FIBERMAP]
COADD_FIBERSTATUS           coadd-0-1000-thru20210517.fits      [FIBERMAP]
COADD_NUMEXP                coadd-0-1000-thru20210517.fits      [FIBERMAP]
COADD_EXPTIME               coadd-0-1000-thru20210517.fits      [FIBERMAP]
COADD_NUMNIGHT              coadd-0-1000-thru20210517.fits      [FIBERMAP]
COADD_NUMTILE               coadd-0-1000-thru20210517.fits      [FIBERMAP]
MEAN_DELTA_X                coadd-0-1000-thru20210517.fits      [FIBERMAP]
RMS_DELTA_X                 coadd-0-1000-thru20210517.fits      [FIBERMAP]
MEAN_DELTA_Y                coadd-0-1000-thru20210517.fits      [FIBERMAP]
RMS_DELTA_Y                 coadd-0-1000-thru20210517.fits      [FIBERMAP]
MEAN_FIBER_RA               coadd-0-1000-thru20210517.fits      [FIBERMAP]
STD_FIBER_RA                coadd-0-1000-thru20210517.fits      [FIBERMAP]
MEAN_FIBER_DEC              coadd-0-1000-thru20210517.fits      [FIBERMAP]
STD_FIBER_DEC               coadd-0-1000-thru20210517.fits      [FIBERMAP]
MEAN_PSF_TO_FIBER_SPECFLUX  coadd-0-1000-thru20210517.fits      [FIBERMAP]
MEAN_FIBER_X                coadd-0-1000-thru20210517.fits      [FIBERMAP]
MEAN_FIBER_Y                coadd-0-1000-thru20210517.fits      [FIBERMAP]
```
Seeing that `fibermap_columns` is used by `desispec.io.fibermap.empty_fibermap()` to create a default empty fibermap, it looks like we need to store somewhere the list of columns in the original fibermap. Due to the coadd split of FIBERMAP from EXP_FIBERMAP, there is no individual fibermap that contains the full set of all columns that exist in any fibermap.
Thank you, @sbailey, this will make it easy to track. Indeed, as you note, I do not want to make a superset of `fibermap_columns`; rather, each stage should keep track of the columns it adds to `fibermap_columns`, and then each stage's full set of columns is passed to `annotate_fibermap`.
Challenges

- … `fibermap_columns` to pass into `annotate_fibermap`.
- Even if we propagate the units and comments from earlier stages, it appears that the `annotate_fibermap` input `columns` needs to cover all columns, not just the new ones added by this particular stage.

I think this would be simpler and less coupled if `fibermap_columns` is a superset of any fibermap column that exists anywhere; `empty_fibermap` knows which subset of columns it should use; and all uses of `annotate_fibermap` just provide whatever fibermap they have: it annotates the table and only complains about inconsistencies or columns that it doesn't recognize.

Maybe we should have a Zoom chat about this?
> a given stage might know what columns it added, but it doesn't easily know details about other columns that were added by other stages prior to it

That's a fair point, but I actually want to go through and see how this works in practice.

> The BinTableHDU form of the fibermap …

Also fair, but I need to see for myself how this works.

I think the situation you propose actually makes things more coupled, not less. Say I want to make a downstream change to a fibermap HDU in a coadd file; then instead of adding the columns to the code that writes the coadd file, I have to go all the way back to the original fibermap code and make an additional change there. Still, `annotate_fibermap` could potentially catch inconsistencies like that and stop until the inconsistency is resolved.
Some of this is also relevant to the questions I asked about the logic in the PR description. Have you taken a detailed look at that? I'd be willing to wait until you've reviewed the full logic and questions before chatting on Zoom.
Notes from discussion on Zoom:

- … (`fibermap_columns`) rather than input the list of columns each time.
- … `survey='special'` is requested.
- `fibermap_columns` could be augmented with an additional column that flags what types of files contain this column; `empty_fibermap()` could then use that information to create an appropriate table. The exact nature of this flag is TBD.
- … `EXP_FIBERMAP`, etc.
- … `desi_preproc` do drop units, not just comments, so the primary focus should be on getting units back when a fibermap HDU is copied.
- … `annotate_fibermap` already, but not uniformly over different filetypes.

@sbailey, please add additional notes if I forgot anything.
@sbailey, when you get a chance, I've got some further questions:

- While looking into the interior of `desi_assemble_fibermap`, I noticed that it does not use `write_fibermap` at all. Is that function orphaned, or is it used elsewhere?
- There is nothing intrinsically wrong with using `fitsio.read()` to read the input fibermap file. I verified that all of the header data is available to `read_fibermap()`, including units and comments. However, there is a stage in `read_fibermap()` where the header keywords are transferred to `fibermap.meta`, via `desispec.io.util.addkeys()`, and `addkeys()` deliberately drops all table-related keywords, including `TUNITxxx`. Why?
- In pipeline stages after `desi_preproc`, are fibermap HDUs passed along via the intermediate files, or is `read_fibermap()` used every time to get the fibermap data for that stage?
- Using `Table.read()` instead? That preserves the units and automatically adds the keywords to `fibermap.meta`. However, comments are lost.
- As a side note, I discovered that `TNULLxxx` keywords are added by `Table.write()`; that came up in the context of GFA processing, so I thought I'd note that.

> While looking into the interior of desi_assemble_fibermap, I noticed that it does not use write_fibermap at all. Is that function orphaned, or is it used elsewhere?
From `grep write_fibermap $(find py/desispec -name \*.py)` it appears that it is only used by tests to create temporary test datasets.
> There is nothing intrinsically wrong with using fitsio.read() to read the input fibermap file. I verified that all of the header data is available to read_fibermap(), including units and comments. However, there is a stage in read_fibermap() where the header keywords are transferred to fibermap.meta, via desispec.io.util.addkeys(), and addkeys() deliberately drops all table-related keywords, including TUNITxxx. Why?
The spirit of `addkeys` was to create a `fibermap.meta` with the same keys as you would have gotten if you had used `Table.read(filename)` instead. That reader also drops "structural" keywords like TTYPE and TFORM, as well as TUNIT.
However, comparing a fibermap read with `Table.read` and `desispec.io.read_fibermap`, the `Table.read` version includes 'CHECKSUM', 'DATASUM', and 'comments', though 'comments' comes from the COMMENT cards ("FITS (Flexible Image Transport System) format is defined in...") and not the comments of the table columns.
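For illustration, the kind of structural-keyword filtering described here could look roughly like the sketch below (this is not the actual `desispec.io.util.addkeys()` implementation; the keyword set and helper name are mine):

```python
import re

# FITS binary-table "structural" keywords (TTYPEn, TFORMn, TUNITn, ...)
# describe the table layout and should not be copied into table.meta.
STRUCTURAL = re.compile(
    r'^T(TYPE|FORM|UNIT|NULL|DIM|DISP)\d+$'
    r'|^(XTENSION|BITPIX|NAXIS\d*|PCOUNT|GCOUNT|TFIELDS)$')

def filter_structural(header):
    """Return only the keywords that belong in a table's .meta."""
    return {key: value for key, value in header.items()
            if not STRUCTURAL.match(key)}

hdr = {'EXTNAME': 'FIBERMAP', 'TTYPE1': 'TARGETID',
       'TUNIT2': 'deg', 'NIGHT': 20210517}
meta = filter_structural(hdr)
# TTYPE1/TUNIT2 are dropped; science keywords survive.
```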
> In pipeline stages after desi_preproc, are fibermap HDUs passed along via the intermediate files, or is read_fibermap() used every time to get the fibermap data for that stage?
preproc is the only stage that reads the original preproc/NIGHT/EXPID/fibermap-EXPID.fits files. Other stages read the FIBERMAP HDU in intermediate files. In some cases that fibermap is merged/split/adapted before writing back out (e.g. N cframe files -> 1 spectra file; 1 spectra FIBERMAP -> coadd FIBERMAP + EXP_FIBERMAP). That is, even if all of the readers and writers perfectly propagated units and comments internally, we'd still have to chase down all of the merge/coadd/etc. cases, which makes me more interested in the "call augment_fibermap before every write to reset units and comments to what they should be" approach, rather than the "propagate units and comments" approach.
Note: `read_fibermap(fp)` means "read the FIBERMAP HDU from fp", and it is also used for reading the FIBERMAP HDU from the intermediate files; i.e. it doesn't mean "read from the one and only original fibermap-EXPID.fits file".
> Using Table.read() instead? That preserves the units and automatically adds the keywords to fibermap.meta. However, comments are lost.
The original pipeline default was to only use astropy readers, and any case of using fitsio was for a specific algorithmic and/or performance reason. In this case, it is probably a combination of working around astropy 5.2.x treatment of NaNs and blank strings, as well as performance when reading gzipped files.
> As a side note, I discovered that TNULLxxx keywords are added by Table.write(); that came up in the context of GFA processing, so I thought I'd note that.
Noted. Also see https://github.com/astropy/astropy/issues/4708 about it sometimes picking an incorrect TNULL value, breaking roundtrips when reading it back in...
Finally, on Monday we discussed having a "minimal viable product" by Friday. I created this PR on the basis that "minimal viable product" == "get units into fibermap files", which I have done, but we never did actually define what we meant by "minimal viable product". If "get units into fibermap files" is not the "minimal viable product", then we need to rethink the schedule of this project.
This PR matches what I originally intended, i.e. updating `desi_assemble_fibermap` to output units and comments even if downstream steps don't do that yet. However, I think the implementation of `annotate_fibermap` here is too restrictive for how we would need to use it downstream, and thus if we merge this as-is we risk having to immediately make non-backwards-compatible API/structural changes in the next PR. So I think it's worth sorting out these design issues now, before merging, so that the next PRs can use the `annotate_fibermap` infrastructure without having to change it. OK if that leaks past Friday.
I'm away at a planning retreat on Friday with very limited internet and attention, so I likely won't be able to reply more until next week. In the meantime, I still advocate for:

- `fibermap_columns` has a superset of all columns that appear in any FIBERMAP or EXP_FIBERMAP.
- `augment_fibermap` adds units and comments to whatever columns it receives, without putting other restrictions on what is "supposed" to be there. OK to log an ERROR if it gets a column it doesn't know about.
- `empty_fibermap` / `assemble_fibermap` would use a different mechanism to know what columns should be in the original fibermap file. That could be an additional entry in `fibermap_columns`, or it could just be a list of column names to take the subset of `fibermap_columns`.

We can chat more on Monday.
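The structure advocated above might be sketched like this (column entries, dtypes, comments, and helper names are illustrative stand-ins, not the real desispec definitions):

```python
# One superset of column metadata covering every column that appears in
# any FIBERMAP or EXP_FIBERMAP, keyed by column name.
fibermap_columns = {
    'TARGETID':     dict(dtype='i8', unit=None,  comment='Unique DESI target ID'),
    'FIBER_RA':     dict(dtype='f8', unit='deg', comment='Fiber pointing RA'),
    'COADD_NUMEXP': dict(dtype='i2', unit=None,  comment='Number of exposures in coadd'),
}

# A plain list of names selects the subset that belongs in the original
# fibermap file, for use by empty_fibermap / assemble_fibermap.
original_fibermap_names = ['TARGETID', 'FIBER_RA']

def columns_for_empty_fibermap():
    """Subset of the superset that empty_fibermap would create."""
    return {name: fibermap_columns[name] for name in original_fibermap_names}
```

With this split, annotation can serve any downstream fibermap (it only needs the superset), while creation of the base fibermap stays pinned to the explicit name list.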
@sbailey, thank you, agreed with the above. I think it will be instructive to push this through the preproc-file stage, and I think I can get that done today.
As of the latest commit, `desi_preproc` adds units and column descriptions back to the fibermap table, although as we already knew, other header keywords that had comments in the fibermap file no longer have comments in the preproc file. I notice, though, that many of the same keywords are repeated in other HDUs, with comments. Would a possible solution be to copy the comments from one of the other HDUs? That doesn't necessarily need to be addressed in this PR though.
Another side question: `desispec.io.image.read_image` doesn't read or pass the fibermap HDU to the `Image()` object. Is that intentional?
I think I've reached the end of what I can do today. Additional questions for Monday:

- … `io.spectra.write_spectra`. However, for frame files it is less obvious. Should I just use the filename being written out?

Here is some data on `survey = 'special'` exposures. The short version is that `desi_assemble_fibermap` did run successfully, which is good, and the files were annotated. I ran two exposures:

- `desi_assemble_fibermap -n 20210512 -e 88150 -o $SCRATCH/fibermap-00088150.fits` (the very first 'special')
- `desi_assemble_fibermap -n 20240211 -e 225032 -o $SCRATCH/fibermap-00225032.fits` (the most recent 'special')

It turns out that the value of `survey` in `assemble_fibermap` can't ever be 'special', because that isn't an option; see https://github.com/desihub/desispec/blob/05965ead791589e9d9fa29351d4cdd3a4951d1f7/py/desispec/io/fibermap.py#L682.
Nevertheless, the files generated above have `SURVEY = 'special'` in the header keywords. If we want to stick with what already works, we should probably include documentation about why the value of `survey` in `assemble_fibermap` doesn't always match the value of `SURVEY` in the header keyword.
Several unit tests use `empty_fibermap` to create dummy fibermap objects, for example in test_spectra.py. One solution would be to have `empty_fibermap` set `fibermap.meta['SURVEY'] = survey`, since `survey` is an (optional) input to `empty_fibermap`.
Weird. Somehow even when I set `fmap.meta['SURVEY'] = 'main'`, that disappears before or during the `write_spectra` stage. I think it might be dropped by `encode_table`.
@sbailey, coming back to the issue of enforcing or not enforcing a specific data model on fibermap HDUs:

- Do we want `empty_fibermap` to only ever create data tables that match what would be produced by `assemble_fibermap`, or do we want `empty_fibermap` to be able to create empty versions of all possible fibermap HDUs?
- In `annotate_fibermap`, we want to loop over the columns in the HDU being annotated, not over the columns in `desispec.io.fibermap.fibermap_columns`. However, it is still an error if the fibermap HDU contains a column that `fibermap_columns` does not know about, correct?

> Do we want empty_fibermap to only ever create data tables that match what would be produced by assemble_fibermap, or do we want empty_fibermap to be able to create empty versions of all possible fibermap HDUs?
For this PR, I think it is fine for empty_fibermap to only produce the original base fibermap, like it currently does. Some future new option could extend that to other downstream fibermaps, but that feels like feature creep for now.
> In annotate_fibermap, we want to loop over the columns in the HDU being annotated, not over the columns in desispec.io.fibermap.fibermap_columns.
Agreed
> However, it is still an error if the fibermap HDU contains a column that fibermap_columns does not know about, correct?
Let's log an ERROR, but not raise an exception (i.e. it's not supposed to happen, but it also isn't fatal). I think the worst case in that situation is that the unexpected column would not have units and comments, but that's no worse than our current situation where none of the columns have units and comments.
If you want, we could add a `strict` option (default False), which if True would raise an exception instead of just logging an ERROR. That could be useful for validation tests. This could also be added later if needed.
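A minimal sketch of that ERROR-vs-`strict` behavior (the function name and the `KNOWN` table are hypothetical stand-ins, not the desispec API):

```python
import logging

log = logging.getLogger('annotate_fibermap_sketch')

# Stand-in for the known column metadata (name -> (comment, unit)).
KNOWN = {'TARGETID': ('Unique DESI target ID', None),
         'FIBER_RA': ('Fiber pointing RA', 'deg')}

def check_columns(colnames, strict=False):
    """Return the list of unrecognized columns.

    Unknown columns log an ERROR by default (non-fatal: they just
    won't receive units/comments), or raise if strict=True.
    """
    unknown = [name for name in colnames if name not in KNOWN]
    for name in unknown:
        msg = f"Unexpected fibermap column {name}; it will not receive units or comments"
        if strict:
            raise KeyError(msg)
        log.error(msg)
    return unknown
```

Validation tests could call `check_columns(..., strict=True)` to turn the soft warning into a hard failure.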
Previously I had said that it would be ok to start requiring a unique value of SURVEY for every spectra object, but upon further testing I think that is too restrictive.
Two hiccups related to `spectra.fibermap.meta['SURVEY']` now being a required keyword for `write_spectra`, due to `annotate_fibermap` needing it:
This breaks the ability to read+filter+write fuji/edr and iron/dr1 spectra that don't have that keyword, unless the user adds the extra step of setting that keyword. e.g.
```python
from desispec.io import read_spectra, write_spectra
sp = read_spectra('/global/cfs/cdirs/desi/spectro/redux/iron/tiles/cumulative/1000/20210517/coadd-0-1000-thru20210517.fits')
keep = (sp.fibermap['DESI_TARGET'] & 4) != 0
write_spectra('qsotargets.fits', sp[keep])
# --> KeyError: "Keyword 'SURVEY' not found."
```
Workaround: user has to set sp.fibermap.meta['SURVEY'] after reading it, or we auto-derive it for them (better).
Although the pipeline only writes spectra files that have a single SURVEY, users may reasonably want to combine them, e.g. when compiling a QSO sample across sv1/sv2/sv3/main. In that case, `desispec.spectra.stack` creates a fibermap that is a superset of the columns in the input fibermaps, and the resulting fibermap doesn't correspond to any of sv1, sv2, sv3, main. E.g.
```python
import fitsio
from desispec.io import read_spectra, write_spectra
from desispec.spectra import stack
sp1 = read_spectra('/global/cfs/cdirs/desi/spectro/redux/iron/tiles/cumulative/80960/20210424/coadd-0-80960-thru20210424.fits')
sp2 = read_spectra('/global/cfs/cdirs/desi/spectro/redux/iron/tiles/cumulative/1000/20210517/coadd-0-1000-thru20210517.fits')
spx = stack([sp1[0:10], sp2[0:10]])
write_spectra('testspectra.fits', spx)
# --> KeyError: "Keyword 'SURVEY' not found."
```
In this case, not only is SURVEY not found, but SURVEY isn't even a well-defined unique concept because we have combined spectra from two different SURVEYs.
Problem (1) could be addressed by auto-deriving what the survey is based upon column names, but (2) is trickier since it is fundamentally at odds with the current `fibermap_columns` structure split by SURVEY. We may need to revert `fibermap_columns` to not be split by SURVEY (like prior to PR #1459) and have it be a superset of any fibermap column across any survey, and then use a separate mechanism for assemble_fibermap / empty_fibermap to know which columns to create for a given survey.
For the record, compared to the SURVEY='main' uncoadded spectra fibermap, …
OK, I've thought about this a little bit, and there's a lot here to break down.
> This breaks the ability to read+filter+write fuji/edr and iron/dr1 spectra that don't have that keyword, unless the user adds the extra step of setting that keyword.
That's unfortunate, but I think that the output of `assemble_fibermap` should always have `SURVEY` set. At least, that's what I've seen in my tests so far. And that's not because `assemble_fibermap` necessarily sets the survey internally correctly, but because it copies (lots of!) header keywords from upstream files. We might want to track down where and why fibermap HDU headers start to lose information, but that's a separate ticket/PR.
I'm OK with this, as long as there is one, and only one, function that does this, and it is used by everyone.
As I noted previously, `assemble_fibermap` has an internal survey value set, which can be, and demonstrably is, different from the value in the header keyword.

The value of survey set in `empty_fibermap` is not propagated to `assemble_fibermap`. In fact, `assemble_fibermap` does not use `empty_fibermap` at all.
Finally, we need to explicitly account for 'special'.
Yes, I think that is arguing for a single table that has all possible columns. However, before we go ahead, I think we ought to have a review of how fibermap HDUs are propagated overall, not just how their units or comments are propagated. In other words, let's really fix the problem instead of finding additional ones.
As a side note, I think we can eliminate the separate `fibermap_comments`, because that was only used by `write_spectra`, and the routine that used it has been replaced by `annotate_fibermap`.
I think this may be another Zoom meeting to review everything.
Main survey fibermap columns that don't yet exist in `fibermap_columns['main']`:

e.g. in `iron/tiles/cumulative/1000/20210517/coadd-0-1000-thru20210517.fits`. All other main survey columns in various flavors of FIBERMAP HDUs already exist in `fibermap_columns['main']`.
Columns that exist in non-main fibermaps that need to be added to the pan-survey version of fibermap_columns:
OK, I think I saw MEAN_FIBER_X and MEAN_FIBER_Y but forgot to add them previously. The rest are straightforward. Thanx!
Tests are passing again. In the morning I'll focus on the NERSC-only tests to compare the outputs of `assemble_fibermap` and `empty_fibermap`. I think that can proceed in parallel with testing the full propagation in the pipeline.
@sbailey, I've had successful tests at NERSC, both with test_io_fibermap.py and analyze_fibermap.py. I think the only thing remaining is a full pipeline test to check the propagation.
@weaverba137 please check on the following WARNING messages:

```python
from desispec.io import read_spectra, write_spectra
sp = read_spectra('/dvs_ro/cfs/cdirs/desi/spectro/redux/iron/healpix/main/dark/100/10000/spectra-main-dark-10000.fits.gz')
write_spectra('blat.fits', sp)
# --> WARNING: Meta-data keyword TNULL1 will be ignored since it conflicts with a FITS reserved keyword [astropy.io.fits.convenience]

sp = read_spectra('/dvs_ro/cfs/cdirs/desi/spectro/redux/iron/healpix/main/dark/100/10000/coadd-main-dark-10000.fits')
write_spectra('foo.fits', sp)
# --> WARNING: VerifyWarning: Card is too long, comment will be truncated. [astropy.io.fits.card]
# --> WARNING: VerifyWarning: Card is too long, comment will be truncated. [astropy.io.fits.card]
```
Please also update `write_fibermap` to use `annotate_fibermap`, while maintaining support for clobber=True (new file) vs. False (append to existing file) and merging in other `header` keywords that aren't in `fibermap.meta`. I'm not sure how complete the write_fibermap unit tests are for covering all of these options, so it could be useful to review and update those first.
Note: for new writer function code we are using "overwrite" instead of "clobber", but for backwards compatibility we're not renaming that in old code like write_fibermap.
Other basic checks seem ok, so I'm proceeding with more complete end-to-end pipeline test.
Just to be clear here: the warnings you are seeing are not from normal pipeline operations, but just from randomly reading a spectra file and then trying to write it back out again?
For this case:

```python
from desispec.io import read_spectra, write_spectra
sp = read_spectra('/dvs_ro/cfs/cdirs/desi/spectro/redux/iron/healpix/main/dark/100/10000/spectra-main-dark-10000.fits.gz')
write_spectra('blat.fits', sp)
# --> WARNING: Meta-data keyword TNULL1 will be ignored since it conflicts with a FITS reserved keyword [astropy.io.fits.convenience]
```

That has nothing to do with anything fibermap-related. The `TNULL1` header keyword is set in the `SCORES` table in the original file. I propose that be split into a separate issue.
The header card length warnings are caused by MEAN_PSF_TO_FIBER_SPECFLUX and PSF_TO_FIBER_SPECFLUX, which are by far the longest column names. They are so long that they are eating into the space normally reserved for comments, so the normal comment length test doesn't apply.
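For reference, the 80-character FITS card limit is the budget being blown here. A rough sketch of the arithmetic (ignoring FITS minimum-width padding rules):

```python
# FITS header cards are fixed at 80 characters.  For a TTYPEn card whose
# string value is the long column name, the space left for a comment is:
CARD_LEN = 80
keyword_field = 10                         # "TTYPEnn = " occupies 10 characters
value = "'MEAN_PSF_TO_FIBER_SPECFLUX'"     # 26-character name plus two quotes
separator = 3                              # " / " before the comment
room_for_comment = CARD_LEN - keyword_field - len(value) - separator
```

Any comment longer than that remainder gets truncated by astropy, with the VerifyWarning seen above.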
> Just to be clear here: the warnings you are seeing are not from normal pipeline operations, but just from randomly reading a spectra file and then trying to write it back out again?
I don't yet know if they appear in full pipeline operations but I suspect they will; I was starting with more isolated checks for easier testing. These warnings do not appear in current main, so they are triggered by something new in this PR.
The warnings about card lengths should now be fixed.
Definitely `SCORES`, not `FIBERMAP`:

```python
>>> from desispec.io import read_spectra, write_spectra
>>> sp = read_spectra('/dvs_ro/cfs/cdirs/desi/spectro/redux/iron/healpix/main/dark/100/10000/spectra-main-dark-10000.fits.gz')
>>> sp.scores = None
>>> write_spectra('blat.fits', sp)
INFO:fibermap.py:1306:annotate_fibermap: Setting comment for column TARGETID.
...
INFO:fibermap.py:1306:annotate_fibermap: Setting comment for column TILEID.
INFO:spectra.py:177:write_spectra: iotime 1.581 sec to write blat.fits at 2024-02-16T11:24:50.861858
'/global/common/software/desi/users/bweaver/perlmutter/desiconda/current/git/desihub/desispec/blat.fits'
>>>
```
No warnings emitted.
I think the changes requested to `write_fibermap` will require eliminating the call to `write_bintable`. Is that acceptable?
An end-to-end test of tile5578 on night 20240115 passed, including fibermap units and comments for preproc, frame, sframe, cframe, spectra, and coadd files. The Redrock output does not yet have those units/comments, but that will require a redrock PR. Summarizing tickets addressed here:
For the record, last-minute changes I added:

- `write_bintable(filename, data, ...)` so that BinTableHDU `data` can also get an augmented header, comments, and units like other types of `data` input. This flexibility isn't needed in the current PR, but could be handy in the future.
- … `desiutil.annotate`.

@weaverba137 I'm happy with the state of this PR and ready to merge. If you are ok with my latest changes, let's proceed.
A few more tests need to finish, then I'll press the button.
[WIP] Add units to fibermap tables
This PR

Primary functionality

This PR makes a minimally-invasive change to `assemble_fibermap` by only adding units and comments after a `BinTableHDU` has already been created. The same function (`annotate_fibermap`) should facilitate downstream propagation too. The basic logic is illustrated below in outline form:

- … `annotate_fibermap`; we call this just `columns` below.
- … `desispec.io.fibermap.fibermap_columns` or a strict superset.
- … `BinTableHDU` match `columns`?
- … `BinTableHDU` match `columns`?
- … `columns`?
- … `columns`.
- … `columns`.
- … `columns`?
- … `columns`.
- … `columns`.

Questions

- … `assemble_fibermap` coerces a column order anyway, so it is valid to check the ordering of columns in a fibermap HDU.
- A `checkonly` mode might be useful, but that may be feature creep. Is this desirable?
- The `annotate_fibermap` function operates in a different mode from `desiutil.annotate.annotate_fits`. The latter is intended for use after a file has been written out, possibly months or years later, rather than as part of a production pipeline.

Other notes

- … `desiutil.annotate` and `desiutil.names` can be used.
- … `test_io_fibermap.py`.