Closed: weaverba137 closed this issue 2 months ago.
thanks for initiating that. some complementary information:

in short: these five exposures are poor-quality data (from sv1 or mainbackup), where the gfa pipeline returns NaN's for some reason. I've not investigated why; not sure it's worth it. can't we leave NaN's? if not, we could patch with values like -99, which clearly is a non-physical value for those quantities?
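As an illustration, a minimal sketch of such a sentinel patch (the column name and values here are hypothetical, not taken from the actual exposure tables):

```python
import numpy as np

# hypothetical GFA-derived column with NaN entries
transparency = np.array([0.88, np.nan, 0.95, np.nan])

# replace NaN with a clearly non-physical sentinel such as -99
patched = np.where(np.isfinite(transparency), transparency, -99.0)
print(patched.tolist())  # [0.88, -99.0, 0.95, -99.0]
```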
in detail:
about the expids: these five exposures are poor-quality data; from jura:
```
 EXPID   NIGHT   FAFLAVOR       EXPTIME            EFFTIME_SPEC
------ -------- ---------- ------------------ ---------------------
 74307 20210202 sv1elgqso     566.04931640625    1.3787630796432495
 79769 20210308 sv1ssv     300.06988525390625 5.455439168144949e-05
 83420 20210404 sv1elg       80.4530029296875 2.655843309185002e-05
110852 20211125 mainbackup  601.6654663085938    17.180953979492188
190752 20230815 mainbackup  603.0149536132812    0.6969366669654846
```
about the columns:
those are all gfa columns.
I believe that these columns are propagated from reading the gfa files (here code version `0.63.0`, the one from jura):
https://github.com/desihub/desispec/blob/9101ccc361d431625aca833336bb835a6292d881/bin/desi_tsnr_afterburner#L1115-L1154
the `EFFTIME_*` columns come from a computation based on those columns, hence they also end up NaN:
https://github.com/desihub/desispec/blob/0.63.0/py/desispec/efftime.py#L13
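This is just ordinary IEEE NaN propagation: any arithmetic involving a NaN GFA input yields NaN, so the derived `EFFTIME_*` values are NaN too. A toy illustration (not the actual `efftime.py` formula):

```python
import numpy as np

exptime = 300.0
fiberfac = np.nan  # bad GFA input

# any arithmetic with NaN yields NaN, so the derived quantity is NaN as well
efftime = exptime * fiberfac ** 2
print(np.isnan(efftime))  # True
```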
for completeness, reading the gfa_table with the same routine, and looking for those columns for those expids:
```
 EXPID   NIGHT      TRANSPARENCY         FWHM_ASEC        FIBER_FRACFLUX      FIBER_FRACFLUX_ELG   FIBER_FRACFLUX_BGS        FIBERFAC              FIBERFAC_ELG           FIBERFAC_BGS
------ -------- -------------------- ------------------ ------------------- ------------------- -------------------- --------------------- --------------------- ----------------------
 74307 20210202 0.059126717922222294 0.9789517068362219  0.6320627694913908 0.44970044302469636  0.20504407421980816                   nan                   nan                    nan
 79769 20210308                  nan 1.8761397170576326  0.2613052900585784 0.21108856454109742  0.10411062155477979                   nan                   nan                    nan
 83420 20210404                  nan                nan                 nan                 nan                  nan 0.0007895831162312318 0.0006804336788210475 0.00030002771194605096
110852 20211125   0.8780836223539111   3.20778365614237 0.11836103479224255 0.10716889904754712 0.057902263978654445                   nan                   nan                    nan
190752 20230815  0.04454787890451631 1.0542636639223057   0.572109104821019  0.4136182060095519  0.19035298877013784                   nan                   nan                    nan
```
and about the `FRAMES` extension of `exposures-jura.fits`: I don't find any NaN's in there.
I did the following check:

```python
import fitsio
import numpy as np
from astropy.table import Table

d = Table(fitsio.read("/global/cfs/cdirs/desi/spectro/redux/jura/exposures-jura.fits", "FRAMES"))
for key in d.colnames:
    # skip string columns, which np.isfinite() cannot handle
    if isinstance(d[key][0], str):
        continue
    sel = ~np.isfinite(d[key])
    if sel.sum() > 0:
        print("{}\t{}".format(key, d["EXPID"][sel].tolist()))
```
IIUC all NaN values come from the GFA pipeline when its outputs are merged into the exposures table. I don't think any of these are coming from the spectro pipeline itself. Options:
In practice these GFA quantities are mostly informative and used for operations debugging, since the GFAs feed the exposure time calculator. I.e. if we've got a quick and solid fix, OK, but I don't think this should delay Kibo. Also, if we confirm that all the NaNs come from GFA pipeline inputs being merged into the exposures table, that could be fixed and rerun while Kibo is running, since the exposures table isn't generated until after all of the tile-based processing is done.
Thank you for contributing to this discussion. This is getting interesting. Here are some preliminary findings:
**For patching these values in `jura`, the best value to use is zero (`0.0`). Why?**

- … `jura`, so we should just use that value.

**What are we actually patching?**

- … `daily` with values in `jura`, then it was discovered that some values in `jura` are bad.
- We are not changing `jura`, only applying a patch to `daily`.

**Why didn't we find this before?**
- … `fuji` and `iron`. However, in both `fuji` and `iron` the GFA quantities associated with all of the exposures for the corresponding tiles are set to zero (another reason zero is a good patch value!).
- … `iron` and `jura`. The GFA file itself is the same: `desi/survey/GFA/offline_matched_coadd_ccds_SV1-thru_20210928.fits`, which was frozen in `fuji`. There are definitely NaN values present in that file. Somehow, the interpretation of bad GFA values changed between `desispec/0.56.5` (`iron`) and `desispec/0.63.x` (`jura`).

**What should we do for `kibo`?**
- … `iron` and `jura` and decide (in progress). I vote for replacing NaN with zero explicitly--with logging and documentation--in `kibo`.
- Add a warning to `compute_tile_completeness_table()` when it sums over masked values, so we can at least know that is happening.

**What else?**
- `desi_tiles_completeness` effectively does the same thing as `desi_tsnr_afterburner --tile-completeness`, but does not share any code. These could diverge in the future. It is not obvious why `desi_tiles_completeness` is even needed. That should be a separate ticket, of course, and is not necessarily needed for `kibo`.
- `desi_tsnr_afterburner` should be standardized. The outputs and log files are literally all over the place, which is not good practice.

**Follow-up results:**
- `desi_tiles_completeness` and `desi_tsnr_afterburner --tile-completeness`: the former uses `Table.read()` while the latter uses `desispec.io.read_table()`. `Table.read()` masks NaN, but `read_table()` does not. If `desi_tsnr_afterburner --tile-completeness` had been used for `jura`, NaN would probably have shown up in the `tiles-jura.(fits|csv)` files.
- When `desi_tsnr_afterburner` was run for `iron`, the SV1 GFA file was never read in. That means effectively all SV1 exposures have no GFA data. Or equivalently, it's not the presence or absence of NaN that caused the `iron` GFA values to be zero; the values were never assigned in the first place. This might have also happened for `fuji`, but that's a slightly lower priority investigation at this point.

thanks for the investigation.
if useful, a minor clarification about the sv1 and sv3 gfa files: the naming is a bit misleading; I believe that the sv3 file contains all exposures taken since the start of sv3 (20210405), which include some sv1 exposures (approx. half of the sv1 exposures).
actually the `offline_matched_coadd_ccds_SV3-thru_20240409.fits` file probably contains all exposures since 20210405.
```python
>>> d1 = Table(fitsio.read("/global/cfs/cdirs/desi/survey/GFA/offline_matched_coadd_ccds_SV1-thru_20210928.fits", 2))
>>> d3 = Table(fitsio.read("/global/cfs/cdirs/desi/survey/GFA/offline_matched_coadd_ccds_SV3-thru_20240409.fits", 2))
>>> d1["NIGHT"].min(), d1["NIGHT"].max()
(20201214, 20210928)
>>> d3["NIGHT"].min(), d3["NIGHT"].max()
(20210405, 20240409)
>>> np.in1d(d1["EXPID"], d3["EXPID"]).mean()
0.48369036027263873
>>> np.all(~np.in1d(d1["EXPID"][d1["NIGHT"] < d3["NIGHT"].min()], d3["EXPID"]))
True
>>> np.all(np.in1d(d1["EXPID"][d1["NIGHT"] >= d3["NIGHT"].min()], d3["EXPID"]))
True
```
OK, I should have said "many early" SV1 exposures rather than "all" SV1 exposures, but the important thing here is that two GFA summary files must be read for completeness, and one of them will continue to contain the label "SV1".
We also now know the reason for the difference between `iron` and `jura`:
When `iron` was run, it used the directory that is now called `desi/survey/GFA.KPNO`, but at the time, it was just `desi/survey/GFA`. That directory did not (and does not) contain a copy of a SV1 GFA summary file, so all of the exposures that appear only in the SV1-labelled GFA summary files had their values set to zero. This had nothing to do with the presence of NaN in GFA summary files at all. The necessary file just wasn't there. This also happened for `fuji`.
`jura` used `desi/survey/GFA.NERSC` (which is the current `desi/survey/GFA` via a symlink), which does contain a SV1 GFA summary file, so we then "discovered" the NaNs for exposures that had previously just been set to zero.
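Given that root cause, the explicit NaN-to-zero replacement with logging proposed earlier in this thread could look something like the following sketch (the function name and the dict-of-arrays stand-in for the exposures table are hypothetical, not actual desispec code):

```python
import logging

import numpy as np

log = logging.getLogger("patch_gfa")

def patch_nan_to_zero(table, columns):
    """Replace NaN with 0.0 in the named columns, logging each patch."""
    for col in columns:
        bad = ~np.isfinite(table[col])
        if bad.any():
            log.warning("column %s: replacing %d NaN value(s) with 0.0", col, bad.sum())
            table[col][bad] = 0.0
    return table

# hypothetical dict-of-arrays standing in for the exposures table
exposures = {
    "TRANSPARENCY": np.array([0.88, np.nan]),
    "FIBERFAC": np.array([np.nan, 1.2]),
}
patch_nan_to_zero(exposures, ["TRANSPARENCY", "FIBERFAC"])
print(exposures["TRANSPARENCY"].tolist())  # [0.88, 0.0]
```

Logging every replacement keeps the patch auditable, which is the "with logging and documentation" part of the proposal.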
While working on #2251, we discovered that the top-level `exposures-jura` files, specifically the exposure tables in those files, contained some NaN values.

This list of exposures should be all of those that contain NaN. You can see that some of the columns are masked.
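The distinction between NaN and masked values matters here. A minimal sketch in plain numpy (not the actual table-reading code) of how the two behave differently under a sum:

```python
import numpy as np

x = np.array([1.0, np.nan, 3.0])
print(x.sum())  # nan: NaN propagates through a plain sum

# masking the invalid entries (as astropy's masked tables do) hides them instead
m = np.ma.masked_invalid(x)
print(m.sum())  # 4.0: masked entries are silently dropped
```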
This is the list of columns in `exposures-jura` that contain NaN values:

The origin of these invalid values should be investigated at high priority in preparation for `kibo`. As part of the investigation, we need to look into the `tsnr_afterburner` to see where these columns come from. We also need to check the `FRAMES` table for invalid values (part of `exposures-jura.fits`).

Attn: @sbailey, @araichoor, @akremin et al.