desihub / desispec

DESI spectral pipeline
BSD 3-Clause "New" or "Revised" License

Fibers yielding unusual redshift distributions in Y1 himalayas data #1946

Open rongpu opened 1 year ago

rongpu commented 1 year ago

Edit: the status of the fixes is listed here: https://github.com/desihub/desispec/issues/1946#issuecomment-1370388423

Previously I had identified fibers that yield redshift distributions that are not consistent with the overall redshift distribution (based on K-S tests). I posted the details here on the #spectroscopic-systematics slack channel.
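For readers who want to reproduce this kind of check, here is a minimal sketch of a per-fiber K-S test. It assumes a two-sample K-S test of each fiber's redshifts against the redshifts from all other fibers; the function name, the p-value threshold and the minimum number of observations are placeholders for illustration and are not necessarily what was used for the results in this issue.

```python
import numpy as np
from scipy.stats import ks_2samp

def flag_fibers(z, fiber, p_threshold=1e-4, min_obs=50):
    """Return {fiber: (p_value, n_tot)} for fibers whose redshift
    distribution differs from that of all other fibers.

    p_threshold and min_obs are illustrative placeholders.
    """
    z = np.asarray(z, dtype=float)
    fiber = np.asarray(fiber)
    flagged = {}
    for fib in np.unique(fiber):
        sel = fiber == fib
        n_tot = int(sel.sum())
        if n_tot < min_obs:
            continue  # too few observations for a meaningful test
        # Two-sample K-S test: this fiber vs. every other fiber
        result = ks_2samp(z[sel], z[~sel])
        if result.pvalue < p_threshold:
            flagged[int(fib)] = (float(result.pvalue), n_tot)
    return flagged
```

In practice the inputs would be the Z and FIBER columns of a redshift catalog after selecting one target class (e.g. LRG), since the reference distribution differs between classes.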

To help identify the causes of the bad redshifts, here I'm going through each problematic fiber, providing an example spectrum and the corresponding coadd file path.

So far I have only gone through the bad fibers identified from LRG and BGS observations. The LRG bad fibers are: 1008 1098 1219 1251 2675 2676 2678 2679 2680 3994 3995 4349 4720. A few more fibers are not flagged in LRG observations but are flagged in BGS: 466 2250 2251 2252 2253 3038.

Starting with the LRG bad fibers:

FIBER 1008 p-value = 2.2558e-139 n_tot = 564 (n_tot is the total number of LRG observations by that fiber) image /global/cfs/cdirs/desi/spectro/redux/himalayas/tiles/cumulative/9195/20211128/coadd-2-9195-thru20211128.fits image

FIBER 1098 p-value = 0.00014002 n_tot = 593 image /global/cfs/cdirs/desi/spectro/redux/himalayas/tiles/cumulative/1517/20211031/coadd-2-1517-thru20211031.fits image

FIBER 1219 p-value = 4.7286e-25 n_tot = 591 image /global/cfs/cdirs/desi/spectro/redux/himalayas/tiles/cumulative/10791/20220429/coadd-2-10791-thru20220429.fits image

FIBER 1251 p-value = 2.534e-44 n_tot = 652 image /global/cfs/cdirs/desi/spectro/redux/himalayas/tiles/cumulative/10376/20211212/coadd-2-10376-thru20211212.fits image

Fibers 2675-2680 appear to be caused by the same problem. Here I'm only giving an example for FIBER 2675.

FIBER 2675 p-value = 3.2445e-12 n_tot = 139 image /global/cfs/cdirs/desi/spectro/redux/himalayas/tiles/cumulative/9360/20211006/coadd-5-9360-thru20211006.fits image

FIBER 3994 p-value = 4.331e-42 n_tot = 639 image /global/cfs/cdirs/desi/spectro/redux/himalayas/tiles/cumulative/4703/20220125/coadd-7-4703-thru20220125.fits image

FIBER 3995 p-value = 0.0005322 n_tot = 595 image /global/cfs/cdirs/desi/spectro/redux/himalayas/tiles/cumulative/3028/20220207/coadd-7-3028-thru20220207.fits image

FIBER 4349 p-value = 2.9092e-30 n_tot = 611 image /global/cfs/cdirs/desi/spectro/redux/himalayas/tiles/cumulative/2338/20211029/coadd-8-2338-thru20211029.fits image

FIBER 4720 p-value = 8.8502e-05 n_tot = 442 image /global/cfs/cdirs/desi/spectro/redux/himalayas/tiles/cumulative/1650/20220219/coadd-9-1650-thru20220219.fits image

Problematic fibers that are only identified from BGS redshift distributions:

FIBER 466 p-value = 1.3478e-09 n_tot = 969 image /global/cfs/cdirs/desi/spectro/redux/himalayas/tiles/cumulative/26253/20211020/coadd-0-26253-thru20211020.fits image

Fibers 2250-2253 appear to have the same issue.

FIBER 2250 p-value = 0.00013892 n_tot = 503 image /global/cfs/cdirs/desi/spectro/redux/himalayas/tiles/cumulative/21308/20211022/coadd-4-21308-thru20211022.fits image

FIBER 2251 p-value = 0.00054949 n_tot = 520 /global/cfs/cdirs/desi/spectro/redux/himalayas/tiles/cumulative/20040/20210628/coadd-4-20040-thru20210628.fits image

FIBER 2252 p-value = 2.3389e-06 n_tot = 505 image /global/cfs/cdirs/desi/spectro/redux/himalayas/tiles/cumulative/20915/20210523/coadd-4-20915-thru20210523.fits image

FIBER 2253 p-value = 0.00016885 n_tot = 521 /global/cfs/cdirs/desi/spectro/redux/himalayas/tiles/cumulative/24451/20211022/coadd-4-24451-thru20211022.fits image

FIBER 3038 p-value = 6.2344e-06 n_tot = 1524 image /global/cfs/cdirs/desi/spectro/redux/himalayas/tiles/cumulative/26213/20220211/coadd-6-26213-thru20220211.fits image
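The coadd paths listed above follow a regular pattern, and in every example the petal number in the file name equals the fiber number divided by 500 (integer division). A small sketch that rebuilds the path from fiber, tile and night, assuming that pattern holds in general; the function name is made up for illustration.

```python
import posixpath

# Root directory copied from the paths listed in this issue
REDUX_ROOT = "/global/cfs/cdirs/desi/spectro/redux"

def coadd_path(fiber, tileid, lastnight, specprod="himalayas"):
    """Build the cumulative coadd path for the petal that holds `fiber`."""
    # Assumption: 500 fibers per petal, consistent with every path above
    petal = fiber // 500
    return posixpath.join(
        REDUX_ROOT, specprod, "tiles", "cumulative",
        str(tileid), str(lastnight),
        f"coadd-{petal}-{tileid}-thru{lastnight}.fits",
    )

print(coadd_path(1008, 9195, 20211128))
```

For fiber 1008 this reproduces the path given in the FIBER 1008 entry.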

sbailey commented 1 year ago

These are very useful, @rongpu, thanks! Please try making them as redshift vs. time to see if the pileup of problematic redshifts occurs on individual non-contiguous nights, or time periods of mostly contiguous nights, or is spread out over the entire survey.

rongpu commented 1 year ago

Thanks @sbailey. The occurrence of bad redshifts vs time would be one of the things that we will examine when we look into each individual case. I'm hoping that for many of those cases we can identify the problems and find fixes for them in iron, rather than having to throw out these fibers (either altogether or in some time window).

ashleyjross commented 1 year ago

I looked at a few cases vs. time. While I see some time dependence, they all look like outliers for most of Y1. Plots:

Screen Shot 2022-12-22 at 12 22 37 PM Screen Shot 2022-12-22 at 12 22 23 PM Screen Shot 2022-12-22 at 12 21 52 PM Screen Shot 2022-12-22 at 12 20 10 PM Screen Shot 2022-12-22 at 12 19 49 PM

notebook: /global/cfs/cdirs/desi/survey/catalogs/Y1/LSS/himalayas/outliers_nights.ipynb

julienguy commented 1 year ago

Fiber 1008: studied with Rongpu. We identified an incorrect pixel flat field correction. It is compensated in the fiber flat, but because of trace coordinate shifts that compensation is not valid for all of the science observations, which can result in a peak or a dip in the spectra. Solution: flag the offending pixels with desispec.maskbits.ccdmask.LOWFLAT in the file pixmask-sm5-z2-20210901.fits.

Screenshot from 2022-12-22 09-45-06

With corrected pixmask (bottom left image):

Screenshot from 2022-12-22 09-57-58

Correction committed in calib SVN.

julienguy commented 1 year ago

Fibers 2675-2680: this is a known issue in z5. You can decide in the LSS analysis to discard those fibers. There is no way to fix the past data. We have, however, replaced this CCD. The report is here: https://desi.lbl.gov/trac/raw-attachment/wiki/InstOpsPages/InstOpsMeetings/2022-11-14/DESI-CCD-replacement-results-20221114.pdf

sbailey commented 1 year ago

@julienguy and @rongpu beat me to it on the diagnosis of fiber 1008, but FYI this is the kind of redshift vs. time plot that I was thinking:

from datetime import datetime
import numpy as np
import fitsio
import matplotlib.pyplot as plt

# Read only the needed columns from the cumulative tile redshift catalog
zcatalog = fitsio.read('ztile-main-dark-cumulative.fits', 'ZCATALOG',
        columns=('Z', 'ZWARN', 'TILEID', 'FIBER', 'LASTNIGHT', 'DESI_TARGET'))

# Keep all observations taken with a single fiber
fiber = 1251
zcat = zcatalog[zcatalog['FIBER'] == fiber]
# LASTNIGHT is an integer YYYYMMDD; convert it to datetime for the time axis
dates = np.asarray([datetime.strptime(str(night), '%Y%m%d') for night in zcat['LASTNIGHT']])
# Bit 0 of DESI_TARGET (value 1) selects LRG targets
isLRG = (zcat['DESI_TARGET'] & 1) != 0

plt.plot(dates[isLRG], zcat['Z'][isLRG], '.', alpha=0.2)
plt.xlabel('date'); plt.ylabel('Redshift')
plt.title(f'Fiber {fiber} LRGs')
plt.show()

image

In all of the cases I've spot checked, the problem appears after the summer 2021 shutdown and isn't isolated to a few nights or a small time range. Himalayas preprocessing was run before we had @Waelthus's new monthly darks, so if the problems are in the dark model itself they may already be fixed. Testing that requires a non-trivial amount of processing, though, so it's good to chase down individual cases and fix things like the pixmask (fiber 1008 issue) first.
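To put a number on the time dependence rather than judge it by eye, one option is to tabulate, per month, the fraction of a fiber's redshifts that fall inside a suspected pileup window. The sketch below is only an illustration: the function name and the window bounds are hypothetical, and it relies on LASTNIGHT being an integer of the form YYYYMMDD, as in the script above.

```python
import numpy as np

def pileup_fraction_by_month(z, lastnight, zmin, zmax):
    """Return {YYYYMM: (fraction_in_window, n_obs)} for one fiber.

    zmin and zmax bound the suspected pileup in redshift; they are
    chosen by the user and are not defined anywhere in the pipeline.
    """
    z = np.asarray(z, dtype=float)
    # Integer division turns YYYYMMDD into YYYYMM
    month = np.asarray(lastnight).astype(int) // 100
    in_window = (z >= zmin) & (z <= zmax)
    out = {}
    for m in np.unique(month):
        sel = month == m
        out[int(m)] = (float(in_window[sel].mean()), int(sel.sum()))
    return out
```

A fraction that jumps at one date and stays high afterwards points to a persistent detector or calibration problem, while isolated spikes point to individual bad nights.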

julienguy commented 1 year ago

Fiber 3994-3995

Bad (parallel) charge transfer in CCD column x=3922 (starting at 0) in amplifier D of CCD r7. Fixed by flagging pixels as bad in pixmask-sm8-r7-20210901.fits.gz. Committed to SVN. This affects both fibers. We need to rerun to check whether we can use the fibers or not (now that a column is masked).

See the trails in the 7th fiber from the left in the following plot

Screenshot from 2022-12-22 10-29-42

julienguy commented 1 year ago

Fiber 4349

Dead pixel at x=2752 y=3472 and for all y>3472 in amplifier D of b8. Fixed by flagging the bad pixels in pixmask-sm2-b8-20210901.fits.gz. Committed to SVN. Easier to see in the flat field exposures preproc:

Screenshot from 2022-12-22 10-55-27

rongpu commented 1 year ago

Thanks @ashleyjross, @sbailey. Here I'm plotting redshift vs date for all the flagged fibers using Stephen's script.

None of the issues are isolated to a few nights. For some fibers (including 1008, 1251, 1219, 3038, 4349), there are fewer pileups before the 2021 summer shutdown, although that does not necessarily mean that the pre-shutdown data is glitch-free or that its redshifts are correct.

FIBER 1008 image

FIBER 1098 image

FIBER 1219 image

FIBER 1251 image

FIBER 2675 image

FIBER 2676 image

FIBER 2678 image

FIBER 2679 image

FIBER 2680 image

FIBER 3994 image

FIBER 3995 image

FIBER 4349 image

FIBER 4720 image

Below are fibers flagged only in BGS: FIBER 466 image

FIBER 2250 image

FIBER 2251 image

FIBER 2252 image

FIBER 2253 image

FIBER 3038 image

rongpu commented 1 year ago

To keep track of progress, here is the status of the investigation into the problematic fibers. I will keep editing this post with updates.

Fixed or not fixable:
1008: fixed (incorrect pixel flat field correction)
1098: related to the sky subtraction; fixed by @schlafly in #1960 (?)
1219: parallel CTE; fixed (see #1956)
1251: parallel CTE; fixed (see #1956)
2250-2253: possible explanation: CTE PCA correction on CTE-free exposures; fixed by @schlafly in #1960
2675-2680: known issue; cannot be fixed
3994-3995: fixed (bad parallel charge transfer)
4349: fixed (dead pixel)
4720: fixed (bad parallel charge transfer)

No fix yet:
466:
3038: possibly stray light in the single fiber?! not fixed yet

rongpu commented 1 year ago

FIBER 4720: looking into this fiber with Julien, we found bad (parallel) charge transfer in CCD column x=1812 (0-based indexing) in amplifier C of CCD r9. I fixed this by masking this column from y=2064 to 3634 (inclusive) in pixmask-sm3-r9-20210901.fits.gz. The change has been committed to SVN.
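For reference, masking part of a column in a pixel mask image comes down to OR-ing a bit into a slice of the array. The sketch below assumes the mask is a 2D integer array indexed as [y, x], which is how FITS images are normally read into numpy. The bit value and the array shape are placeholders; the real bit definitions live in desispec.maskbits.ccdmask and the real shape comes from the pixmask file.

```python
import numpy as np

BAD_PIXEL_BIT = 1  # placeholder value, not an actual ccdmask bit

def mask_column(mask, x, ymin, ymax, bit=BAD_PIXEL_BIT):
    """OR `bit` into column x for ymin <= y <= ymax (both ends inclusive).

    The array is modified in place and also returned.
    """
    # Python slices exclude the end point, hence ymax + 1
    mask[ymin:ymax + 1, x] |= bit
    return mask

# Placeholder shape standing in for the pixmask image
mask = np.zeros((4200, 4200), dtype=np.int32)
mask_column(mask, x=1812, ymin=2064, ymax=3634)
print(np.count_nonzero(mask))  # 1571 pixels flagged
```

Using |= rather than plain assignment preserves any bits that were already set for those pixels.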

Left: science exposure; right: arc exposure.

Screen Shot 2023-01-03 at 4 54 05 PM

julienguy commented 1 year ago

Fiber 466 I looked at the blue channel. I do not see anything wrong at the CCD level. The PSF is OK. It is a fiber with a rather low transmission (20-30% lower than average). I will add to this message if I find something.

rongpu commented 1 year ago

Fibers 2250-2253 These fibers, as well as several fibers immediately after, show significant sky line residuals in himalayas, but not in daily (see plot below). Julien thinks it may be caused by the PCA correction for the CTE issue. These fibers started to have CTE issues sometime after that exposure was taken, and the PCA correction was applied to all data including the CTE-free data in himalayas.

Edit: forgot to mention, this is r4.

Left: sframe in himalayas; right: sframe in daily; EXPID= 95297

Screen Shot 2023-01-04 at 1 20 49 PM

ashleyjross commented 1 year ago

@schlafly is this the same issue as the r4 one we've been discussing on Slack for the data since r4 got fixed?

rongpu commented 1 year ago

FIBER 1219 and FIBER 1251 CCD r2 has three columns with bad parallel charge transfer.

I have not updated the pixmask yet. We might want to do a systematic search for these parallel CTE issues in all CCDs; they are quite easy to spot in arc exposures.

Arc exposure EXPID=113356

Screen Shot 2023-01-04 at 1 42 45 PM 1

julienguy commented 1 year ago

About Fibers 2250-2253, it seems that the dominant effect is on the 'tpcorr' correction. Example (where I remove the 'sky-line-throughput-correction' that is now in place to show more clearly the effect):

desi_compute_sky -i /global/cfs/cdirs/desi/spectro/redux/himalayas/exposures/20210619/00095297/frame-r4-00095297.fits.gz --fiberflat /global/cfs/cdirs/desi/spectro/redux/himalayas/exposures/20210619/00095297/fiberflatexp-r4-00095297.fits.gz  --adjust-wavelength --adjust-lsf --pca-corr /global/cfs/cdirs/desi/spectro/desi_spectro_calib/trunk/spec/sm1/skycorr-pca-sm1-r4.fits --fit-offsets --skygradpca /global/cfs/cdirs/desi/spectro/desi_spectro_calib/trunk/spec/sm1/skygradpca-sm1-r4.fits --tpcorrparam /global/cfs/cdirs/desi/spectro/desi_spectro_calib/trunk/spec/sm1/tpcorrparam-sm1-r4.fits -o sky-r4-00095297-std.fits.gz

desi_process_exposure --infile /global/cfs/cdirs/desi/spectro/redux/himalayas/exposures/20210619/00095297/frame-r4-00095297.fits.gz --fiberflat /global/cfs/cdirs/desi/spectro/redux/himalayas/exposures/20210619/00095297/fiberflatexp-r4-00095297.fits.gz  --cosmics-nsig 6 --sky sky-r4-00095297-std.fits.gz --no-sky-line-throughput-correction --no-xtalk --outfile sframe-r4-00095297-std.fits.gz

desi_compute_sky -i /global/cfs/cdirs/desi/spectro/redux/himalayas/exposures/20210619/00095297/frame-r4-00095297.fits.gz --fiberflat /global/cfs/cdirs/desi/spectro/redux/himalayas/exposures/20210619/00095297/fiberflatexp-r4-00095297.fits.gz  --adjust-wavelength --adjust-lsf --pca-corr /global/cfs/cdirs/desi/spectro/desi_spectro_calib/trunk/spec/sm1/skycorr-pca-sm1-r4.fits --fit-offsets  --skygradpca /global/cfs/cdirs/desi/spectro/desi_spectro_calib/trunk/spec/sm1/skygradpca-sm1-r4.fits  -o sky-r4-00095297-nocorr.fits.gz

desi_process_exposure --infile /global/cfs/cdirs/desi/spectro/redux/himalayas/exposures/20210619/00095297/frame-r4-00095297.fits.gz --fiberflat /global/cfs/cdirs/desi/spectro/redux/himalayas/exposures/20210619/00095297/fiberflatexp-r4-00095297.fits.gz  --cosmics-nsig 6 --sky sky-r4-00095297-nocorr.fits.gz  --no-sky-line-throughput-correction --no-xtalk --outfile sframe-r4-00095297-nocorr.fits.gz

plot_frame -i sframe-r4-00095297-{std,nocorr}.fits.gz --fibers 2253 --legend

Screenshot from 2023-01-04 13-56-10

julienguy commented 1 year ago

Fiber 1098

Again a tpcorr issue.

plot_frame -i ~/redux/himalayas/exposures/20211031/00106864/sframe-{b,r}2-00106864.fits.gz --fibers 97,98,99 --rebin 16 --legend

Screenshot from 2023-01-04 14-29-29

Recomputing the sky model without tpcorr

desi_compute_sky -i /global/cfs/cdirs/desi/spectro/redux/himalayas/exposures/20211031/00106864/frame-b2-00106864.fits.gz --fiberflat /global/cfs/cdirs/desi/spectro/redux/himalayas/exposures/20211031/00106864/fiberflatexp-b2-00106864.fits.gz  --adjust-wavelength --adjust-lsf --pca-corr /global/cfs/cdirs/desi/spectro/desi_spectro_calib/trunk/spec/sm5/skycorr-pca-sm5-b2.fits --fit-offsets  --skygradpca /global/cfs/cdirs/desi/spectro/desi_spectro_calib/trunk/spec/sm5/skygradpca-sm5-b2.fits  -o sky-b2-00106864-nocorr.fits.gz

desi_process_exposure --infile /global/cfs/cdirs/desi/spectro/redux/himalayas/exposures/20211031/00106864/frame-b2-00106864.fits.gz --fiberflat /global/cfs/cdirs/desi/spectro/redux/himalayas/exposures/20211031/00106864/fiberflatexp-b2-00106864.fits.gz  --cosmics-nsig 6 --sky sky-b2-00106864-nocorr.fits.gz  --no-sky-line-throughput-correction --no-xtalk --outfile sframe-b2-00106864-nocorr.fits.gz

plot_frame -i sframe-b2-00106864-nocorr.fits.gz --fibers 97,98,99 --rebin 16 --legend

Screenshot from 2023-01-04 14-31-45

Rerunning with the tpcorr option gives some systematic offset, but smaller than the results from himalayas. So something else has changed since then.

desi_compute_sky -i /global/cfs/cdirs/desi/spectro/redux/himalayas/exposures/20211031/00106864/frame-b2-00106864.fits.gz --fiberflat /global/cfs/cdirs/desi/spectro/redux/himalayas/exposures/20211031/00106864/fiberflatexp-b2-00106864.fits.gz  --adjust-wavelength --adjust-lsf --pca-corr /global/cfs/cdirs/desi/spectro/desi_spectro_calib/trunk/spec/sm5/skycorr-pca-sm5-b2.fits --fit-offsets  --skygradpca /global/cfs/cdirs/desi/spectro/desi_spectro_calib/trunk/spec/sm5/skygradpca-sm5-b2.fits --tpcorrparam /global/cfs/cdirs/desi/spectro/desi_spectro_calib/trunk/spec/sm5/tpcorrparam-sm5-b2.fits -o sky-b2-00106864-std.fits.gz

desi_process_exposure --infile /global/cfs/cdirs/desi/spectro/redux/himalayas/exposures/20211031/00106864/frame-b2-00106864.fits.gz --fiberflat /global/cfs/cdirs/desi/spectro/redux/himalayas/exposures/20211031/00106864/fiberflatexp-b2-00106864.fits.gz  --cosmics-nsig 6 --sky sky-b2-00106864-std.fits.gz  --no-sky-line-throughput-correction --no-xtalk --outfile sframe-b2-00106864-std.fits.gz

plot_frame -i sframe-b2-00106864-std.fits.gz --fibers 97,98,99 --rebin 16 --legend

Screenshot from 2023-01-04 14-37-11

julienguy commented 1 year ago

Fiber 3038

We see a dip in the sky subtracted frames for bright time exposures. We strongly suspect that it is due to incorrect fiber flat fielding that causes an over-subtraction of the sky at wavelength around 5880A. It would be caused by light contamination during the dome flat exposures specifically for this unique fiber which seems bizarre but this is the only explanation we have.

From left to right, preproc of one arc lamp exposure (/global/cfs/cdirs/desi/spectro/redux/himalayas/preproc/20220211/00122115/preproc-r6-00122115.fits.gz) (left), extracted spectra from that exposure with fiber 3038 in red (right), extracted spectra from a continuum lamp exposure (center) during the same day.

Screenshot from 2023-01-04 15-48-54

The excess flux is consistent with the ratio of exposure times (5sec and 120sec). We don't see anything in bias and dark. This excess causes an overestimation of the fiber transmission by 20% around 5880A and we do see a sky over-correction by that amount. See frame and sframe below:
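The two numbers quoted above can be checked with simple arithmetic. The sketch below rests on a simplified model that is an assumption here, not a description of the pipeline internals: the contaminating light arrives at a constant rate, so its signal scales with exposure time, and a fiber flat that is too high by a fraction eps scales the flat-fielded sky in that fiber by 1/(1+eps) before the (correct) sky model is subtracted.

```python
def exposure_ratio(t_long, t_short):
    """Expected ratio of contaminating flux between two exposures,
    assuming the contamination arrives at a constant rate."""
    return t_long / t_short

def sky_residual_fraction(flat_excess):
    """Residual after sky subtraction, as a fraction of the true sky,
    when the fiber flat is overestimated by `flat_excess` (e.g. 0.20)."""
    return 1.0 / (1.0 + flat_excess) - 1.0

print(exposure_ratio(120, 5))                 # 24.0
print(round(sky_residual_fraction(0.20), 3))  # -0.167
```

Under this model a 20% overestimate of the transmission gives a negative residual of about 17% of the sky level, close to -eps for small eps, which is the sign and rough size of the dip described above.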

Screenshot from 2023-01-04 16-00-16

dkirkby commented 1 year ago

Fiber 3038 is connected to robot M01480 and located in hole 434 of petal_loc 6 (petalid 11). Its neighbors are M01382,1106,1926,8347,6300,6602 (2 disabled robots in bold italics).

julienguy commented 1 year ago

We see this spurious signal also in the blue camera because the wavelength of 5880A is in the dichroic transition range so it is indeed light that goes through the fiber. Screenshot from 2023-01-05 09-21-32

julienguy commented 1 year ago

Fiber 3038 is near the edge of the focal plane. Its 6 neighbors are fibers [3011, 3027, 3030, 3379, 3383, 3394]. Figure_1

julienguy commented 1 year ago

The signal is seen only in fiber 3038. Screenshot from 2023-01-05 10-07-33

julienguy commented 1 year ago

Now that we know what we are looking for, at other dates, it's visible in other isolated fibers. And one can see this pollution in nightwatch. See for instance https://nightwatch.desi.lbl.gov/20220530/00137320/preproc-r8-00137320-4x.html .

rongpu commented 1 year ago

I have rerun this N(z) test on the iron zcatalogs. The results for the previously flagged fibers (still flagged fibers are in bold):

466: still flagged in BGS N(z) (expected as no fix was implemented)
1008: still flagged (unexpected since a fix was implemented)
1098: fixed and no longer flagged
1219: fixed and no longer flagged
1251: fixed and no longer flagged
2250-2253: fixed and no longer flagged
2675-2680: fibers 2675, 2676 and 2678 are still flagged in LRG N(z) (expected as no fix was implemented)
3038: still flagged (expected as no fix was implemented)
3994: still flagged (unexpected since a fix was implemented)
3995: fixed and no longer flagged
4349: fixed and no longer flagged
4720: no longer flagged in LRG N(z) but still flagged in BGS N(z) (although it's a marginal case with a p-value of 3e-5)

Two fibers that appeared normal in himalayas are flagged in iron:

4321: newly flagged by LRG N(z) (many redshifts at z~0.5, and they only appeared between 2021-10 and 2022-01)
4891: newly flagged by BGS N(z)

Based on the N(z) test, we may need to mask the following fibers from the iron LSS catalog:

Below are redshift vs date plots for some of the still flagged fibers in iron:

Fiber 466 image

Fiber 1008 image

Fiber 2676 image

Fiber 3038 image

Fiber 3994 image

Fiber 4321 image

Fiber 4891 image

I have also updated the redshift vs fiber plots for iron, and many improvements (not just those implemented here) are visible compared to himalayas.