LSSTDESC / DC2-production

Configuration, production, validation specifications and tools for the DC2 Data Set.
BSD 3-Clause "New" or "Revised" License

Low density on tract 4852/patch 1,5 on Run2.2i DR6 WFD #408

Closed plaszczy closed 3 years ago

plaszczy commented 3 years ago

[image: tract_chelou]

johannct commented 3 years ago

I did not find it

jchiang87 commented 3 years ago

Looking at the tracts_mapping.sqlite3 file, there are 33 visits with > 4 CCDs contributing in i-band. Here they are:

919561 5
945676 5
420877 5
211473 5
919601 5
665665 6
1157699 5
437323 5
1185891 5
665703 6
678509 5
685691 5
994964 5
994965 5
906938 6
995008 5
211148 5
211179 5
458497 5
204555 5
965400 5
269088 5
1000776 5
1000781 6
269134 5
1209684 5
1212791 5
434049 5
1165737 5
518059 5
1209263 5
420826 5
200191 5

with the number of CCDs contributing in the second column.
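For anyone who wants to repeat this kind of count, here is a minimal sketch of the query. It assumes the mapping lives in a single table, here given the hypothetical name `overlaps`, with the columns shown in the tracts_mapping listing later in this thread (tract, patch, visit, detector, filter). The actual schema of tracts_mapping.sqlite3 may differ. The demo builds a tiny in-memory database using the six rows quoted for visit 665703, plus one made-up visit for contrast.

```python
import sqlite3

def visits_with_many_ccds(conn, tract, patch, band, min_ccds=5):
    """Return (visit, n_ccds) for visits with at least min_ccds detectors."""
    # "overlaps" is a hypothetical table name, not confirmed by the thread.
    query = """
        SELECT visit, COUNT(DISTINCT detector)
        FROM overlaps
        WHERE tract = ? AND patch = ? AND "filter" = ?
        GROUP BY visit
        HAVING COUNT(DISTINCT detector) >= ?
        ORDER BY visit
    """
    return conn.execute(query, (tract, patch, band, min_ccds)).fetchall()

# Tiny in-memory stand-in for the real file.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE overlaps (tract INTEGER, patch TEXT, visit INTEGER, '
    'detector INTEGER, "filter" TEXT)'
)
rows = [(4852, "(1, 5)", 665703, det, "i")
        for det in (133, 134, 141, 163, 164, 171)]
# Made-up visit with only two detectors, for contrast.
rows += [(4852, "(1, 5)", 1, det, "i") for det in (10, 11)]
conn.executemany("INSERT INTO overlaps VALUES (?, ?, ?, ?, ?)", rows)

result = visits_with_many_ccds(conn, 4852, "(1, 5)", "i")
print(result)
```

Only the visit with six contributing detectors passes the `min_ccds=5` cut in this toy data.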

erykoff commented 3 years ago

The offender is 665703 according to the supreme input map. [image]

erykoff commented 3 years ago

Where do the warps live at nersc?

jchiang87 commented 3 years ago

They would be here:

/global/cfs/cdirs/lsst/production/DC2_ImSim/Run2.2i/desc_dm_drp/v19.0.0-v1/rerun/run2.2i-coadd-wfd-dr6-v1-grizy/deepCoadd/i/4852/1,5

but that folder is empty. The originals, if they are still around, would be at CC-IN2P3.

erykoff commented 3 years ago

I thought that @heather999 put them somewhere else? Or maybe I'm mis-remembering. But if we don't have the warps at nersc, I can't do anything else at the moment, and hopefully @johannct will notice something strange with 665703.

johannct commented 3 years ago

[image: ds9]

erykoff commented 3 years ago

Seems kind of empty, but maybe the scale ... does it look strange compared to another visit warp?

johannct commented 3 years ago

not really.....

jchiang87 commented 3 years ago

fwiw, here's the tracts_mapping info on that warp:

          id  tract   patch   visit  detector filter layer
194  6720042   4852  (1, 5)  665703       133      i      
195  6720045   4852  (1, 5)  665703       134      i      
196  6720081   4852  (1, 5)  665703       141      i      
197  6720201   4852  (1, 5)  665703       163      i      
198  6720205   4852  (1, 5)  665703       164      i      
199  6720245   4852  (1, 5)  665703       171      i      

Having looked at a few of these calexps, things seem fine, but I only looked at the image data.

/global/cfs/cdirs/lsst/production/DC2_ImSim/Run2.2i/desc_dm_drp/v19.0.0-v1/rerun/run2.2i-calexp-v1/calexp/00665703-i

jchiang87 commented 3 years ago

@johannct Which SRS log file corresponds to the creation of that warp? There may be clues there.

johannct commented 3 years ago

/sps/lsst/users/descprod/Pipeline2/Logs/DC2DM_DRP/2.9/task_coadd/task_coadd_tract_patch/task_coaddDriver/run_coaddDriver/024/043/007/001/logFile.txt

wmwv commented 3 years ago

> But if we don't have the warps at nersc,

We explicitly decided that we didn't need to keep these intermediate products at NERSC.

erykoff commented 3 years ago

Right! We have the calexps somewhere but not the warps (or the wasps for that matter). @johannct can you put the warps for 665703 and, say, 174550 some place on nersc scratch so that I can poke at them?

johannct commented 3 years ago

Both are in /global/cscratch1/sd/erykoff/johannct, for anyone other than Eli who wants to look.

erykoff commented 3 years ago

Will need the mode changed to 666 and we're good to go. Thanks!

wmwv commented 3 years ago

> The offender is 665703 according to the supreme input map.

Was this statement based just on a visual recognition of the pattern?

erykoff commented 3 years ago

I looked at the pattern of all of the input visits and that was the only one that matched.

wmwv commented 3 years ago

Hmmm... I ask because, as you're likely similarly looking at right now, there doesn't seem to be anything obviously wrong with that warp.

erykoff commented 3 years ago

The weight map pattern looks very different between the two. I don't know which would be "correct"; I've never looked at a warp weight map, nor do I know how this could cause a problem...

wmwv commented 3 years ago

> The weight map pattern looks very different between the two.

Do you mean at the pixel level, or the moire pattern when zoomed out and resampled?

erykoff commented 3 years ago

Oh, I have an idea. There was a bug in the coadd code (fixed, I'm pretty sure, after this processing) where the PSF-matched warps would screw up the scaling if the coadd had a bad PSF. And how did it have a bad PSF? It was choosing the center of the warp or something like that. And there's no PSF model at the center of the warp for 665703.

wmwv commented 3 years ago

Hmm... interesting.

The psfMatchedWarp images look to have the same scaling here.

erykoff commented 3 years ago

[image]

jchiang87 commented 3 years ago

@erykoff Very interesting! Though I'd be surprised if this was the only warp in all of DC2 where this occurred.

wmwv commented 3 years ago

But you're saying that there's some step that asks about the PSF model of the warp and that selects the center to be typical. So the bug is after the psfMatchedWarp generation and part of the combining into the coadd?

wmwv commented 3 years ago

Okay, completely spit-balling here. But maybe the bug is as @erykoff identified above, combined with the mask setting having some bug where the fractional threshold to propagate a mask pixel to the coadd mask value behaves differently if there are 5 images or more.

erykoff commented 3 years ago

Ah, I was wrong, it's the other way around: https://lsstc.slack.com/archives/C2JPXB4HG/p1590785611140900 . The problem is that when making the coadd PSF for detection, things can go into bad regions. This is something different, though I fear it might be related. But then again, you'd think there would be other warps where this happened! So I think it's a coincidence.

wmwv commented 3 years ago

Because I agree with @jchiang87 's point that the number of times one gets the PSF in the center to not be a valid PSF should go as something like the fractional area of the chip gaps.
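As a rough illustration of that scaling, here is a minimal sketch under a deliberately simplified assumption: square CCDs of width `ccd_width` laid out on a regular grid, separated by gaps of width `gap_width`. Both numbers in the example are made up for illustration and are not the real focal-plane geometry. Under that assumption, the chance that a randomly placed point (such as a warp center) lands in a gap is one minus the filled fraction.

```python
def gap_fraction(ccd_width, gap_width):
    """Fraction of a regular CCD grid's area that falls in the gaps.

    Assumes square CCDs on a square grid; units cancel, so any
    consistent unit works.
    """
    pitch = ccd_width + gap_width          # center-to-center spacing
    return 1.0 - (ccd_width / pitch) ** 2  # 1 - filled fraction

# Made-up, dimensionless example values (not the real camera geometry).
frac = gap_fraction(ccd_width=4.0, gap_width=1.0)
print(f"gap fraction: {frac:.2f}")
```

Multiplying such a fraction by the number of warps would give a crude expectation for how many warps have no valid PSF at their center, which is the point being made: it should not be a one-off.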

erykoff commented 3 years ago

The 5 images or more is in terms of the stacked depth in the coadd and there are >> 5 images everywhere here.

wmwv commented 3 years ago

Oh, sorry, I'm an idiot. I totally misunderstood @jchiang87's table above. He was counting the number of CCDs contributing to the warp for a given visit. That makes sense.

wmwv commented 3 years ago

@erykoff What does the PSF model look like for those warps?

I have great confidence in your ability to assess whether or not a given PSF model is good. But if it's otherwise useful, I grabbed a few more warps and put them on NERSC in /global/cscratch1/sd/wmwv/DC2_Run2.2i/debug_bad_warp if you want to compare.

(These were just some that I had grabbed earlier and was looking through.)

erykoff commented 3 years ago

I have no confidence in my ability to make images of stack psf models...but I can say that the size of the psf for the bad image is smaller? But now I'm totally fishing and nothing is obviously fishy.

jchiang87 commented 3 years ago

Here are images of the PSFs from the calexps for all six sensor-visits contributing to that warp: [image: 665703_4852_1,5_psfs]. These are evaluated at the center of each CCD. The output of psf.computeShape() for each is:

133 (ixx=1.99657594373448, iyy=2.050492953217776, ixy=-0.011098628584589033)
134 (ixx=2.019505530391281, iyy=2.101699763461495, ixy=-0.006890655147915128)
141 (ixx=2.029050756326062, iyy=2.083832430512826, ixy=-0.0032894101846721816)
163 (ixx=1.9745718082384986, iyy=2.053081621210872, ixy=-0.005349000670642242)
164 (ixx=2.0043181345859926, iyy=2.057558593012063, ixy=-0.002385396921082846)
171 (ixx=2.0112337378367817, iyy=2.1142107505264067, ixy=0.0011369217832696897)

The first column is the detector number. The values all look similar to other CCDs in this visit.
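To put a single number on "similar", one option is to reduce each set of second moments to a size. The sketch below uses the determinant radius, (ixx*iyy - ixy^2)^(1/4), which is one common convention and not necessarily what anyone in this thread used. The moments are copied from the listing above and are presumably in pixel units.

```python
# Second moments (ixx, iyy, ixy) per detector, copied from the listing above.
moments = {
    133: (1.99657594373448, 2.050492953217776, -0.011098628584589033),
    134: (2.019505530391281, 2.101699763461495, -0.006890655147915128),
    141: (2.029050756326062, 2.083832430512826, -0.0032894101846721816),
    163: (1.9745718082384986, 2.053081621210872, -0.005349000670642242),
    164: (2.0043181345859926, 2.057558593012063, -0.002385396921082846),
    171: (2.0112337378367817, 2.1142107505264067, 0.0011369217832696897),
}

def determinant_radius(ixx, iyy, ixy):
    """Size from second moments: fourth root of the moment-matrix determinant."""
    return (ixx * iyy - ixy ** 2) ** 0.25

radii = {det: determinant_radius(*m) for det, m in moments.items()}
for det in sorted(radii):
    print(f"{det}: {radii[det]:.4f}")

# Spread between the largest and smallest size across the six detectors.
spread = max(radii.values()) - min(radii.values())
print(f"spread: {spread:.4f}")
```

A small spread relative to the sizes themselves is consistent with the statement that the six PSFs look alike.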

johannct commented 3 years ago

coaddDriver finished:

[tanugi@cca001 v19.0.0-v1]$ ls -ltr rerun/run2.2i-coadd-wfd-dr6-v1-grizy/deepCoadd/i/4852/1,5.fits
-rw-rw-r-- 1 descprod lsst 210792960 May 7 2020 rerun/run2.2i-coadd-wfd-dr6-v1-grizy/deepCoadd/i/4852/1,5.fitsSAVE
-rw-rw-r-- 1 descprod lsst 3476160 May 7 2020 rerun/run2.2i-coadd-wfd-dr6-v1-grizy/deepCoadd/i/4852/1,5_nImage.fitsSAVE
-rw-rw-r-- 1 descprod lsst 210816000 Dec 10 17:37 rerun/run2.2i-coadd-wfd-dr6-v1-grizy/deepCoadd/i/4852/1,5.fits
-rw-rw-r-- 1 descprod lsst 2799360 Dec 10 17:37 rerun/run2.2i-coadd-wfd-dr6-v1-grizy/deepCoadd/i/4852/1,5_nImage.fits
[tanugi@cca001 v19.0.0-v1]$ ls -ltr rerun/run2.2i-coadd-wfd-dr6-v1-grizy/deepCoadd-results/i/4852/1,5
total 519708
-rw-rw-r-- 1 descprod lsst 20160 May 7 2020 bkgd-i-4852-1,5.fitsSAVE
-rw-rw-r-- 1 descprod lsst 8994240 May 7 2020 det-i-4852-1,5.fitsSAVE
-rw-rw-r-- 1 descprod lsst 211633920 May 7 2020 calexp-i-4852-1,5.fitsSAVE
-rw-rw-r-- 1 descprod lsst 20160 Dec 10 17:52 bkgd-i-4852-1,5.fits
-rw-rw-r-- 1 descprod lsst 9048960 Dec 10 17:52 det-i-4852-1,5.fits
-rw-rw-r-- 1 descprod lsst 211645440 Dec 10 17:52 calexp-i-4852-1,5.fits

So the new files are different....

erykoff commented 3 years ago

Can you shoot the deepCoadd file over to nersc?

johannct commented 3 years ago

done

erykoff commented 3 years ago

So ... the new 1,5.fits that @johannct just processed looks perfectly fine in the mask plane.

erykoff commented 3 years ago

Here's what it looks like. So was this some sort of transient processing failure? How can we determine that? [image]

johannct commented 3 years ago

ok....... the only thing clear is that the dates are inconsistent: the warp timestamps are after the coadd timestamps. The log does not seem to help at all.... This is bad.
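A check like this could be automated for other patches. The following is only a sketch: it compares file modification times with the standard library, and the file names and timestamps in the demo are made up (throwaway files in a temporary directory), not the real warps or coadds.

```python
import os
import tempfile

def inputs_newer_than_output(output_path, input_paths):
    """Return the inputs whose modification time is later than the output's."""
    out_mtime = os.path.getmtime(output_path)
    return [p for p in input_paths if os.path.getmtime(p) > out_mtime]

# Demonstration on throwaway files with made-up timestamps.
with tempfile.TemporaryDirectory() as tmp:
    coadd = os.path.join(tmp, "coadd.fits")
    warp = os.path.join(tmp, "warp.fits")
    for path in (coadd, warp):
        open(path, "w").close()
    os.utime(coadd, (1_000_000, 1_000_000))  # coadd written first
    os.utime(warp, (2_000_000, 2_000_000))   # warp written later: suspicious
    suspicious = [os.path.basename(p)
                  for p in inputs_newer_than_output(coadd, [warp])]

print(suspicious)
```

Modification times can change when files are copied or restored, so a hit here is a prompt to look closer rather than proof that the coadd was built from different inputs.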

johannct commented 3 years ago

I'll move to multiband so it is running during the night CET. I will rename the current outputs as I did for coaddDriver

plaszczy commented 3 years ago

I missed the conclusion (if any). was it a system glitch?

johannct commented 3 years ago

Looks like. Of an unknown kind. @heather999 1,5 is ready at CC for pickup.

plaszczy commented 3 years ago

I checked all the patches, it seems only this one was affected. can we close the issue?

johannct commented 3 years ago

not before the reprocessed patch is analyzed :)

plaszczy commented 3 years ago

To start the year well, here is the famous before/after plot. Nice job. [images: bad_v1, good_v2]

heather999 commented 3 years ago

Are we willing to use the above as sufficient validation so we can move ahead and make a DR6-v2 release?

johannct commented 3 years ago

I think so

plaszczy commented 3 years ago

I'd like to run a few additional checks that I usually report in https://lsst.lal.in2p3.fr/lalwiki/LSS/Run22 but that I did not perform in v1 since I got stuck on this patch problem.

plaszczy commented 3 years ago

I (mildly) tortured the new data and can't find any obvious flaw, so on my side it's OK for release.