Document including the features/issues found in validation for the different runs

fjaviersanchez commented 6 years ago

I am opening this issue to track the progress in documenting the features and issues found in 1.1p, 1.2i/p and 2.0i. I will add a document (I still have to decide whether it will be a link in the README in the Documents folder or an actual document there).

This is the summary of features/issues found during the validation:

Run 1.1p: Issues with the background (too much structure) created strange features in the calexps. We thought that this prevented the input and output magnitudes from matching. The issue had to do with the optimization settings for the background (sky) photons. Extinction (both from the Milky Way and internal) was not applied to sources.
Run 1.2p: The background is now correct, which allowed us to find a mismatch between inputs and outputs. When we compared the 1.2p outputs with those of 1.2i, we found that the number of detected sources was much larger in 1.2p than in 1.2i. After some digging, we found that extinction was not being applied in PhoSim (this was fixed in PhoSim after that). The PSF size/ellipticity tests show that the PSF ellipticity is slightly larger than we think it should be (@rmandelb and @rmjarvis can elaborate more, also check #259). Photometry is correct if compared to the reference catalog (see note 1) but biased if compared with the truth catalog. Astrometry looks correct. Colors are wrong compared to protoDC2 due to bug while translating to instance catalogs.
Run 1.2i: Good photometric and astrometric qualities (comparing to the reference catalog). PSF was too round/nice. Biased photometry if comparing to truth catalog (see note 1). Colors are wrong compared to protoDC2 due to the aforementioned bug while translating to instance catalogs.
Run 2.0i: Good photometric and astrometric qualities comparing to reference catalog (no truth catalog yet). Increased ellipticity in PSF. Colors are still wrong compared to cosmoDC2 (we found the bug while running 2.0).
Note 1 (reference catalog vs truth catalog): The reference catalog was created directly from protoDC2/cosmoDC2 magnitudes, and this was used to get the magnitude zeropoints and calibrate the sources (in the DM-stack measurements). When we compare the measured magnitudes to the magnitudes in these reference catalogs, the distributions are centered at zero (except for the case of 1.1p); however, when compared to the truth catalog they look biased. This is because when we translate the magnitudes from protoDC2/cosmoDC2 to the instance catalogs we find the best-fit SED in the CatSim library, but the normalization chosen seems to produce some mismatch in color space (more details on this in @danielsf's DC2 presentations). Several solutions have been proposed and there's ongoing work to mitigate this.

Some other details if you are interested in the full story behind DC2 (DC2 collector's-edition):

Note 2 (reprocessings): After the first processing of 1.2i, we noticed that there were some astrometric and photometric biases; this turned out to be caused by the galaxies entering the reference catalogs. After that, we changed the settings for processing and reprocessed both 1.2i and 1.2p. The astrometry and photometry biases disappeared (comparing to the reference).
Check #259, #278 for validation in 1.2i/p and 2.0i. Also check the notebooks here if you want to take a look at some results/use these pieces of code for your analyses.

yymao commented 6 years ago

This note is great! Thanks, Javier!

rmjarvis commented 6 years ago

I think this is all correct. Thanks, Javi, for documenting it.

boutigny commented 6 years ago

Thanks Javier, this is very useful. I have one comment regarding the astrometry: I don't think that we changed anything to use only stars, and I don't think that it is even possible at the configuration level. A change has been applied for the photometry; this is the entry in the config file:

config.calibrate.photoCal.match.sourceSelection.unresolved.maximum=0.5

But for the astrometry I am pretty sure that we are still using stars and galaxies from the reference catalog. The only change, made by @danielsf at an early stage of the processing, was to limit the magnitude of the objects in the reference catalog to 23.

fjaviersanchez commented 6 years ago

@boutigny thanks a lot for the clarification!

cwwalter commented 6 years ago

Is there a reference for: "Colors are wrong compared to protoDC2 due to bug while translating to instance catalogs." so we can document it?

fjaviersanchez commented 6 years ago

I found this presentation and this other presentation. I think there's more documentation elsewhere but I can't remember where, @danielsf may have something.

cwwalter commented 6 years ago

Oh... I was thinking you were saying that this was a different problem that we solved with the top hat since it is listed in the 1.2 section.

I remembered we determined that with 2.0. But I missed your "Colors are still wrong compared to cosmoDC2 (we found the bug while running 2.0)." So got it now. Thanks.

fjaviersanchez commented 5 years ago

Here is a great notebook by @plaszczy illustrating some of the features in 1.2p. Thanks a lot for putting this together.

fjaviersanchez commented 5 years ago

Update (9/12): Preliminary tests on Run2.1i

Visit 1918527 r-band, Sky brightness: 20.43896 (r-band), Moon: 33 deg, Phase: 69% (100% full Moon)

Density: 10.2 galaxies/sq-arcmin. Depth: 23.5

Measured magnitude for objects used for photometric calibration (calib_photometry_used==True) and matched object in the reference catalog:

Kron photometry (all objects):

Astrometric residuals for objects with calib_astrometry_used==True:

Astrometry (all objects):

Detection efficiency for stars with |mag_PSF - mag_true| < 0.1 and detected within 5 pixels of their true centroid:

PSF size:

Checking for brighter-fatter:

Depth distribution across the focal plane:

PSF ellipticity distribution across the focal plane (median and 95th percentile marked with broken lines):

I have to repeat these checks with the results for sip order = 2

rmjarvis commented 5 years ago

Astrometric residuals for objects with calib_astrometry_used==True:

Question about this test: Are you applying the cos(dec) factor to the RA residuals?

The distance on the sky between two objects with the same Dec, but a small Delta RA is (Delta RA) * cos(Dec). So if you are currently just reporting the Delta RA value, then it's probably worth multiplying by cos(Dec) to turn this into an angle on the sky. Otherwise, exposures closer to the poles will seem to have worse astrometry in the RA direction, when it's just a measurement artifact of using RA, Dec coordinates.
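A minimal sketch of the correction described above, assuming RA/Dec are in degrees and that the residual is wanted in milliarcseconds. The function name and the example coordinates are hypothetical and are not taken from the validation notebooks.

```python
import math

def ra_residual_on_sky(ra_meas_deg, ra_ref_deg, dec_deg):
    """Return the RA residual as an on-sky angle in milliarcseconds."""
    # Wrap the raw difference into [-180, 180) so pairs straddling RA = 0 work.
    delta_ra = (ra_meas_deg - ra_ref_deg + 180.0) % 360.0 - 180.0
    # Multiply by cos(Dec) to turn a coordinate difference into an angle on the sky.
    on_sky_deg = delta_ra * math.cos(math.radians(dec_deg))
    return on_sky_deg * 3600.0 * 1000.0  # degrees -> mas

# The same Delta RA corresponds to half the on-sky angle at Dec = -60 deg.
equator = ra_residual_on_sky(10.00001, 10.0, 0.0)
high_dec = ra_residual_on_sky(10.00001, 10.0, -60.0)
```

Without the cos(Dec) factor both cases would report the same residual, which is the artifact the comment warns about.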

fjaviersanchez commented 5 years ago

Thanks a lot, @rmjarvis. I wasn't including this. However, I thought I needed to multiply by sin(dec)?

This is what I get using sin(dec):

And this is using cos(dec):

rmjarvis commented 5 years ago

No, it should be cos(dec). At the equator, there is no effect. At the poles, small angles have large delta RA.

It's certainly possible that the astrometry residuals aren't the same in RA and Dec. So I don't think it's indicative of a problem that the blue and orange aren't precisely on top of each other even after applying the cos(dec) factor.

fjaviersanchez commented 5 years ago

Update 12/10:

I also checked the u-band pre-2.1i images and they look good to me:

Number density (~6.8 galaxies/sq-arcmin). Photometry looks good, astrometry is almost unbiased (RA shows a median difference of 1.5 mas with respect to the reference catalog but this is probably due to statistics and the matching method).

I am showing the same plots as above:

Photometry for calibration objects:

As a function of magnitude:

Astrometry for calibration objects (the plots above and below contain the factor cos(dec)):

For all objects:

PSF size:

Looking for BF effect:

Detection efficiency for stars:

Depth:

PSF ellipticity:

One of the critical things to check is how the different types of objects (as rendered by imSim) interact. I'll keep working on this.

fjaviersanchez commented 5 years ago

Update 12/12:

I started checking the centroid file. I matched the objects in the output catalog to the closest neighbor in the centroid file with flags!=1 (flags=1 means skipped). This is a very crude approximation since the centroid files include more than one component per object; however, I assumed that the dominant component in the object will be the one whose centroid lies closest to the measured centroid.
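The matching described here could be sketched roughly as follows. This is a brute-force illustration, not the code used for the plots: the dictionary keys (`x`, `y`, `flags`) are hypothetical stand-ins for the centroid-file columns, and for real catalog sizes a spatial index such as a k-d tree would be the usual choice.

```python
import math

def match_to_centroids(detections, centroids):
    """For each detected (x, y), return (closest non-skipped entry, distance).

    `centroids` is a list of dicts with hypothetical keys 'x', 'y', 'flags'.
    """
    # Drop entries with flags == 1, which the comment above says means "skipped".
    usable = [c for c in centroids if c["flags"] != 1]
    matches = []
    for (x, y) in detections:
        best = min(usable, key=lambda c: math.hypot(c["x"] - x, c["y"] - y))
        matches.append((best, math.hypot(best["x"] - x, best["y"] - y)))
    return matches

# Toy example: the skipped entry is closer to the first detection but is ignored.
centroids = [
    {"x": 10.0, "y": 10.0, "flags": 0},
    {"x": 10.5, "y": 10.2, "flags": 1},
    {"x": 50.0, "y": 60.0, "flags": 6},
]
matches = match_to_centroids([(10.4, 10.1), (49.0, 61.0)], centroids)
```

Keeping the match distance alongside the entry makes it easy to apply a maximum-separation cut afterwards, which helps reject chance matches.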

With that in mind, I tried to check some quantities classifying by flags. The flags are the following:

0 -> No flags
2 -> Simple SED
4 -> No silicon
8 -> FFT rendered

If an object has more than one flag, they are added up, e.g., an object that doesn't use the silicon model and has a simple SED has flags=6. An object that has been FFT rendered and doesn't use the silicon model has flags=12, and so on (there are very few of these because these are the brightest).
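Since the flags add up as powers of two, they can be decoded with bitwise tests. The sketch below uses the values listed above plus flags=1 for skipped objects (mentioned earlier in this comment); the dictionary and function names are made up for illustration.

```python
# Bit values as listed in the thread; the names here are illustrative only.
RENDER_FLAGS = {1: "skipped", 2: "simple SED", 4: "no silicon", 8: "FFT rendered"}

def decode_flags(flags):
    """Return the list of flag names set in an integer flags value."""
    names = [name for bit, name in RENDER_FLAGS.items() if flags & bit]
    return names or ["no flags"]

print(decode_flags(6))   # simple SED + no silicon
print(decode_flags(12))  # no silicon + FFT rendered
```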

I checked the magnitude distribution for the available r-band visit:

This looks reasonable to me, the objects with flags=0 or 2 are overall brighter than those that use flags=6 (simple_sed+no_silicon).

I also checked the ellipticity distributions and they all look correct:

These are the HSM PSF-corrected regauss e1,e2 from the stack. @rmjarvis @cwwalter is there anything else you'd like to see?

I also tried to match every truth component but the histograms were dominated by the "by-chance" matches (since the overall density in the instance catalogs is way higher than the density in the output single-visit catalogs). I can try some smarter ways to match in order to get more accurate results as well but I think this was a good quick and dirty approximation to get an overall view. Comments and requests are welcome 🙂

cwwalter commented 5 years ago

Thanks Javier!

Can you explain the magnitude plot more?

Is the y-axis a fractional amount of something? I can't understand why

The numbers of objects are the same for all classes
The distribution for "normal" goes all the way down to 26 and the distribution for dim simple sed objects goes all the way out to 16. I must not be understanding something about the plotting...

A couple of other ideas:

The object IDs are actually combined bit patterns with the lower 10 bits being the object type and the upper bits being the uniqueID. So using this info you can actually combine objects if you want.
Also it would be interesting to make a "truth" plot. Using only the centroid file you can plot the number of each class as a function of magnitude (including skipped ones).
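As a rough illustration of the first idea, assuming the IDs are plain integers with the layout described (lower 10 bits for the object type, the remaining upper bits for the uniqueID), components could be grouped like this. The type codes in the example are hypothetical placeholders, not real catalog values.

```python
TYPE_BITS = 10
TYPE_MASK = (1 << TYPE_BITS) - 1  # 0x3FF, selects the lower 10 bits

def split_object_id(object_id):
    """Split a combined ID into (unique_id, object_type)."""
    return object_id >> TYPE_BITS, object_id & TYPE_MASK

def group_components(object_ids):
    """Group component object types under their shared uniqueID."""
    groups = {}
    for oid in object_ids:
        unique_id, obj_type = split_object_id(oid)
        groups.setdefault(unique_id, []).append(obj_type)
    return groups

# Two components of galaxy 42 plus one unrelated object (type codes are made up).
ids = [(42 << 10) | 97, (42 << 10) | 107, (7 << 10) | 4]
groups = group_components(ids)
```

Once components are grouped, their fluxes can be summed per uniqueID before comparing with a detected object.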

cwwalter commented 5 years ago

For the plots from yesterday, is the BF as large as we saw it before? (There was a small change in the GalSim 2.1's diffusion model so would like to see there was no big change).

fjaviersanchez commented 5 years ago

Is the y-axis a fractional amount of something? I can't understand why

Sorry, I should have specified that. I normalized all the histograms to check that the distributions had reasonable shapes.

The distribution for "normal" goes all the way down to 26 and the distribution for dim simple sed objects goes all the way out to 16. I must not be understanding something about the plotting...

The objects at 26 may be bad matches, I can check the differences in total flux as well just to make sure. I would say also the same for the objects at mag=16. I assume that a good fraction of them should be faint objects that aren't detected and overlap with some brighter object but I have to check that.

The object IDs are actually combined bit patterns with the lower 10 bits being the object type and the upper bits being the uniqueID. So using this info you can actually combine objects if you want.

I was thinking of using these but the flags may be different for the different components and I didn't know which approach would be better (I can probably add and/or multiply the flags of each component to get a combined flag)

Also it would be interesting to make a "truth" plot. Using only the centroid file you can plot the number of each class as a function of magnitude (including skipped ones)

Thanks! That's a good idea!

For the plots from yesterday, is the BF as large as we saw it before? (There was a small change in the GalSim 2.1's diffusion model so would like to see there was no big change).

I think that these same exact visits are not available for run 2.0i. I would say that the effect is smaller than in run 1.2i though (the PSF for that particular visit was slightly larger but comparable):

fjaviersanchez commented 5 years ago

To be more concise.

Run 1.2i:

Run 2.1i

cwwalter commented 5 years ago

Yes, that's what I was afraid of. It looks like BF is missing, right? Or am I comparing the wrong plots? I see it clearly in 1.2 and not at all in 2.1...

cwwalter commented 5 years ago

Sorry, I should have specified that. I normalized all the histograms to check that the distributions had reasonable shapes.

OK thanks. I'm sort of surprised. Maybe the matching isn't really working but these distributions basically look all the same for all the different classes. I would have expected all the dim ones to be on the low end. The "truth" plot should show that I think. I don't expect to tag as many "dim" objects at 20 as 24.

rmjarvis commented 5 years ago

Hm. That B/F plot does look like a problem. The other thing that doesn't seem right is the magnitude histogram. The flag=2 (and 6) objects should all be at mag > ~26 I believe. Could most of these be bad matches? (Similar point to what Chris just posted.)

rmjarvis commented 5 years ago

Also flag=12 should all be mag < ~15 I think.

cwwalter commented 5 years ago

@jchiang87 can we confirm that the flags for the test were correct and --disable_sensor_model wasn't turned on or something?

@fjaviersanchez I think in addition to the shape we should see this with absolute normalization. Maybe something is wrong in our logic and we are accidentally now turning off the sensor model for most objects with our new speed up code.

The "truth" plot should also tell us what the algorithm thinks it did as a function of magnitude.

jchiang87 commented 5 years ago

@jchiang87 can we confirm that the flags for the test were correct and --disable_sensor_model wasn't turned on or something?

No way was --disable_sensor_model turned on. These data have tree rings; plus FFT rendering would not occur if it were. As for confirming the flags are correct: There should be enough info in the centroid files to determine if the reported flags are at least consistent, i.e., flag=12 will have large Flux values, etc.; flag=2 or 6 should have small Flux values. Checking one of the files, this seems to be the case. Plus, the code looks correct.

jchiang87 commented 5 years ago

It seems to me that if you want to determine the distribution of object magnitudes (or fluxes) as a function of the rendering flag value, you can do that with just the centroid file, i.e., you don't need to match against an external catalog or use any outputs from the DM stack processing.

cwwalter commented 5 years ago

OK that's what I figured.. the plot showing the categories might just be bad matching. The truth plot will show us what we expect but it sounds like you already peeked at the files and it looks reasonable.

If that is all true; the BF is harder to understand. There was this diffusion change but (assuming it is correct) I don't think this should be a large effect.

Javier, how exactly are you calculating deltaT again? Maybe it is a measurement issue?

cwwalter commented 5 years ago

It seems to me that if you want to determine the distribution of object magnitudes (or fluxes) as a function of the rendering flag value, you can do that with just the centroid file, i.e., you don't need to match against an external catalog or use any outputs from the DM stack processing.

Right... that is what I am talking about with the "truth plot". Just use the centroid file itself.

jchiang87 commented 5 years ago

Here is one of the plots I described for a sensor picked at random: centroid_mags_v1918527_r43_s01 The magnitudes were computed using the gain (=0.7), the zero point from the calexp (32.17), and the Flux values from the centroid file.
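The exact conversion is not spelled out in the comment, so the following is only a plausible sketch. It assumes the centroid-file Flux values are in electrons, that the calexp zero point refers to counts in ADU, and that the gain is in e-/ADU; the function name is hypothetical.

```python
import math

GAIN = 0.7          # e-/ADU, as quoted for this Run2.1i visit
ZERO_POINT = 32.17  # calexp zero point quoted above

def centroid_flux_to_mag(flux_electrons, gain=GAIN, zero_point=ZERO_POINT):
    """Convert a centroid-file flux to a magnitude (assumed convention)."""
    # Assumption: Flux is in electrons, so divide by the gain to get ADU.
    flux_adu = flux_electrons / gain
    return zero_point - 2.5 * math.log10(flux_adu)

# Under these assumptions a component with 1e-3 e- lands near magnitude 39,
# in line with the very faint components seen in the plot.
faint_component_mag = centroid_flux_to_mag(1.0e-3)
```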

cwwalter commented 5 years ago

Nice!

That's more like I expected. In case people are wondering, the reason you see stuff down at 40 is that those are dim galaxy components. Disks and bulges and knots(?) are entered separately. Our cut is on the combined galaxy magnitude.

cwwalter commented 5 years ago

And, in the BF region all "normally handled" objects.. 🤔

cwwalter commented 5 years ago

@rmjarvis Does the GalSim test_sensor unit test check if BF is working as expected?

fjaviersanchez commented 5 years ago

Javier, how exactly are you calculating deltaT again? Maybe it is a measurement issue?

I compute T= Ixx+Iyy and deltaT = T - average(T[(mag>20) & (mag<22)]).
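Written out as code, that definition might look like the sketch below. It uses plain lists so that it is self-contained; the function and argument names are illustrative, and the reference range (20 < mag < 22) is the one quoted above.

```python
from statistics import mean

def delta_t(ixx, iyy, mag, mag_lo=20.0, mag_hi=22.0):
    """Return T - <T>, where <T> is averaged over mag_lo < mag < mag_hi."""
    # T is the trace of the second-moment matrix for each object.
    t = [a + b for a, b in zip(ixx, iyy)]
    # Reference size from the faint stars, where brighter-fatter is negligible.
    reference = mean(ti for ti, m in zip(t, mag) if mag_lo < m < mag_hi)
    return [ti - reference for ti in t]

# Toy values: two faint reference stars and one bright, slightly larger star.
deltas = delta_t([2.0, 2.1, 2.6], [2.0, 2.1, 2.6], [21.0, 21.5, 16.0])
```

A positive deltaT that grows towards bright magnitudes is the brighter-fatter signature being looked for. Note that `mean` raises an error if no object falls in the reference range.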

Thanks for the plots @jchiang87, I think the way you did it is probably the most relevant to check that these approximations worked properly. However, I wanted to check that this was translated correctly into the images and I think that I'll have to do something more clever with the matching to check that. In any case, I think I am overall happy with the results (modulo BF).

rmjarvis commented 5 years ago

Does the GalSim test_sensor unit test check if BF is working as expected?

Yes. To a point, but it does check that stars get bigger with the silicon sensor for high fluxes. Those all still pass after Craig's update. And he ran more involved validations that tested the slope of the effect to higher accuracy than the unit tests check it.

fjaviersanchez commented 5 years ago

Following up on BF I checked the visit proposed by @jchiang87 and I think that there's no issue with BF being smaller but something (saturation because of the brighter background level?) is kicking in earlier (I also added more statistics):

And zooming in:

jchiang87 commented 5 years ago

The shift to higher magnitudes is just a consequence of using different gains for the two datasets. In Run2.1i, we have gain = 0.7 e-/adu, while for Run2.0i, we had gain = 1.7 e-/adu. That gives a shift in magnitude of -2.5*log10(0.7/1.7) = 0.96.

jchiang87 commented 5 years ago

Thinking about it some more, that gain change explanation for the shift in the saturation onset can't be right. The zero points also shift by that same amount, so the saturation should kick in at the same magnitude, which makes sense since an object's magnitude shouldn't depend on the gain either.
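The cancellation can be checked numerically. In this sketch the electron count and the Run2.0i zero point are arbitrary placeholder values; only the two gains come from the thread, and the zero point is assumed to refer to counts in ADU.

```python
import math

def calibrated_mag(electrons, gain, zero_point_adu):
    """Magnitude from electrons, a gain in e-/ADU and an ADU-based zero point."""
    adu = electrons / gain
    return zero_point_adu - 2.5 * math.log10(adu)

electrons = 5.0e4                      # arbitrary source flux in electrons
zp_20i = 30.0                          # hypothetical zero point for gain = 1.7
shift = -2.5 * math.log10(0.7 / 1.7)   # about 0.96 mag
zp_21i = zp_20i + shift                # the zero point moves by the same amount

mag_20i = calibrated_mag(electrons, 1.7, zp_20i)
mag_21i = calibrated_mag(electrons, 0.7, zp_21i)
```

Both magnitudes come out identical, so under these assumptions a fixed number of electrons maps to the same calibrated magnitude regardless of the gain.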

rmjarvis commented 5 years ago

How are we defining saturation? In terms of electrons or ADU? It should be based on the number of electrons in a pixel. But maybe the saturation is defined in terms of ADU, and we didn't change that when we changed the gain?

RobertLuptonTheGood commented 5 years ago

We quote saturation levels in ADU, not electrons, as we expect that the gain is more stable than our knowledge of it.

jchiang87 commented 5 years ago

In the simulation, the saturation level is set at 1e5 e-/pixel. So the bleed trails are simulated before the gains are applied.
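For reference, the same full-well level corresponds to quite different ADU values under the two gains quoted earlier. This is plain arithmetic and says nothing about which threshold the processing actually applied; the function name is illustrative.

```python
FULL_WELL_E = 1.0e5  # e-/pixel, saturation level in the simulation

def saturation_adu(gain_e_per_adu, full_well_e=FULL_WELL_E):
    """Convert the full-well level in electrons to ADU for a given gain."""
    return full_well_e / gain_e_per_adu

sat_21i = saturation_adu(0.7)  # Run2.1i gain
sat_20i = saturation_adu(1.7)  # Run2.0i gain
print(round(sat_21i), round(sat_20i))
```

A saturation threshold expressed in ADU would therefore need to change by a factor of 1.7/0.7 between the two runs to represent the same number of electrons.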

fjaviersanchez commented 5 years ago

The objects with flags==12 seem to actually be the brightest in the exposure. Here I am showing an image with markers on all objects with flags==12:

I am having some trouble using base_SdssShape_x/y to match to x,y in the centroid files though. I'll follow up tomorrow.

cwwalter commented 5 years ago

When I was thinking about this yesterday, I looked at Jim's plot above and saw the flagid=12 guys all had magnitudes of less than 15. But we see the saturation kicking in earlier than that. So, I was thinking that couldn't be it.

I wonder if I either misinterpreted the plot or if it is possible something about the FFT PSF for the very brightest objects is causing the photometry to measure them as dimmer objects.

rmjarvis commented 5 years ago

I can't think of any way that would happen. Javi mentioned a brighter sky background, but that also doesn't make much sense, since it would have to be much brighter for it to move the saturation point a full magnitude.

I think the best bet is probably to look at the images of an object that is saturated in 2.1 and not in 2.0. Maybe that will give a clue. @fjaviersanchez Can you point me to where these files are living? I can try to dig into it.

cwwalter commented 5 years ago

Javi mentioned a brighter sky background

From a discussion last night, with the moon up Jim said the background for this exposure was ~1900 e-/pixel. Typical in the r-band is more like 800.

jchiang87 commented 5 years ago

We should also verify that an eimage with saturated stars that's converted to different raw files using different gain values shows that the saturation onset, as measured by the Stack code, is independent of the gain used. If it's not, then I think the issue is not in the sims.

craiglagegit commented 5 years ago

Hi, weighing in here. There is no allowance for gain in the BF as I have implemented it in GalSim. I always understood that the images I was dealing with were in electrons. So if this isn't the case, in other words if the images my code is being presented with are in ADU, then the gain change could explain what has changed. How was the gain change accounted for? Was the flux of objects at a given magnitude in ADU reduced so that you got the same number of electrons before and after the gain change?

fjaviersanchez commented 5 years ago

@cwwalter @jchiang87 @rmjarvis I discovered why the matching was so bad that it looked like everything was a "by chance" match. The x,y directions in the centroid file are flipped with respect to the DM-processed catalogs: x in the centroid file matches base_SdssShape_y and y matches base_SdssShape_x. In the example below I am showing with red markers all the objects with flags==12 and all the detected objects with saturated pixels (flipping x,y) and they seem to match well:
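One quick way to diagnose this kind of axis flip is to compare the typical nearest-neighbor separation with and without swapping the detection coordinates. The sketch below uses made-up pixel positions and a hypothetical helper name; it is not the code used for the figure.

```python
import math
from statistics import median

def median_nn_distance(detections, centroids, swap_xy=False):
    """Median distance from each detection to its nearest centroid entry."""
    seps = []
    for (x, y) in detections:
        if swap_xy:
            # Treat base_SdssShape_y as centroid-file x and vice versa.
            x, y = y, x
        seps.append(min(math.hypot(cx - x, cy - y) for (cx, cy) in centroids))
    return median(seps)

# Centroid-file (x, y) and detections stored as (base_SdssShape_x, base_SdssShape_y).
cents = [(100.0, 2000.0), (1500.0, 300.0), (3000.0, 3500.0)]
dets = [(2000.2, 100.1), (300.3, 1499.8), (3500.1, 2999.9)]

as_is = median_nn_distance(dets, cents)
swapped = median_nn_distance(dets, cents, swap_xy=True)
```

In this toy case the swapped separation is a fraction of a pixel while the unswapped one is hundreds of pixels, the same pattern as the chance matches described above.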

rmjarvis commented 5 years ago

I always understood that the images I was dealing with were in electrons.

That is the case. We build the image in electrons, then at the very end, one step in Jim's afterburner code simulating the readout process is to apply the gain. So that's not an issue here.

fjaviersanchez commented 5 years ago

@rmjarvis the files live here: /global/cscratch1/sd/jchiang8/desc/Run2.1i/focal_plane_tests/output/rerun/jchiang/w_2018_48/sip_order_2

rmjarvis commented 5 years ago

Thanks. Can you also point me to the corresponding 2.0i images? I want to compare the two for the same exposure if possible to see if I can see what relevant difference there might be.

jchiang87 commented 5 years ago

Unfortunately, we don't have a matching exposure in the Run2.0i data.

LSSTDESC / DC2-production

Document including the features/issues found in validation for the different runs #291