To-do list for correcting the imSim PSF:

- `AtmosphericPSF` class

We can probably reduce these data if necessary; the HSC processing is doing OK down to 0.5" (2.98 pixels) and certainly runs down to 0.4" (2.34 pixels), although the PSF size estimation is off by c. 0.3% at that point (this is a figure from Bob Armstrong). The different plots are different Lanczos interpolation kernels while warping the data -- we're thinking about how to do this more cleverly/better.
I don't know how well the code will handle the particular sorts of undersampling that imsim is producing. It is, of course, not hard to handle undersampled PSFs if you forward model (this is using my hacked-up PSFex, but piff does/will do better), but it doesn't help with resampling the data. We're not planning to run imcom, and there's no "multifit" code at present -- and it has real problems with complex blends; but that's a different albeit interesting problem.
@jchiang87, how are the FWHM values for the delivered images being measured? I noticed today that `galsim.Image.calculateFWHM()` has a bias towards smaller values when used with noisy images (see https://lsstc.slack.com/archives/C978LTJGN/p1547503567238900). We should make sure we understand any such potential biases of whatever algorithm we use when making comparisons to the target FWHMs.
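For reference, a minimal sketch of the kind of check behind that Slack thread, assuming a simple Gaussian profile (the sizes, flux, and noise levels here are illustrative, not the actual test):

```python
import galsim

# Draw a noiseless Gaussian star, then remeasure its FWHM as pixel noise
# is added; the measured value tends to drift low as the noise grows.
true_fwhm = 0.7  # arcsec
image = galsim.Gaussian(fwhm=true_fwhm).withFlux(1e4).drawImage(
    nx=100, ny=100, scale=0.1)
print('noiseless:', image.calculateFWHM())

rng = galsim.BaseDeviate(12345)
for sigma in (0.5, 2., 5., 10.):
    noisy = image.copy()
    noisy.addNoise(galsim.GaussianNoise(rng, sigma=sigma))
    print(sigma, noisy.calculateFWHM())
```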
Here's the function I'm using:

```python
import time
import logging

import galsim
import desc.imsim
# Kolmogorov_and_Gaussian_PSF is assumed importable from desc.imsim;
# adjust this import to wherever the class lives in your checkout.
from desc.imsim import Kolmogorov_and_Gaussian_PSF

logging.basicConfig()
logger = logging.getLogger('calculateFWHM')
logger.setLevel(logging.INFO)

def calculateFWHM(visit, band, rawSeeing, airmass, scale=0.02, npix=400,
                  flux=1e8, psf_type='atm', **kwds):
    # Seed the rng with the visit number so results are reproducible.
    rng = galsim.UniformDeviate(int(visit))
    if psf_type == 'atm':
        psf = desc.imsim.AtmosphericPSF(airmass=airmass, rawSeeing=rawSeeing,
                                        band=band, rng=rng, **kwds)
    elif psf_type == 'kg':
        psf = Kolmogorov_and_Gaussian_PSF(airmass, rawSeeing=rawSeeing,
                                          band=band)
    else:
        raise RuntimeError('unknown psf type: %s' % psf_type)
    # Photon-shoot a single bright star at the center of the stamp.
    star = psf.applyPSF(0, 0)
    star = star.withFlux(flux)
    image = galsim.Image(npix, npix, scale=scale)
    image = star.drawImage(method='phot', add_to_image=True, image=image,
                           maxN=int(1e7))
    #image.write('v{}-{}_{}_psf.fits'.format(visit, band, psf_type))
    #logger.info("%s: calling image.calculateFWHM" % time.time())
    fwhm = image.calculateFWHM()
    logger.info("%s %s %s" % (time.time(), int(visit), fwhm))
    return fwhm  # (returning the value; the original snippet only logged it)
```
with the default values as shown. I did convergence tests for `flux`, `npix`, and `scale` above, as well as for the `screen_scale` passed to the `AtmosphericPSF` initializer. The default values in the above code and in the current imSim code seemed fine. Any other checks that should be made?
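For concreteness, the convergence tests were loops of roughly this form, using the return value of `calculateFWHM` above (the visit, band, seeing, and airmass values below are placeholders):

```python
# Vary one parameter at a time and check that the measured FWHM has
# stabilized by the default value (flux=1e8, npix=400, scale=0.02).
for flux in (1e6, 1e7, 1e8, 1e9):
    print(flux, calculateFWHM(421694, 'r', rawSeeing=0.7, airmass=1.1,
                              flux=flux))
for scale in (0.04, 0.02, 0.01):
    print(scale, calculateFWHM(421694, 'r', rawSeeing=0.7, airmass=1.1,
                               scale=scale))
```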
Thanks. I think those convergence tests sound great. 10^8 is a factor of 100 more photons than I was seeing problems with, so a completely different regime.
Should we consider a 4th option?
If I am understanding the plots above correctly, the PSF is significantly wrong for all visits so far (even when they don't break the stack), so I doubt it will be worth the trouble to combine this initial 10% with the later corrected 90%.
Thanks @dkirkby . Your suggestion is similar to where my mind went, which is as follows:
Assuming that the sensor visits that’ve already been simulated are a representative subsampling of visits, I would think we should go with 1 for the near future. This is the “safest” in some sense, in that it doesn’t require us to immediately spend CPU time regenerating visits that we already have completed. Then, if we do want to have a slightly lengthier discussion about potential impacts of some of the early sensor visits having too-small PSF, it seems to me that we can do that if we choose option 1 and then later convert to 2 or 3 if the lengthier discussion leads to a conclusion that regenerating some/all of the earlier sensor visits is needed. But if the discussion leads us to conclude that regenerating them isn’t needed, then we won’t have spent some of our CPU time regenerating them before moving on to new visits.
As @dkirkby says it may depend on what fraction of the eventual total Run 2.1i dataset these early visits will turn out to be.
Regarding option 1 in particular, one bit of nuance that matters is that some of these sensor-visits will potentially be part of a visit that has not been completely simulated yet. In that case, we would be spending some time finishing the rest of the visit with the too-small PSF rather than having a mix of too-small and correct PSFs within a single visit (I assume). I am not immediately sure how much follow-up this would require. It would depend on how much of the Run2.1i dataset has the too-small-PSF problem, the number of objects in those visits, and the positions of those visits in the simulated region.
@villarrealas - how difficult would the bookkeeping be if we were to say that for incomplete visits, we will not (at first) go back and finish the rest of the sensors in that visit, so that we can decide later how we want to deal with those cases? Instead, we would start generating new visits with the updated imSim. That would be consistent with the spirit of @dkirkby's and my suggestion.
As you say, it would be useful to have an estimate of what this looks like in terms of how many partially complete visits there are.
I am attaching below the results comparing a full focal plane (visit 421694, with `finSeeing` = 0.77" -- note that `finSeeing` would be, approximately, an upper limit to the delivered seeing) before and after adding a 0.3" Gaussian to the "old" PSF (the original from Runs 2.0i and 2.1i).
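A minimal sketch of the change being compared, assuming the atmospheric PSF is available as a GalSim `GSObject` (the function name here is illustrative, not the actual imSim code):

```python
import galsim

def widen_psf(atm_psf, inst_fwhm=0.3):
    """Fold in the unmodeled instrumental contribution by convolving the
    atmospheric PSF with a round Gaussian; the FWHMs then add roughly in
    quadrature (exactly so only for Gaussian profiles)."""
    return galsim.Convolve(atm_psf, galsim.Gaussian(fwhm=inst_fwhm))
```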
As a summary: the size of the PSF increases (as expected) and the PSF ellipticity, |e|, decreases. The general pattern of the PSF is the same. @rmandelb @rmjarvis @jmeyers314 are there any other tests that you would like to see?
@rmandelb I don't think that is too difficult. I could do a quick search for checkpoint files to make a list of which visits to remove from our worklist and so anything from those detectors would just not be included.
Regarding the PSF comparisons that Javier posted above: Lynne Jones noted in the #desc-dm-dc2 channel that the `finSeeing` values in the minion_1016 db were not calculated using the most up-to-date formulae from System Engineering, and in addition were copied from the wrong column in the `Summary` table. `finSeeing` was copied from `FWHMeff`, but we want to use `FWHMgeom` (which also uses the older SE formulae). Using the current formulae, given the `rawSeeing` and `altitude` values for this visit, I get `FWHMgeom` = 0.68".
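For reference, a hedged sketch of the current SE formulae as I understand them (cf. SMTN-002); the per-component system-seeing numbers below are the opsim configuration values as I recall them, so treat this as illustrative rather than the authoritative calculation:

```python
import numpy as np

def fwhm_geom(rawSeeing, airmass, wavelength=500.):
    """rawSeeing: atmospheric FWHM at 500nm at zenith [arcsec]."""
    # System contribution (telescope, optical design, camera), scaled
    # with airmass to the 0.6 power.
    fwhm_sys = np.sqrt(0.25**2 + 0.08**2 + 0.30**2)*airmass**0.6
    # Atmospheric contribution, scaled with airmass and wavelength.
    fwhm_atm = rawSeeing*airmass**0.6*(500./wavelength)**0.3
    # Effective FWHM, then the geometric (direct-size) FWHM.
    fwhm_eff = 1.16*np.sqrt(fwhm_sys**2 + 1.04*fwhm_atm**2)
    return 0.822*fwhm_eff + 0.052
```

Evaluating `fwhm_geom` with this visit's `rawSeeing` and airmass is the comparison against the 0.68" quoted above.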
@jmeyers314 @rmandelb @rmjarvis Are you all happy with the PSF properties that Javier showed in his pdf document, or do we need to do more testing of and/or work on the imSim PSF? I'm wondering in particular about the ellipticity distributions on slide 5. When I picked visit 421694, I didn't realize that we wanted something within 10 deg of zenith for comparison with the LSST SRD specs. We can adjust the altitude and seeing as needed if we want to do another run.
Regarding the distributions on slide 5: you're right that the LSST SRD specs for PSF ellipticity are for a particular range of zenith angles, and this visit is nominally outside that range. However, as we've discussed elsewhere, these distributions within a single visit are not super meaningful on their own as a test of consistency with the LSST SRD (it's really the distribution across many independent realizations of the atmosphere that matters, so a single visit can only be somewhat indicative, not definitive). My feeling is that the single-visit test is more useful for simply checking that the PSF FWHM fix is working as expected.
@rmjarvis do you think that it's important to have multiple visits for a test of the PSF ellipticity distribution?
It'd be nice to see whisker plots from more exposures in the already-processed data (keeping in mind that the PSFs are too small). Are those available? Other than that though, I'm happy to proceed with the new PSF model.
@jmeyers314 would you want a plot for each visit, or the average over several visits?
I'd like to see the distribution of whisker plots; for as many as you can muster.
@jmeyers314 I am putting some below. Please let me know if you'd like to see even more.
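For anyone reproducing these, a sketch of how such whisker plots are typically constructed; the input arrays are hypothetical per-star measurements, not the actual pipeline outputs:

```python
import numpy as np
import matplotlib.pyplot as plt

def whisker_plot(x, y, e1, e2):
    """Headless quiver of PSF ellipticities: whisker length |e| at
    position angle 0.5*arctan2(e2, e1) (spin-2, hence no arrowheads)."""
    e = np.hypot(e1, e2)
    beta = 0.5*np.arctan2(e2, e1)
    plt.quiver(x, y, e*np.cos(beta), e*np.sin(beta),
               headwidth=0, headlength=0, headaxislength=0,
               pivot='middle')
    plt.gca().set_aspect('equal')
    plt.show()
```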
Thanks @fjaviersanchez
I'm curious what @rmandelb and @RobertLuptonTheGood think. To my eye, these look like the correlation length is maybe a bit too long.
We could modify that by increasing the altitude of the "ground" layer, though I don't know if making more modifications at this point is really a good idea. I could probably cook up some dedicated simulations to study this more but it would delay things a couple of days, and I know folks must be anxious to restart.
I agree with @jmeyers314, those look far too correlated to me (especially 221574 -- I'd have said the guiding was off in real life, or that it was a great atmosphere with lots of ground layer). I don't have an opinion on whether this matters for DC2.
I agree these are probably too smooth, which will make life easier for the PSF interpolation. But I also don't think we should modify anything more. We already knew that we were making some significant approximations in the PSF models, so it's not too surprising that things look a bit off. They aren't so far off that I think any aspect of our DC2 goals will be compromised.
For DC3, we can (and should) try to do a more in-depth investigation and try to match up with statistics from ComCam or similar data.
That works for me. I personally am satisfied to close this issue and move forward with production.
I'm still confused by the ellipticity distribution on page 5. It seems this isn't consistent with the SRD requirements. But, as Rachel noted this is just for one visit.
Were the distributions we decided were OK before averaged over many visits?
Yes, the plots where we decided the distributions were ok were made stacking several visits with low airmass (alt>80 deg) and seeing close to the nominal (0.7").
After chatting with @bxin about this, he recommended that we should try to match the `FWHMgeom` values in the opsim db tables. In this case, convolving with an additional Gaussian with FWHM 0.3" is not sufficient. Tuning the number by eye, a Gaussian with FWHM of 0.47" looks reasonable:
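Roughly, the by-eye tuning amounts to a comparison like the following sketch; the atmosphere-only profile is a Kolmogorov stand-in and the numbers are illustrative, not the actual imSim code:

```python
import galsim

target = 0.68  # FWHMgeom target for this visit [arcsec]
atm = galsim.Kolmogorov(fwhm=0.6)  # stand-in for the atmospheric PSF
for extra in (0.3, 0.35, 0.4, 0.47):
    psf = galsim.Convolve(atm, galsim.Gaussian(fwhm=extra))
    fwhm = psf.drawImage(scale=0.02).calculateFWHM()
    print(extra, fwhm, target)
```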
Unless I hear otherwise, I'll update imSim to use 0.47" by default.
Do we really expect to never get FWHM < 0.47? That seems to me like a somewhat high lower limit, but I don't have any actual knowledge about what we expect.
0.47 will roundify the PSF quite a bit more I suspect. Maybe use something like 0.35 or 0.4 to strike a balance between size goals and ellipticity goals? Or else, turn up the atmospheric or optical perturbations instead to inflate the size but without corresponding roundification?
@rmjarvis , @rmandelb , do you have any thoughts regarding the compromise between PSF size and ellipticity goals?
I guess one way to decide may be to look at a few test focal planes of stars only at different seeings for each of FWHM = (0.35, 0.4, 0.47)?
Not particularly. 0.47 sounds too large for a lower limit, but I don't know that we'll actually know the real value until we have real data. Similarly, we don't really know what the expected ellipticity distribution is. It is entirely possible that the "requirements" are much more elliptical than we will get for the real PSF distribution.
So I don't know what the best choice is for these for DC2. Probably the primary goal is to not produce unphysical data that DM cannot handle. OTOH, we do want to push close to the hardest limits of what the real data might produce. I suspect just picking any one of these proposed values won't be a bad choice.
I apologize for the delay in responding - I somehow missed this.
I largely agree that 0.47 sounds like too high of a lower limit, and clearly we need something >=0.3 so as to not produce something unphysical. There are too many other uncertainties to say more; I think we'd be OK going with anything in the 0.35-0.4 range. Could we pick a number in that range semi-arbitrarily, pick a few test focal planes strategically (e.g., ones that couldn't be processed before), simulate them, and make sure we're OK with proceeding?
Sure. I can run the three values that Josh identified, (0.35, 0.4, 0.47), for a few of the visits that failed from before, and we can have a look.
@jchiang87 Hi Jim, the test you describe above was done, right (I remember the discussion at Berkeley)? Do you want to repost the results here for completeness and just make a note what we decided in the end (0.4 if I remember correctly) and close this? Are we going to come back to this question in the future? (Since the conclusion was more like "we have to make a decision and this is a good compromise" rather than "this is clearly the one and only way to go") Thanks!
Javier presented the results of those runs at the XWG Computing group session at Berkeley. The relevant plots (starting on slide 7) show the distribution of ellipticities for the three Gaussian FWHM values. The rationale for choosing 0.4" was summarized succinctly by Josh in a discussion on Slack:
> Another problem a few of us in the room had with 0.47 arcsec is that this would then set a hard limit on the minimum possible PSF size, which goes against our experience on other significantly older 8m class telescopes, where 0.3 arcsec is not unheard of. Our understanding of the origin of the 0.47 arcsec number is that it’s derived from a seeing budget rather than a seeing prediction. So I think 0.4 is a reasonable value to choose.
From single frame processing of Run1.2i data (see #307), we found that the PSF used for generating the images for a number of u-band and g-band visits is under-sampled and is therefore causing problems for the PSF model used by the LSST Stack. This has been diagnosed at DM-16785 and in the #desc-dm-dc2 Slack channel as being due to the imSim PSF neglecting some instrumental contributions to the realized PSF that amount to a ~0.5 arcsec FWHM difference (added in quadrature) when compared to the "physical" PSF predicted by OpSim. The FWHM values for that "physical" PSF are computed for each visit, contained in the `FWHMgeom` column in the opsim db `Summary` table, and defined as `FWHMtot` in LSST Document-20160.

For Run2.1i, approximately 2900 visits have already been simulated, at least in part, using a PSF that doesn't account for the aforementioned instrumental contributions, and roughly 30% have seeing better than the best seeing (~0.5") in the `FWHMgeom` distribution. Here are plots comparing the realized imSim PSF FWHM values with the `FWHMgeom` values for the Run2.1i visits that have simulated sensor-visits:

Based on descriptions of the various instrumental contributions to the seeing made by the Systems Engineering team, it has been estimated that ~0.3" could be added in quadrature to the current imSim PSF to account for them. The effect of including that component is shown in the above plots.
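As a worked example of the quadrature bookkeeping (illustrative numbers; quadrature addition of FWHMs is exact only for Gaussian profiles):

```python
import numpy as np

# "Added in quadrature" means total_fwhm = sqrt(atm**2 + inst**2), e.g.
# a 0.5" atmosphere-only PSF with a 0.3" instrumental term delivers:
print(np.hypot(0.5, 0.3))   # ~0.58" total FWHM
```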
How should we proceed with the Run2.1i simulations? Clearly, we need to update the imSim PSF model to account for the previously un-modeled instrumental components, perhaps by including an additional 0.3" Gaussian, but what should we do with the existing data? Some options:
The consensus at the CI meeting today was that option 1 is preferred.