DC2 time domain simulation design: area, cadence, overabundance

drphilmarshall commented 6 years ago

What should the time domain part of the DC2 simulation look like? Let's use this top comment to develop a design for the area, cadence, and overabundance of time domain cosmography targets.

A significant issue that emerged in the discussion at Stony Brook is the cost of the difference image analysis (DIA): I suggest we first define the area of sky that will have DIA run on it, and the corresponding (over-)abundance of type Ia SNe, other transients, lensed quasars, and lensed SNe. Then, we specify the cadence we'd like to see in this DIA region. (I'm assuming we can make like Twinkles and use the same sky region for both SN and SL work.) We could think of dividing the DIA region into one or more single-field DDFs with higher than DDF frequency time sampling (to allow down-sampling of the light curves), and a potentially larger sub-region with a mix of baseline cadence and rolling cadence. For the non-DDF ("WFD") DIA region we said 4 years baseline, 6 years rolling in the SN wishlist. Here's a completion exercise for us then:

DIA region area: AA sq deg
DIA region abundance:
- Type Ia SNe: BB per sq deg
- Other transients: CC per sq deg
- Lensed AGN: DD per sq deg
- Lensed SNe: EE per sq deg
UltraDDF area: FF sq deg
WFD cadence: Years 1-4 baseline (minion1016), years 5-10 rolling (which simulation?).

Suggestions for the values of the above variables welcome in the comments below!

Could a single 10 sq deg LSST UltraDDF field be big enough for all our SN and time delay lens needs?

Last question: do we still want to populate the non-DIA area (that won't have DIAForcedSource lightcurves) with SNe and lenses, at the natural density?

For reference, the evolving matrix of DC2 properties is in the DC2 wishlist spreadsheet - we can update this with our plan once we are happy with it.

Asking these questions initially of @reneehlozek @rbiswas4 @saurabhwjha @tcollett @dannygoldstein @jbkalmbach but please do @mention others who would like to help design the SN and SL fields - especially those people who are planning to write papers with the DC2 dataset!

reneehlozek commented 6 years ago

Hi @drphilmarshall

Apologies for the delay. (also cc @cwwalter since I also dropped the DC2 ball for him recently!)

@rbiswas4 and @saurabhwjha have been discussing this with me, and we came to the conclusion that having a 10 sq degree ultraDDF field observed every night it is visible and in all filters would allow us to test our coadds, and also that the difference imaging is stable at different depths etc., and would allow us to look at different dither patterns.

Ideally, we'd like a full survey on that field, but if we have to cut we'd cut time rather than size of the field for the above reasons.

I'd be interested in what the SL folks think @dannygoldstein

drphilmarshall commented 6 years ago

Thanks Renee! Quick questions while others chew on the 10 sq deg ultraDDF concept:

How many SNe (and of which types) would you like to sprinkle into this field? The natural abundance, or some artificially high incidence to get your statistics up, like we did in Twinkles?
I like the way the nightly observations will allow you to define a number of different reference images and light curves via multiple processing runs of the same DIA pipeline (but given a different visit list each time). What do you propose for the visits within a single night? Will you want the minion_1016 baseline cadence visit list to be one of your possible campaigns? And how many different "runs" do you imagine wanting to do? (Watch out: they are not free :-)

dannygoldstein commented 6 years ago

Hi all,

A few thoughts:

I think only doing DIA on an UltraDDF could work for the SL group's needs if and only if WFD is recoverable as a subset of the observations. I was chatting with @rbiswas4 about this, and basically the idea would be to first observe the UltraDDF at the nominal WFD cadence, and then add in "deep" observations after the fact to attain the desired depth for the SNWG. What do people think about this idea? I know there is some flexibility on the DDF cadence for DC2, so this could be a way for everyone to get what they need.

The reason SL has this requirement is that nearly all glSNe and lensed AGNs will come from WFD. So to accomplish some of our DC2 paper goals (detailed cosmology forecasting with lensed SNe and AGNs) we need WFD-like observations.

@tcollett can speak more to the number of lensed AGNs that should be sprinkled into DC2, but for glSNe it would be good to have ~1000 detected ones (z ~< 2, i ~< 24) to play with. Over 10 years, we should expect 1000 glSNe Ia detected over the LSST footprint in the real survey, so this amounts to an increase of the glSN Ia rate by a factor of ~2000 -- is this reasonable / doable in 10sqdeg? What are the SN Ia folks doing as far as rate amplification @reneehlozek @rbiswas4 ? Perhaps @tcollett can comment on the expected number of galaxy-galaxy lenses in 10sqdeg -- this number sets an upper limit for the number of glSNe we can sprinkle.

A thought as I type this: is increasing the galaxy density for the DIA region an option @cwwalter ?

It would be good to see both the rolling cadence and the nominal cadence if WFD will be a subset of the UltraDDF observations, I know @shsuyu is interested in looking at the effects of this on the glSN science case.

drphilmarshall commented 6 years ago

Thanks, @dannygoldstein ! I think a (roughly) 10 sq deg UltraDDF patch with WFD, rolling and DDF cadence sounds like an excellent plan. In Twinkles (~100 sq arcmin) we sprinkled in ~100 lens systems, so at the same density we'd be looking at ~10^4 systems in the UltraDDF. We should probably try to avoid collisions between strong lenses and supernova hosts - did we keep them separate in Twinkles, @rbiswas4 ? The natural rate of galaxy-galaxy strong lenses is about 10 per sq degree, but in the UltraDDF we can simulate an over-abundance pretty safely I think (although we might want to place the field at the edge of the survey area (or perhaps off to one side) so as not to interfere with the WL projects.

Talking of field placement: do we expect the DDF observations to have large dithers, @humnaawan ? If not, maybe we should big-dither the WFD and rolling cadence visits, but not the DDF ones.

drphilmarshall commented 6 years ago

BTW @reneehlozek are you saying that the SN group does not need SN to be added to the main (300 sq deg) DC2 survey area, and that the ~10^4 systems in an over-abundant UltraDDF region would give you enough objects? Should we put them in anyway, even though the difference imaging analysis would not need to be run?

rbiswas4 commented 6 years ago

Sorry about the delay on this. I have been caught up a bit in other stuff. Just two quick points here:

how many pointings are we talking about? (ie. How ultra can ultraDDF be?). And what does rolling cadence mean in this context ... is everything not covered and constrained by the number of pointings we can have?
Secondly about dithers: DDF in LSST must have very small positional dithers, but rotational dithers are possible.

egawiser commented 6 years ago

+1 to what Rahul just said - we don't want to use large translational dithers in DDFs (simulated or real) unless we want to make them cover 4X the area at a depth not much greater than WFD. But rotational dithers are optional - unless somebody doesn't want those, they could be included in the same way as for DC1, that is randomly shifting the instrument rotator at each filter change, since we know that can be done without significantly affecting the LSST observing efficiency or overusing the rotator.

drphilmarshall commented 6 years ago

Sounds good to me, Eric.

Rahul: I think "rolling cadence" here means we choose a sky position (or several) in the WFD area from a rolling cadence simulation (or an emulation of one) and include those visits in the list to be simulated. We'd then have a number of labels attached to the visits: WFD, RC1, RC2, DDF, etc, so that we could select out each cadence and process it. Do you foresee any problems with this approach? Will it give you useful and interesting light curves?

drphilmarshall commented 6 years ago

OK, it sounds like both SL and SN will be happy with a basic plan of a 10 sq deg UltraDDF field, with no big dithers but realistic rotational dithering. Cadence will be defined by the SN group but the visit list must include a realistic set of WFD visits, and then whatever other visits are needed in order to explore both DDF and rolling cadence schemes. I'll add this to the DC2 plan doc on overleaf.

@rbiswas4 @reneehlozek If you do all your light curve analysis in the over-abundant DC2 uDDF area, and we do not difference the main survey images, do you care whether the main survey contains supernovae or not? We should probably include them, at the natural rate, for completeness - and to show that we can - but I wondered if you would mind if we didn't manage to do this. Basically, do you have any projects in mind that do not need light curves?

Likewise, do you care what the cadence is in the main survey area? Baseline WFD, or some rolling cadence? (That's a question we'll need to ask the other groups too.)

drphilmarshall commented 6 years ago

One more question: what other time-variable objects do you want in the uDDF area? @danielsf tells me that the following

have already been implemented in CatSim and can be dropped into the DC2 simulations with essentially no work:

Rahul's Type 1a Supernova model (based on sncosmo's SALT-2 models)

Mdwarf flares

RR Lyrae

AGN (based on the damped random walk model of Macleod et al)

"Main sequence variability" based on Kepler (basically, we took all of the available Kepler light curves and used color-magnitude matching to assign them to the main sequence stars in our model Milky Way)

My guess is you'd like all of these turned on - and maybe some non-Ia SNe added as well (although I'm guessing that will take work). I understand the relative abundance of the different populations is important, for the machine learning training. Interested to hear what you think the DC2 design should be.

dannygoldstein commented 6 years ago

Hi @drphilmarshall, for convenience here is a summary of current SLWG responses to relevant questions from the beginning of this thread. The abundances provided are lower limits.

DIA region area: 10 sq deg (UltraDDF only)
DIA region abundance:
    Lensed AGN: 100 per sq deg (minimum)
    Lensed SNe: 100 per sq deg (minimum)
UltraDDF area: 10 sq deg
WFD cadence: Years 1-4 baseline (minion1016), years 5-10 rolling

Would be good to get both lensed SNe Ia and CCSNe -- @rbiswas4 is your SN model the only one that catsim can interact with? If so how hard is it to switch out the backend sncosmo model from SALT2 to a core collapse template?

rbiswas4 commented 6 years ago

is your SN model the only one that catsim can interact with? If so how hard is it to switch out the backend sncosmo model from SALT2 to a core collapse template? @dannygoldstein

This is something that I can so (and will have to do at some point). So, if you have a good use case, before DC2 is the time.

The issues that I see as hard/things to decide on are :

We can use the SNCosmo Time Series models for CC. As you have also noted, there are problems with these. Is this ok for what you need in DC2 though? Alternatively if you have versions of these, we could just as easily use them.
Rates and brightnesses: In order to put these in we need to define a population density in terms of redshift n(z) which is the number density per spatial comoving volume per unit observer frame time, some host-SN density relation ie. which galaxies host such SN, and a distribution law on the galaxy (eg. follow the light? ). Finally we need to use some luminosity function for the peak brightness in some intrinsic rest frame brightness and fractions of different types. If you have numbers (particularly for the last sentence, or don't care, and just want something, we will be in a position to do this).

A minor point, IMO, is that we change things from SALT too: in an image simulation SEDs have to be positive semidefinite, but I don't believe it will change anything.

dannygoldstein commented 6 years ago

I have n(z) rates and CCSN SEDs - I can send you these or integrate to a github repo, whatever is easiest for you

dannygoldstein commented 6 years ago

Also I think following the light is fine - this is what I have proposed on another thread and it is what is being done in DES

rmandelb commented 6 years ago

Time for a check-in on the time domain part of DC2. My understanding of the situation after last week's SSim discussion is as follows:

@jchiang87 and @salmanhabib were going to provide input on what our computing budget allows (both CPU time and storage). This will help decide the field size, which factors into the decision about how much the SL and SNe need to be overloaded (density higher than usual).
Then based on the above, @dannygoldstein and @rbiswas4 will propose a density of SL and SNe.
We also need the inputs; I see something from @dannygoldstein above about CCSN SEDs and n(z) rates, but on what timescale will those be available?

Is there progress on the above? Apologies if I missed something that was posted elsewhere.

salmanhabib commented 6 years ago

For the record there was a conversation on the dc2 Slack channel. I reproduce it here, will talk more to Jim tomorrow to better fix the numbers. We will no doubt have some questions for the Sn gang.

Phil @habib I didn’t see you on the SSim call earlier, but I think you’ll be as interested in these slides as @jchiang87 was! Bottom line: the time domain part of DC2 is potentially as expensive as the main survey and so we (as a collaboration) have some trade-offs to consider.

Salman @drphilmarshall I took a look at the requirements. Since we can do (quasi-)single chip jobs it would be good to know how many single-chip exposures you guys are interested in. In these units, the DC2 main survey is roughly ~10M single-chip visits.

Phil @habib on the back of my envelope I make 20,000 uDDF full frame visit images equal to 7.6M single chip exposures (just multiplying by 2x189). Good idea to think about reducing the DDF visits from 189 chips to fewer than that (as if a fraction of the camera somehow got turned off for each DDF visit). Here’s an extreme version: squeezing all the strong lenses into a single 3x3 raft would represent a factor of 189/9 ~ 20 reduction, making the uDDF a 5% perturbation on top of the main survey. @jchiang87 suggested this as a good thing for the SN and SL groups to think about. At Twinkles density (1/sq arcmin) a raft area (9x13x13 arcmin^2) would yield 1500 lenses, which might be enough - but to get the SN numbers up, they’d have to overload the field as well, and that might start interfering with the static science. But maybe there’s a sweet spot between 9 and 189 chips?

Salman @drphilmarshall This is good, I remember thinking along these lines at the Stonybrook meeting. I don't believe we have a problem here as long as the sweet spot is in the regime of less than 30 chips per FOV (there's a contingency of roughly 30% in the current NERSC request plus we have more time at ALCF). Doesn't look like the storage overhead will be too much of an issue (we can always trade some of this for compute). I will talk some more with Jim and others next week but the hope is that no trade-offs will be required.

rmandelb commented 6 years ago

@salmanhabib - thanks for the quick response. Having this in GitHub is a little more convenient than slack since we all have to think it over!

If we were to go with 3 rafts, then we could avoid having too high a density even with SL and SN. This would make the uDDF 15% of the main DC2.

Question: were you envisioning doing something like "WFD at NERSC, uDDF at ALCF" or is it likely to be more complex than that? Does that fit within the profile of time available at ALCF? Is there some logistical or other reason why we couldn't split that way?

rbiswas4 commented 6 years ago

@rmandelb I think what @salmanhabib posted is where the conversation stands. I plan to put some more work on DC2 to give a better estimate of overdensity if we go to 3 rafts this weekend and explore if we could do with lower epochs and higher rafts (keeping the total single chip jobs in control). I will also put material into the DC2 document.

The conversations I have had with @dannygoldstein suggests that he has a good idea of what n(z) and CC templates he wants to use. This likely does not represent the entire population of core collapse SN (which is much harder), but that is not an issue for DC2. From the SN POV, I was not keen on having core collapse SN, which means we don't care about the population of core collapse for DC2 and are perfectly happy with whatever models and (reasonable n(z), ie. something remotely close to what people believe is natural) Danny has. Addition of that into simulations will involve me coding a few things and can be done at reasonably short time scales (~< week) when I start.

salmanhabib commented 6 years ago

@rmandelb We could certainly do uDDF@ALCF. It makes sense to not split this smaller task into chunks to avoid multi-tasking overhead in people time. There's no issue with compute resources.

Would be good to get this activity started ASAP on a more solid footing with the SN folks. Suggestions welcome!

drphilmarshall commented 6 years ago

@rbiswas4 @dannygoldstein I updated the DC2 plan document with the 3-raft design for the uDDF, here's what Table 3 ("Cadence and Survey Options") now looks like:

    Ultra-DDF:         &  \\
    \hline
    Image area         & $\sim1.25$ sq deg (3 rafts, 27 sensors, 4600 sq arcmin, 
                         15\% of a full FoV), embedded in one corner of the main 
                         survey region \\
    Campaign length    & 10 years \\ 
    Cadence            & \texttt{minion\_1016} WFD, plus \texttt{minion\_1016} 
                         DDF visits (re-arranged from year to year to emulate 
                         enhanced cadences) \\
    Number of Visits   & $\sim 20,000$ total \\
    Input              & See Table 1 and 2, plus: over-abundance of lensed 
                         AGN ($\gtrsim 1000$, 0.2/sq arcmin) and lensed 
                         SNe ($\gtrsim 1000$, 0.2/sq arcmin), and: 
                         $\sim6\times$ over-abundance of SNe Ia (25000, 5/sq arcmin), 
                         core-collapse SNe (similar) \\
    Dithering          & WFD visits dithered and rotated the same as in main survey. 
                         DDF visits to have small (chip-scale) dithers and rotations only.   \\

Notes:

The overall lens density would be less than that in Twinkles (0.4, compared to 1, per sq arcmin). These seem OK to me given your stated required yields, Danny.
We'd need a 6-times-higher than natural rate to get the required 25,000 SNe Ia.
- Do we also need 25k core-collapse SNe, Rahul?
- 50k SNe in 4600 sq arcmin is an SN density of about 10 per sq arcmin. This is getting close to the WL source galaxy density! This may mean that it is difficult to find enough suitable hosts to add SNe to - or, if we can use all available galaxies, the SNe might interfere with using the DDF images for WL calibration via BFD. How do the redshift distributions of these 25k SNe, and the 50k WL galaxies, compare? Can we re-use the same hosts for multiple SN events, Rahul? (Perhaps in different seasons, it's OK?)

drphilmarshall commented 6 years ago

Regarding re-use of hosts: Bob Nichol popped up in the strawman design slides to offer this:

"High-z SLSNe can last hundreds of days. We see some DES SLSNe exist of multiple (4.5 month) seasons."

So this could make re-using host galaxies difficult, as the SNe pile up in them. We'd clearly need to pay attention to this when preparing the instance catalogs, @rbiswas4 - and perhaps reduce the number of SNe from 25k as a result. How few SNe could we get away with, Rahul? What's your current assessment of the 1.25 sq deg uDDF design?

drphilmarshall commented 6 years ago

PS. @rbiswas4 one last thing: have I got the wording right regarding the labeling of visits in the DDF sky region? I want to say that some of the DDF visits are also labelled WFD - such that if you select just the WFD ones, you get a uniform WFD-depth survey covering that sky patch? Great if you can check that the text in Table 3 makes sense in this regard... Thank you!

rbiswas4 commented 6 years ago

@drphilmarshall

I want to say that some of the DDF visits are also labelled WFD - such that if you select just the WFD ones, you get a uniform WFD-depth survey covering that sky patch?

I might not be understanding you exactly, but if I am: yes! More mechanically, It is true that there are DDF visits that are also WFD in minion. The plan is to treat these duplicates as bonafide WFD, with WFD dithers etc. Any changes for enhanced cadence for uDDF will happen for the DDF visits that are not WFD also. Thus any property of minion WFD is being preserved by construction during the time you are running minion.

Does that address your question?

rbiswas4 commented 6 years ago

@drphilmarshall

Regarding re-use of hosts: Bob Nichol popped up in the strawman design slides to offer this:

"High-z SLSNe can last hundreds of days. We see some DES SLSNe exist of multiple (4.5 month) seasons."

This is true, but (a) we don't have SLSNe for DC2. They would be good though, undoubtedly!

Second, while SLSNe have those properties, they are extremely rare: There is a reason they had not been discovered till recently. I just found the following abstract https://arxiv.org/abs/1605.05250 which claims that the rate is equivalent to '2.2 (+1.8/-0.9) x10^-4 of the volumetric core collapse supernova rate at the same redshift'.

rbiswas4 commented 6 years ago

@drphilmarshall There is one point I forgot to address:

I was not planning on adding 25K core collapse separately. We can add the ones @dannygoldstein needs for the lensed supernovae and not add anything else from the SN group standpoint.

I would be happy to take a look at the galaxy sample for WL to address your concern, but I believe the number of SN is not a worry. (In reality the relevant quantity is the redshift distribution, and since I am thinking of the normal ones, these are more at the highest redshift where the volume is). In fact, we can cut down the higher redshift ones if there is a speed issue in simulations, but they should not show up very much in the DM images. I was just going to keep them if speed was not an issue, to learn more about the redshifts at which we lose the SN.

Also worth noting is the fact that core collapse SN are dimmer than SNIa, but are more in number at higher redshifts and the ratio (quite uncertain from current data) was thought to be ~ 10 at redshifts of around 1.2. So I think the ~ 6 overdensity is not a terrible situation to have.

katrinheitmann commented 6 years ago

@drphilmarshall and others: have we converged on this for DC2? Or has this been by now replaced by another issue? If not, what needs to be done to finish this up? Thanks!

katrinheitmann commented 5 years ago

@drphilmarshall and others: nothing much has happened here for 8 months or so. Are we done with this and close the issue? Thanks!

katrinheitmann commented 5 years ago

ok, after I pinged @drphilmarshall in March 2018 and September 2018 and until April 2019 did not get any response, I assume we can close this. @rbiswas4 and @reneehlozek If you think this is the issue in which we capture final conclusions on DDF and uDDF please feel free to reopen it.

LSSTDESC / DC2-production

DC2 time domain simulation design: area, cadence, overabundance #1