LSSTDESC / DC2-production

Configuration, production, validation specifications and tools for the DC2 Data Set.
BSD 3-Clause "New" or "Revised" License

emission line galaxies for LSS group #31

Closed rmandelb closed 7 years ago

rmandelb commented 7 years ago

@damonge and @slosar - one of the LSS group's requests for DC2 was the inclusion of emission line galaxies. Can you please clarify: was this request for the image simulations (300 deg^2) or the larger-area extragalactic catalogs (~5000 deg^2), and how important is the request?

Currently, the photo-z group is planning to add emission lines to SEDs in the extragalactic catalogs, but not on a time-scale that would enable their inclusion in the image simulations. Are you able to work with their outputs in the extragalactic catalog, or did you have something else in mind?

egawiser commented 7 years ago

I think this request percolated up from me originally... Adam Broussard (Rutgers grad student) and I want to study which subsets of our detected galaxies have significantly better photo-z due to strong emission lines that noticeably affect photometry in 1 or more broad-band filters, which could make them a "platinum sample" for clustering analysis. So that's completely doable at the catalog level as long as photo-z's are being calculated (seemingly so :) ) and the input EL galaxies have a distribution of line strengths that includes a tail out to very high Equivalent Width. Eventually we should figure out how much actually observing such galaxies degrades their photo-z's, but that could comfortably be done via image simulations in DC3.

rmandelb commented 7 years ago

@egawiser - very good, now I know who to blame. ;)

So I'm hearing a "yes" to "catalog-level analysis". Perhaps @janewman-pitt-edu can comment on whether the emission line-adding code (in whatever form will be applied later on to the DC2 extragalactic sims) will have a distribution of line strengths with a tail to very high equivalent width?

evevkovacs commented 7 years ago

The latest version of protoDC2 which was released to the collaboration last Friday includes emission-line galaxies. (The photo-z group will use a different model than the one that Galacticus uses for adding ELGs.)

On Mon, 6 Nov 2017, Rachel Mandelbaum wrote:

Date: Mon, 6 Nov 2017 09:00:29. From: Rachel Mandelbaum. Subject: [LSSTDESC/DC2_Repo] emission line galaxies for LSS group (#31)

@damonge and @slosar - one of the LSS group's requests for DC2 was the inclusion of emission line galaxies. Can you please clarify: was this request for the image simulations (300 deg^2) or the larger-area extragalactic catalogs (~5000 deg^2), and how important is the request?

Currently, the photo-z group is planning to add emission lines to SEDs in the extragalactic catalogs, but not on a time-scale that would enable their inclusion in the image simulations. Are you able to work with their outputs in the extragalactic catalog, or did you have something else in mind?

rmandelb commented 7 years ago

OK, so then I guess the same question applies to the galacticus model for ELGs: does their model go to very high equivalent width?

evevkovacs commented 7 years ago

The code to add ELGs to Galacticus is new, and it would be very helpful if you and Adam could take a look at protoDC2. Can we come up with a validation test (e.g., distribution of line strengths, etc.)?

evevkovacs commented 7 years ago

We don't know yet, because we haven't had a chance to look at those details, but Adam and Eric could help us develop a validation test to find out.

egawiser commented 7 years ago

OK we'll take a look. What's the needed timetable for developing a validation test? (I would envision a K-S test on the Equivalent Width distributions in H alpha and [O III] at z<1 versus what's known from observations, although for our current purposes all that's needed is to have some galaxies with very high EWs rather than the right proportion of those.) And who should we bug when we inevitably need help figuring out how to determine EWs from the proto-DC2 catalog format?
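A minimal sketch of the two checks mentioned in the comment above: a two-sample K-S statistic between a catalog EW distribution and an observed one, and the weaker check that some very high EW galaxies exist at all. The sample values and the 500 Angstrom threshold are made up for illustration; nothing here is taken from protoDC2 or from the DESCQA code. The statistic is written out by hand so the block has no dependencies.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: max |ECDF_a(x) - ECDF_b(x)|."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    d_max = 0.0
    # Both ECDFs are step functions that only change at observed values,
    # so evaluating the difference at every data point is sufficient.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d_max = max(d_max, abs(cdf_a - cdf_b))
    return d_max


def has_high_ew_tail(ews, threshold):
    """Weaker check: does any galaxy exceed `threshold` in equivalent width?"""
    return any(ew > threshold for ew in ews)


# Hypothetical rest-frame H-alpha equivalent widths in Angstroms (made-up values).
catalog_ew = [5.0, 12.0, 30.0, 45.0, 80.0, 150.0, 600.0]
observed_ew = [4.0, 10.0, 25.0, 50.0, 90.0, 200.0, 400.0]

print(f"KS statistic D = {ks_statistic(catalog_ew, observed_ew):.3f}")
print("high-EW tail present:", has_high_ew_tail(catalog_ew, threshold=500.0))
```

In practice `scipy.stats.ks_2samp` returns the same statistic together with a p-value, which is what a real validation test would probably use.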

yymao commented 7 years ago

added to #30.

BTW, @evevkovacs, is the full SED already exposed in the reader?

evevkovacs commented 7 years ago

No, right now, the user would have to add the quantities. A feature that I mentioned last week at our telecon would be the ability to fetch all quantities matching a string (e.g., SED). We can add this to our reader, but it might be more generally useful.

evevkovacs commented 7 years ago

@egawiser Bug me if you have problems.

yymao commented 7 years ago

@evevkovacs the reader can already fetch all quantities matching a string ---

gc.get_quantities([q for q in gc.list_all_quantities(True) if q.startswith('SED')])

We can add syntactic sugar for this, but I don't think it is critical.

I am more worried about the data format of SEDs, because the reader does assume that each quantity is a 1D array (i.e., a scalar for each galaxy). How are the SEDs stored in the catalogs? If this discussion should not happen here, we can discuss by email or at #desc-qa.

evevkovacs commented 7 years ago

ProtoDC2 SEDs are stored as scalar quantities (one number for each narrow-band filter). Our reader, however, could then put these together and return a vector quantity for each galaxy, which might be convenient for users.

yymao commented 7 years ago

@evevkovacs OK, since the SEDs are stored in scalar fields we are good for now.

Making GCR support vector quantities well requires some more work. Since the user can always create the vector after grabbing the scalar quantities, I'll just leave it at that.
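A minimal sketch of what "create the vector after grabbing the scalar quantities" could look like on the user side. It assumes the reader hands back a mapping from quantity name to one value per galaxy, as described above; the column names (`SED_band_a`, ...) and values are hypothetical, and real protoDC2 names will differ.

```python
def assemble_sed_vectors(quantities, prefix="SED"):
    """Stack per-filter scalar columns into one SED vector per galaxy.

    `quantities` maps a quantity name to a sequence with one value per galaxy.
    Returns the ordered filter names and a list of per-galaxy vectors.
    """
    names = sorted(name for name in quantities if name.startswith(prefix))
    if not names:
        raise KeyError(f"no quantities start with {prefix!r}")
    columns = [quantities[name] for name in names]
    n_gal = len(columns[0])
    if any(len(col) != n_gal for col in columns):
        raise ValueError("all SED columns must have one entry per galaxy")
    # zip(*columns) transposes filter-major columns into galaxy-major rows.
    return names, [list(row) for row in zip(*columns)]


# Hypothetical column names and values, standing in for the reader's output.
fake_catalog = {
    "SED_band_a": [1.0, 2.0, 3.0],
    "SED_band_b": [1.5, 2.5, 3.5],
    "SED_band_c": [0.5, 0.7, 0.9],
    "redshift": [0.1, 0.4, 0.8],
}
names, sed_vectors = assemble_sed_vectors(fake_catalog)
print(names)
print(sed_vectors[0])
```

The names are sorted alphabetically here only to get a reproducible order; ordering by wavelength would need the filter definitions, which this sketch does not assume.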

evevkovacs commented 7 years ago

I agree. My intention was that our ANL reader could go through the work of assembling the vector for each galaxy in this special case, rather than expecting the GCR to have this capability. The native quantities are all scalar in this case, so delivering them as a vector for each galaxy is indeed a higher level function.

egawiser commented 7 years ago

How narrow are the narrow-band "filters" used to store the SEDs? That's probably not bad for now, but if they're typical narrow-band filters with width ~50 Angstroms, the loss of emission-line information will hurt what we're trying to measure from DC2 catalogs. If they're <= 5 Angstroms, otoh, it should work ok.
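A rough worked example of why the bin width matters, under simplifying assumptions that are not from the thread: a top-hat bin, a flat continuum across it, and a single line falling entirely inside it. In that case the line raises the bin flux by a factor (1 + EW_obs / width), so the wider the bin, the less the line stands out. The 100 Angstrom EW is an arbitrary illustrative value.

```python
import math


def line_boost_mag(ew_rest, filter_width, redshift=0.0):
    """Magnitude brightening from one emission line in a top-hat filter.

    Assumes a flat continuum across the filter and a line that falls fully
    inside it; the observed-frame EW is the rest-frame EW times (1 + z).
    All lengths are in Angstroms.
    """
    ew_obs = ew_rest * (1.0 + redshift)
    return 2.5 * math.log10(1.0 + ew_obs / filter_width)


# Compare a 5 A bin, a 50 A bin, and a broad-band-like 1000 A filter.
for width in (5.0, 50.0, 1000.0):
    boost = line_boost_mag(100.0, width)
    print(f"width {width:7.1f} A -> {boost:.3f} mag brighter")
```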

evevkovacs commented 7 years ago

The filters vary in width depending on the wavelength. The specifications were given to us by the photo-z group as adequate for their emission line model. (Galacticus' emission line model uses 3 additional continuum filters to determine line strengths, rather than the SEDs).

Here are the specs from Sam Schmidt and Jeff Newman:

I think that we want ~30 filters (plus we would like to look at the 3 continuum filters that you mentioned on the telecon and in the Slack message, though we'll have to see what their effect is). The DC1 data had 318 bins from 904AA to 20,069 AA, with linearly increasing bin-size in lambda. For DC2, I think we want something similar, but we do want to concentrate on the area around the 4000 AA break and spanning the optical with a bit more resolution. So, I think what we want is something like the constant in log lambda width filters with:

  • 5 filters in 1000-3000 AA
  • 20 filters in 3000-10,000 AA
  • 5 filters in 10,000-20,000 AA

So I set up 30 filters for Galacticus, following the above guidelines.
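The quoted guidelines can be turned into filter edges with a few lines of arithmetic. This sketch assumes contiguous top-hat filters whose width is constant in log(lambda) within each of the three wavelength ranges; the actual edges used in Galacticus are not given in this thread and may differ.

```python
import math


def log_spaced_edges(lam_min, lam_max, n_filters):
    """Edges of `n_filters` contiguous top-hats with constant width in log(lambda)."""
    step = (math.log(lam_max) - math.log(lam_min)) / n_filters
    return [math.exp(math.log(lam_min) + i * step) for i in range(n_filters + 1)]


# (lambda_min, lambda_max, number of filters) in Angstroms, from the quoted spec.
segments = [(1000.0, 3000.0, 5), (3000.0, 10000.0, 20), (10000.0, 20000.0, 5)]

filters = []
for lam_min, lam_max, n in segments:
    edges = log_spaced_edges(lam_min, lam_max, n)
    # Pair consecutive edges into (lower, upper) bounds for each filter.
    filters.extend(zip(edges[:-1], edges[1:]))

print(len(filters), "filters")
for lo, hi in filters[5:8]:
    print(f"{lo:8.1f} - {hi:8.1f} A  (width {hi - lo:6.1f} A)")
```

Under these assumptions each optical filter spans about 6% of its central wavelength, i.e. a few hundred Angstroms, which is much coarser than a typical ~50 Angstrom narrow-band filter.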

egawiser commented 7 years ago

Hmm... at first glance that will make the DC2 emission-line galaxies much less useful for the LSS project we're working on (though presumably fine for photo-z as originally intended). Hopefully we can fix this for the larger-area catalog-only DC2 simulation though. Let me double-check: I assume that galaxy SEDs are initially simulated at higher wavelength resolution via something like adding an emission-line template to a Bruzual-Charlot model for continuum with absorption lines, then their SEDs are stored in the 30 tophat filters you just described, and then their ugrizy photometry is predicted from those 30 filters. Is that right? If that's right, we can still look at one of the two kinds of effects we're interested in using DC2. Both effects produce improved photo-z precision, but the second one relies upon the very narrow nature of the emission lines. If the ugrizy photometry is predicted from full-resolution SEDs directly and the 30 filters are just for storing the SEDs after that, we might still make progress on both.

janewman-pitt-edu commented 7 years ago

@egawiser : those filters should NOT contain emission line flux, but rather only continuum. It turns out that in SDSS you can predict the lines from the continuum with (in many cases) lower errors than measuring the lines from the spectra. Hence we need the continuum fluxes to run the afterburner to paste on emission lines. Galacticus emission line predictions should be separate.

For a validation test, I'd suggest looking at [OII] equivalent width vs. restframe color, as this is reasonably tight and well-constrained by DEEP2. We generally won't have coverage of [OIII] in spectra from z~1...

evevkovacs commented 7 years ago

@egawiser NO, the Galacticus SED filters are continuum filters. ELGs are added in post processing, based on fluxes in 3 (continuum) H, He and O filters. Apparently, these 3 filters predict emission line strengths with good accuracy. The post processing model gives only line luminosities. I will check with the code developers for more details.

egawiser commented 7 years ago

Aha - thanks to both of you for the explanation. It will be helpful to explain at the outset that the stored SEDs are "continuum-only" to prevent others from making the same assumption that I did that ELG SEDs include emission lines... ;) but this makes a lot more sense. At some point of course we'll want to stop assuming that the magic "emission line from continuum" prediction that works for SDSS will work at LSST depth and redshift, but it should be fine for DC2 in general. I don't know if it will generate enough very high EW galaxies for the specific LSS project, but we'll check. One further question - was the post-processing run already for the proto-DC2 catalogs? We should be able to handle combined info on SEDs and line luminosities (even better continuum-only ugrizy photometry and line luminosities).

evevkovacs commented 7 years ago

Yes, ELG post processing has been run and the line luminosities are given for selected lines (Balmer alpha and beta, OII, OIII, NII, SII).
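Since the catalog provides line luminosities and continuum-only filter luminosities separately, an equivalent width can be estimated as the line luminosity divided by the continuum luminosity density near the line. The sketch below assumes the filter luminosity is an integral over a top-hat in the same luminosity unit as the line; how protoDC2 actually normalizes its filter luminosities is not stated in this thread, so the units and all numbers here are hypothetical.

```python
def tophat_density(filter_luminosity, lam_lo, lam_hi):
    """Mean luminosity density per Angstrom across a top-hat filter."""
    if lam_hi <= lam_lo:
        raise ValueError("filter upper edge must exceed lower edge")
    return filter_luminosity / (lam_hi - lam_lo)


def equivalent_width(line_luminosity, continuum_density):
    """Rest-frame EW in Angstroms: line luminosity / continuum density.

    `line_luminosity` is in some luminosity unit (e.g. erg/s) and
    `continuum_density` is in the same unit per Angstrom, near the line.
    """
    if continuum_density <= 0.0:
        raise ValueError("continuum luminosity density must be positive")
    return line_luminosity / continuum_density


# Made-up numbers for illustration only.
line_lum = 3.0e41   # erg/s, hypothetical H-alpha line luminosity
band_lum = 6.0e41   # erg/s, continuum integrated over a hypothetical 6400-6800 A top-hat
density = tophat_density(band_lum, 6400.0, 6800.0)
print(f"EW = {equivalent_width(line_lum, density):.1f} A")
```

With bins several hundred Angstroms wide, the continuum under the line is only known as a bin average, so an EW computed this way is approximate.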

janewman-pitt-edu commented 7 years ago

The assumption that the continuum->line mapping is the same as SDSS has to break down eventually, I agree, but it at least ensures we match things at low redshift and that line ratios are physical (and I'd rather be underpredicting than overpredicting at high z). We'll see how Galacticus does...

cwwalter commented 7 years ago

@janewman-pitt-edu Is there a reference on predicting the emission lines from continuum? @evevkovacs said "ELGs are added in post processing, based on fluxes in 3 (continuum) H, He and O filters". I'm curious what those filters look like since (for example) the hydrogen alpha and beta lines are pretty far apart.

cwwalter commented 7 years ago

The DC1 data had 318 bins from 904AA to 20,069 AA, with linearly increasing bin-size in lambda. For DC2, I think we want something similar, but we do want to concentrate on the area around the 4000 AA break and spanning the optical with a bit more resolution.

In the current scheme, do I understand correctly you have gone from 318 to 30 bins for the continuum? Also, generically, is there a reason to represent the SEDs this way rather than say a PCA based on SED templates?

Thanks!

evevkovacs commented 7 years ago

The reference is (also see the protoDC2 document): Panuzzo, P., Bressan, A., Granato, G. L., Silva, L., Danese, L., 2003, RMxAC, 17, 89P.
Yes, we reduced from 318 to 30 because calculating many filters in Galacticus is expensive. The PZ group felt that about 30 filters would be enough for their estimates. Galacticus calculates the filter luminosities directly based on the star-formation history, SPS codes and the filter transmission function. It doesn't use SED templates.

cwwalter commented 7 years ago

The reference is (also see the protoDC2 document); Panuzzo, P., Bressan, A., Granato, G. L., Silva, L., Danese, L., 2003, RMxAC, 17, 89P.

Thanks Eve! Is it this: http://www.astroscu.unam.mx/rmaa/RMxAC..17/PDF/RMxAC..17_ppanuzzo.pdf ?

I can't quite tell if this one page conference proceedings is explaining how to do the modeling, or if this is the actual observation of the correlation between the continuum and the lines. @janewman-pitt-edu do you have a comment?

rmandelb commented 7 years ago

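All: what I've gotten out of this discussion is that:

  • we want to have a validation test of the emission lines in the catalog-level sims
  • the emission line selection is definitely not relevant to image sims

Is that a good summary? Should I add this to the list of validation tests in the validation epic, update the DC2 plan to reflect these conclusions, and close this issue - or do we have more to do here?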

yymao commented 7 years ago

We've opened an issue at https://github.com/LSSTDESC/descqa/issues/12 but need some people to volunteer to work on this.

evevkovacs commented 7 years ago

@cwwalter The conference note describes the model implementation (i.e., creating a big lookup table for emission-line strengths based on galaxy properties such as metallicity). I pinged Andrew Benson regarding a reference for the observation that line strengths can be well reconstructed from the 3 continuum filters (plus the metallicity and density of the HII regions). Here it is: http://adsabs.harvard.edu/abs/2003A%26A...409...99P

I will add this information to the protoDC2 note.

katrinheitmann commented 7 years ago

I think that sounds good! In the spirit of listing all the tests in the validation epic we should add it there even though it does have an issue in the other repo, right?

Thanks very much Rachel!

On 11/8/17 4:01 PM, Rachel Mandelbaum wrote:

All: what I've gotten out of this discussion is that:

  • we want to have a validation test of the emission lines in the catalog-level sims
  • the emission line selection is definitely not relevant to image sims

Is that a good summary? Should I add this to the list of validation tests in the validation epic, update the DC2 plan to reflect these conclusions, and close this issue - or do we have more to do here?

cwwalter commented 7 years ago

Here it is: http://adsabs.harvard.edu/abs/2003A%26A...409...99P

Thanks!

janewman-pitt-edu commented 7 years ago

For the afterburner based on continuum, it's an improved version of https://arxiv.org/abs/1601.02417 . A PCA representation of the spectra is fine for this but I don't think that's what's easy to provide from Galacticus?

cwwalter commented 7 years ago

For the afterburner based on continuum, it's an improved version of https://arxiv.org/abs/1601.02417

Great, thanks!

rmandelb commented 7 years ago

@katrinheitmann - I see Yao added this already to the proto-DC2 validation epic here, and to the list in the descqa repo. So I am closing this issue and updating our wish list to reflect this decision.

cwwalter commented 6 years ago

Hi All,

This issue is closed, but I just wanted to check on something: I know these ELGs are not going to be in the image SEDs and we can add lines based on the continuum, per the papers above. But I am a bit unclear on the mechanics of how these checks will work for the catalog-only work.

Will the GCR return them with an option turned on so the PZ group can use them but not when they are interfacing with CatSim?

Thanks!

cwwalter commented 6 years ago

(BTW: The thing that got me thinking about this was wondering if we should include this as a feature in table 1. of the executive summary of the planning document).

yymao commented 6 years ago

I guess we can save two sets of SEDs, one with and one without the ELGs? @evevkovacs?

cwwalter commented 6 years ago

Currently the table says "emission-line strengths are computed in post- processing."

yymao commented 6 years ago

I believe post-processing in that sentence is still before the step when the protoDC2 extragalactic catalog is written to disk.

cwwalter commented 6 years ago

Right, exactly what this means is what I am confused about and trying to understand...

yymao commented 6 years ago

@evevkovacs should chime in but I think it means that after galaxies are painted by the SAM, they then add emission-line strengths, and then save everything as a file to disk.

evevkovacs commented 6 years ago

@cwwalter @yao Yao is correct. The emission lines are added in post processing after the Galacticus simulation and the information is available in the protoDC2 catalog. See here for a list of the native quantities available in the catalog.

cwwalter commented 6 years ago

OK thanks. This is the same catalog that CatSim uses but it just doesn't use those variables?

yymao commented 6 years ago

@cwwalter yes, I think so

evevkovacs commented 6 years ago

@danielsf Scott should confirm, but as far as I know, CatSim does not use the ELG information as yet. CatSim may expect that information to be included in with the SED, in which case, there may be additional work to be done here. Scott wrote a module to take the SED information supplied by Galacticus and fit it to CatSim's SED template library. Including ELG information in that fit may be a refinement on that which will be needed in the future.

danielsf commented 6 years ago

The CatSim SED library to which we are fitting does not have emission lines, so I'm not sure how to incorporate ELG parameters and ultimately pass them to PhoSim (recall that PhoSim expects there to be a file on disk representing the (wavelength, flux) grid of the SED for each source).

cwwalter commented 6 years ago

That's OK. We aren't planning on doing this for the image simulations now. These ELG studies will be catalog-only. I just wanted to know how that would work.