spacetelescope / mirage

This code can be used to generate simulated NIRCam, NIRISS, or FGS data
https://mirage-data-simulator.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License
40 stars 41 forks source link

Use hdf5 file to construct source magnitudes in WFSS simulations #157

Open bhilbert4 opened 6 years ago

bhilbert4 commented 6 years ago

Currently if a user supplies an hdf5 file with object spectra for a WFSS simulation, mirage will use the source magnitudes from the catalog source file when making the seed image for the direct image, but will then use the spectra (which are in units of FLAM) when creating the dispersed seed image. To be more consistent, it would be nice if the source magnitudes for the imaging seed image matched those in the dispersed seed image.

Add the ability to use the spectra in the hdf5 file to compute the appropriate magnitude for the specific filter in the imaging seed image. Allow it to do this regardless of whether the end product is going to be a dispersed or an imaging seed image. In this way, the same file can be used when creating seed images for the associated direct/adjacent fov imaging data that go along with WFSS data.

KevinVolkSTScI commented 6 years ago

Bryan, would it not be more likely to have "template" spectra for the objects that need to be scaled to the specified count rates rather than scaling each individual spectrum to different brightnesses? I am not sure about how people might treat galaxies, but for stars one is likely to have templates for different spectral types and scale these to the magnitudes required. Even for galaxies although one may need to shift each individual spectrum to the red-shift of interest would someone also scale to a given count rate?

bhilbert4 commented 6 years ago

The way that @NorPirzkal's dispersing code works at the moment, it assumes that all spectra in the input hdf5 file are in units of F_lambda.

I agree that it would be nice to have the ability to input a template spectrum and have mirage scale it to the user-supplied magnitude for the given filter. The way things are at the moment, this could be added as an intermediate step. Mirage could read in the supplied hdf5 file, scale all spectra as requested, and then output a modified hdf5 file that it passes to the disperser software. I was playing around last week with setting units in the hdf5 file. It turns out the hdf5 format doesn't easily lend itself to that (at least in terms of having separate units attached to each spectrum). Perhaps we could fall back to having units in the header (implying that all spectra in the file have the same units), and have mirage convert and scale the spectra based on that.

I hadn't thought much about scaling for redshift. My assumption has been that users will get the spectra they want to use into the correct redshift and input that. But as long as we are converting units and scaling, we could always add an option to scale for redshift. We would just need a way for the user to supply the redshift associated with each spectrum.

npirzkal commented 6 years ago

Hi

I am wary of the renormalization of the input spectra. This is because the main reason for allowing for input spectra in the grism simulator is to make sure that we can simulate emission lines properly. A renormalization could affect the line flux and get a user to draw the wrong conclusion when extracting simulations. At the very least, the re-normalization should be optional. There is in principle more information in those spectra than in the broad band magnitude passed to the code so, to me, it does not seem right to use the broad band magnitude when the input spectra are already given in flambda units. To be clear, forcing a renomalizaton on the user will create cases where the code will not work as I think it should and would require a user to be either very careful, or spend time estimating the broad band magnitude when adding a spectrum to a source. Also, keep in mind that one of the way this is meant to be used it to allow for broad band spectra to be generated using photometry alone and then also allow for users to generate star forming regions only using spectra as input only (with maybe no continuum and just emission lines) so that the two simulations can then be added. All of this leads me to think that re-normalization is not what we want to allow by default.

N

On Sep 10, 2018, at 5:30 PM, Bryan Hilbert notifications@github.com wrote:

The way that @NorPirzkal https://github.com/NorPirzkal's dispersing code works at the moment, it assumes that all spectra in the input hdf5 file are in units of F_lambda.

I agree that it would be nice to have the ability to input a template spectrum and have mirage scale it to the user-supplied magnitude for the given filter. The way things are at the moment, this could be added as an intermediate step. Mirage could read in the supplied hdf5 file, scale all spectra as requested, and then output a modified hdf5 file that it passes to the disperser software. I was playing around last week with setting units in the hdf5 file. It turns out the hdf5 format doesn't easily lend itself to that (at least in terms of having separate units attached to each spectrum). Perhaps we could fall back to having units in the header (implying that all spectra in the file have the same units), and have mirage convert and scale the spectra based on that.

I hadn't thought much about scaling for redshift. My assumption has been that users will get the spectra they want to use into the correct redshift and input that. But as long as we are converting units and scaling, we could always add an option to scale for redshift. We would just need a way for the user to supply the redshift associated with each spectrum.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/spacetelescope/mirage/issues/157#issuecomment-420067895, or mute the thread https://github.com/notifications/unsubscribe-auth/AHNknDrqjMywXXeB2Yr1b3wRuo1uu9pXks5uZtnqgaJpZM4WRs__.

KevinVolkSTScI commented 6 years ago

For the NIRISS and Guider cases, the magnitude is defined in a way that is scaled from the count rate independent of the spectral shape (including emission line spectra). Hence there is no issue with renormalization of the spectrum to a magnitude or count rate value as given in the photometric catalogue for the direct image as long as the user understands how the photon-weighted mean flux density relates to the count rates. All the values required are defined, and documented in detail for NIRISS. Any user with a model spectrum can follow the prescription for calculation of the mean flux density in either wavelength or frequency units and thereby obtain a count rate estimate and a magnitude that will work with mirage given the input values I provided.

I would hope that the same is true for NIRCam, but while I provided the magnitude definitions used in mirage for NIRISS and the Guider I was not involved in the generation of the values for NIRCam so I do not know if these are consistent with the other two instruments. This is all something that was discussed at length in the absolute calibration working group, and resulted in a recommendation about how to define magnitudes, mean flux density values, and the relation between these quantities and the observed count rates (or simulated count rates, given that we have no observations yet.

Irrespective of the above, I certainly agree that renormalization should be optional. The user should always have the option to get the code to use the inputs are provided without the code overriding it.

NorPirzkal commented 6 years ago

The issue I was trying to describe is not the definition of magnitudes but of allowing a user to use arbitrary spectra in physical units without having to compute broad band magnitudes, which is not necessary in some case (and useless in others such as pure emission line dispersed differential simulations).

As long as we make it optional, it will be fine.

Going back to a previous point made in this thread, I will expand on this a bit: Scaling templates for stars does make sense. However, for galaxies, we do not always want to use simple template spectra that are simply scaled. Most simulations we are doing to the ERS include multiple components: broad band continuum, star forming regions (with their own morphologies) which are dominated by star formation and emission lines and whose important observables are the individual line fluxes with fluxes derived from ionization models. The way we create these complex simulations is by creating separate simulations for the broad band components only and then for the emission line regions only, which, after being dispersed, can be combined to create much more realistic cases that take into account the fact that emission line regions are distributed over the whole object. It is useful to have the broadband component be simply based on photometry (while they could also be based on model spectra such as bc03, devoid of nebular lines, and which could be scaled by the simulator to some input magnitude value, or kept as is if already generated for a given galaxy stellar mass). In the case of the emission line regions, scaling is best left off.

N

On Sep 10, 2018, at 6:20 PM, Kevin Volk notifications@github.com<mailto:notifications@github.com> wrote:

For the NIRISS and Guider cases, the magnitude is defined in a way that is scaled from the count rate independent of the spectral shape (including emission line spectra). Hence there is no issue with renormalization of the spectrum to a magnitude or count rate value as given in the photometric catalogue for the direct image as long as the user understands how the photon-weighted mean flux density relates to the count rates. All the values required are defined, and documented in detail for NIRISS. Any user with a model spectrum can follow the prescription for calculation of the mean flux density in either wavelength or frequency units and thereby obtain a count rate estimate and a magnitude that will work with mirage given the input values I provided.

I would hope that the same is true for NIRCam, but while I provided the magnitude definitions used in mirage for NIRISS and the Guider I was not involved in the generation of the values for NIRCam so I do not know if these are consistent with the other two instruments. This is all something that was discussed at length in the absolute calibration working group, and resulted in a recommendation about how to define magnitudes, mean flux density values, and the relation between these quantities and the observed count rates (or simulated count rates, given that we have no observations yet.

Irrespective of the above, I certainly agree that renormalization should be optional. The user should always have the option to get the code to use the inputs are provided without the code overriding it.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://github.com/spacetelescope/mirage/issues/157#issuecomment-420080609, or mute the threadhttps://github.com/notifications/unsubscribe-auth/AlQfMeC8jqKJgVHC5JtS9dvfMgICfCJoks5uZuWygaJpZM4WRs__.

KevinVolkSTScI commented 6 years ago

There is no problem in taking a spectrum and estimating the NIRISS filter count rates in mirage. Some checks would need to be made with my own codes that do this to make sure that things are correct. One can also use the ETC for this, but there are differences between the ETC and my code that remain unexplained even with the same input spectrum and (presumably) the same throughput values. Magnitudes come in only in that the image catalogue input files use magnitudes rather than the raw count rates. That was done originally in my ramp simulation code so that one could take the observed magnitudes of objects and use these to estimate NIRISS magnitudes and thereby get a count rate estimate...a process which I admit is not easy to do (or not possible at all) for emission line objects.

bhilbert4 commented 5 years ago

@KevinVolkSTScI @NorPirzkal @jotaylor My plan is to start some of this work this week.

From the discussion above, my thought is to try and implement things as follows. Let me know if you think something should be different/added/subtracted. The underlying basic change from how things work now is that an hdf5 file containing spectra for all sources will be a required input to the disperser. This hdf5 file can be provided by the user, or constructed by Mirage from the user-provided catalog files.

Two options for inputs:

1. User inputs only a catalog (as is done now).

In this case, Mirage will parse the catalog and interpolate from the given magnitude columns to construct a spectrum for each object. These spectra will be placed into an hdf5 file and passed to the disperser code.

2. User inputs a catalog and an hdf5 file

The hdf5 file can contain spectra for some or all sources. Note that an hdf5 file can also be an input for imaging mode data.

A. Sources that are not present in the hdf5 file

Will be treated as in situation 1 above.

B. Sources that are present in both the catalog and hdf5 file:

If the magnitude values in the catalog are left empty (or filled with a placeholder of some sort) then the spectrum from the hdf5 file will be passed unchanged into the disperser (for WFSS) or converted to magnitudes in the appropriate filter (for imaging).

If the magnitudes in the catalog have valid values, then the spectrum from the hdf5 file will be normalized to match the given magnitude (which magnitude? Only that in crossing filter? That from any filter? What if magnitudes in multiple filters are given?) and then passed to the disperser (for imaging we simply use the magnitude from the catalog and ignore the spectrum?).

C. Sources that are present only in the hdf5 file:

This situation is not supported since there is no location information for the source

3. User inputs only an hdf5 file

This situation cannot be supported because there will be no RA, Dec information for the sources.

NorPirzkal commented 5 years ago

Hi

I think that case 3 is not really true, because we can have a segmentation file and imaging and and hdf5 file and there is no need for a catalog in that case.

N

On Dec 11, 2018, at 1:41 PM, Bryan Hilbert notifications@github.com<mailto:notifications@github.com> wrote:

@KevinVolkSTScIhttps://github.com/KevinVolkSTScI @NorPirzkalhttps://github.com/NorPirzkal @jotaylorhttps://github.com/jotaylor My plan is to start some of this work this week.

From the discussion above, my thought is to try and implement things as follows. Let me know if you think something should be different/added/subtracted. The underlying basic change from how things work now is that an hdf5 file containing spectra for all sources will be a required input to the disperser. This hdf5 file can be provided by the user, or constructed by Mirage from the user-provided catalog files.

Two options for inputs:

  1. User inputs only a catalog (as is done now).

In this case, Mirage will parse the catalog and interpolate from the given magnitude columns to construct a spectrum for each object. These spectra will be placed into an hdf5 file and passed to the disperser code.

  1. User inputs a catalog and an hdf5 file

The hdf5 file can contain spectra for some or all sources. Note that an hdf5 file can also be an input for imaging mode data.

A. Sources that are not present in the hdf5 file

Will be treated as in situation 1 above.

B. Sources that are present in both the catalog and hdf5 file:

If the magnitude values in the catalog are left empty (or filled with a placeholder of some sort) then the spectrum from the hdf5 file will be passed unchanged into the disperser (for WFSS) or converted to magnitudes in the appropriate filter (for imaging).

If the magnitudes in the catalog have valid values, then the spectrum from the hdf5 file will be normalized to match the given magnitude (which magnitude? Only that in crossing filter? That from any filter? What if magnitudes in multiple filters are given?) and then passed to the disperser (for imaging we simply use the magnitude from the catalog and ignore the spectrum?).

C. Sources that are present only in the hdf5 file:

This situation is not supported since there is no location information for the source

  1. User inputs only an hdf5 file

This situation cannot be supported because there will be no RA, Dec information for the sources.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://github.com/spacetelescope/mirage/issues/157#issuecomment-446313629, or mute the threadhttps://github.com/notifications/unsubscribe-auth/AlQfMU5mz1ScYJ8UUaJuzPhKkjEwDtV4ks5u3_xBgaJpZM4WRs__.

NorPirzkal commented 5 years ago

My $0.02 about 2.B: It can be useful to have a catalog with magnitudes and and hdf5 while a user might not want to renormalize the hdf5 spectra. Could we possibly make the re-normalization step optional?

N

On Dec 11, 2018, at 1:41 PM, Bryan Hilbert notifications@github.com<mailto:notifications@github.com> wrote:

@KevinVolkSTScIhttps://github.com/KevinVolkSTScI @NorPirzkalhttps://github.com/NorPirzkal @jotaylorhttps://github.com/jotaylor My plan is to start some of this work this week.

From the discussion above, my thought is to try and implement things as follows. Let me know if you think something should be different/added/subtracted. The underlying basic change from how things work now is that an hdf5 file containing spectra for all sources will be a required input to the disperser. This hdf5 file can be provided by the user, or constructed by Mirage from the user-provided catalog files.

Two options for inputs:

  1. User inputs only a catalog (as is done now).

In this case, Mirage will parse the catalog and interpolate from the given magnitude columns to construct a spectrum for each object. These spectra will be placed into an hdf5 file and passed to the disperser code.

  1. User inputs a catalog and an hdf5 file

The hdf5 file can contain spectra for some or all sources. Note that an hdf5 file can also be an input for imaging mode data.

A. Sources that are not present in the hdf5 file

Will be treated as in situation 1 above.

B. Sources that are present in both the catalog and hdf5 file:

If the magnitude values in the catalog are left empty (or filled with a placeholder of some sort) then the spectrum from the hdf5 file will be passed unchanged into the disperser (for WFSS) or converted to magnitudes in the appropriate filter (for imaging).

If the magnitudes in the catalog have valid values, then the spectrum from the hdf5 file will be normalized to match the given magnitude (which magnitude? Only that in crossing filter? That from any filter? What if magnitudes in multiple filters are given?) and then passed to the disperser (for imaging we simply use the magnitude from the catalog and ignore the spectrum?).

C. Sources that are present only in the hdf5 file:

This situation is not supported since there is no location information for the source

  1. User inputs only an hdf5 file

This situation cannot be supported because there will be no RA, Dec information for the sources.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://github.com/spacetelescope/mirage/issues/157#issuecomment-446313629, or mute the threadhttps://github.com/notifications/unsubscribe-auth/AlQfMU5mz1ScYJ8UUaJuzPhKkjEwDtV4ks5u3_xBgaJpZM4WRs__.

bhilbert4 commented 5 years ago

Good point. We can add a segmentation file as an optional input then.

As for 2B: If renormalization is optional and a user chooses not to do it, then I'm assuming catalog magnitudes are ignored even if they are inconsistent with the spectra? This seems like a situation where a warning should be raised.

KevinVolkSTScI commented 5 years ago

I suppose one should normalize to the blocking filter magnitude, assuming that it is given, or to the magnitude for the filter closest in wavelength to the blocking filter effective wavelength.

If a user selects the "no renormalization" option one could, say, calculate a magnitude to compare with one of the input magnitudes (if there are any) and raise a warning if the deviation is larger than say 10%. Although there is still the issue of which filter to use for the comparison.

I guess the case to watch for is where a user inputs a normalized spectrum but then forgets to provide the correct magnitude for normalization. Are there default units for the spectra that we can tell are clearly bad if the values are of order 1? If the spectrum is F_lambda in W/m^2/micron or in erg/s/cm^2/Angstrom then the values cannot be of order 1. Should be require that the input spectrum be in F_lambda?

bhilbert4 commented 5 years ago

I looked a bit into the hdf5 file format a while ago, and unfortunately you can't specify astropy units to go with each input spectrum. But, you can specify a string piece of metadata for each spectrum. I had been thinking that Mirage could look at this string to see if it specifies units and then translate the data accordingly. And if no units are specified, then assume that the units are F_lambda, since this is what the disperser ultimately wants.

KevinVolkSTScI commented 5 years ago

It may be better to require some particular units for the spectra, in the event that they are not going to the normalized to a magnitude. Of course one then immediately has a conflict between infrared/MKS unit people like me who want W/m^/micron and optical/CGS people who probably want erg/s/cm^2/A units. I do not think that using a metadata value is a good idea unless we define and enforce a small number of possible spectral labels that could be used. That may be more work than you need. I would suggest defining one set of units and enforcing this when the no remormalization option is used. Wavelengths in microns and F_lambda in W/m^2/micron would be my preference. I dislike the Spitzer approach of wavelength in microns and spectrum in Jansky because you need to convert to integrate to get the flux.

NorPirzkal commented 5 years ago

micron and erg/s/cm^2/A please.

N

On Dec 11, 2018, at 3:11 PM, Kevin Volk notifications@github.com<mailto:notifications@github.com> wrote:

It may be better to require some particular units for the spectra, in the event that they are not going to the normalized to a magnitude. Of course one then immediately has a conflict between infrared/MKS unit people like me who want W/m^/micron and optical/CGS people who probably want erg/s/cm^2/A units. I do not think that using a metadata value is a good idea unless we define and enforce a small number of possible spectral labels that could be used. That may be more work than you need. I would suggest defining one set of units and enforcing this when the no remormalization option is used. Wavelengths in microns and F_lambda in W/m^2/micron would be my preference. I dislike the Spitzer approach of wavelength in microns and spectrum in Jansky because you need to convert to integrate to get the flux.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://github.com/spacetelescope/mirage/issues/157#issuecomment-446345103, or mute the threadhttps://github.com/notifications/unsubscribe-auth/AlQfMVuNhhO1jxZyp7oa5woHc4aB7c32ks5u4BF-gaJpZM4WRs__.

KevinVolkSTScI commented 5 years ago

As predicted, Nor wants erg/s/cm^2/Angstrom. All my spectra in the infrared get converted to W/m^2/micron. On what basis do we decide which should be the default?

NorPirzkal commented 5 years ago

Then support both please.

N

On Dec 11, 2018, at 3:24 PM, Kevin Volk notifications@github.com<mailto:notifications@github.com> wrote:

As predicted, Nor wants erg/s/cm^2/Angstrom. All my spectra in the infrared get converted to W/m^2/micron. On what basis do we decide which should be the default?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://github.com/spacetelescope/mirage/issues/157#issuecomment-446348955, or mute the threadhttps://github.com/notifications/unsubscribe-auth/AlQfMZXXDugfzgV5uZy2i-lkV4cE_g-Zks5u4BRvgaJpZM4WRs__.

KevinVolkSTScI commented 5 years ago

In that case, can you suggest how to flag one case or the other in the HDF5 file? Having two cases should not add too much work (easy for me to say when I am not doing the work!).

jotaylor commented 5 years ago

I haven't read through this whole thread yet but you can easily convert between erg/s/cm^2/Angstrom and W/m^2/micron using astropy.units and units.spectral_density

NorPirzkal commented 5 years ago

Right now, I record the units as an hdf5 attribute to the dataset.

On Dec 11, 2018, at 3:29 PM, Kevin Volk notifications@github.com<mailto:notifications@github.com> wrote:

In that case, can you suggest how to flag one case or the other in the HDF5 file? Having two cases should not add too much work (easy for me to say when I am not doing the work!).

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://github.com/spacetelescope/mirage/issues/157#issuecomment-446350450, or mute the threadhttps://github.com/notifications/unsubscribe-auth/AlQfMYgbiQN_9YfFGeg_zBu-HgW9J5FQks5u4BWZgaJpZM4WRs__.

bhilbert4 commented 5 years ago

I wrote out a more detailed flow for how to integrate all of this into the seed image generator. It's still pretty high level, but it at least gives an idea of what new functionality is needed and roughly where.

At the moment, it looks like I need to add several functions:

  1. renormalize_spectrum(to_mag, in_filter, magsys)
  2. calculate_mag_from_SED(sed, filtername, magsys)
  3. interpolate_filter_mags_to_create_SED(mag_list, filter_list, magsys)

My plan was to cobble together the pieces for 3 from what is currently in NIRCAM_Gsim. But I'm sure there will also be overlap here with development in the catalog branch at the moment. @KevinVolkSTScI, do you think you will have well defined functions that do any of the above as part of your catalog work? If so it seems like I should do this development such that I call those functions.

KevinVolkSTScI commented 5 years ago

I am working on the interpolation from "standard" magnitudes to JWST ones now for the catalogue generation, and part of that could be used for item (3). However, it would probably be better to use existing numpy routines for that. I obviously have code for items (1) and (2) but I do not think that should be put inside Mirage. So I do not think I have anything useful for you.

A related question is which filters will you allow for the normalization? If you only use the NIRCam/NIRISS/Guider filters then the photometric and filter response values already in the $MIRAGE_DATA directories can be used as inputs and you just need to calculate the proper photon-weighted flux density values from the spectrum (which you need because one might have say an emission line spectrum as input and that would normalize things properly). Maybe pysynphot can do that for you, but if not I can write the code for it if you like.

bhilbert4 commented 5 years ago

I think it makes sense to limit normalization options to the NIRCam/NIRISS/Guider filters at least to start with. Maybe once everything is working we can add other filters.

Reading up on some astropy unit details, I came across the page on unit equivalencies. http://docs.astropy.org/en/stable/units/equivalencies.html There is a section on Spectral Flux / Luminosity Density Units. I'm just leaving this here for reference in case it's useful later.