Galaxy Line Emission & Absorption Modeling
Citation: Andra Stroe and Victor-Nicolae Savu 2021 AJ 161 158
gleam is a Python package for fitting Gaussian models to emission and absorption lines in large samples of 1D galaxy spectra. gleam is tailored to work well without much human interaction on optical and infrared spectra in a wide range of instrument setups and signal-to-noise regimes. gleam will create a fits table with Gaussian line measurements, including central wavelength, width, height and amplitude, as well as estimates for the continuum under the line and the line flux, luminosity, equivalent width and velocity width. gleam will also, optionally, make plots of the spectrum with fitted lines overlaid.
gleam fits lines in 1D spectra using redshift information from a metadata file and several other parameters from a central configuration file.
To run gleam, the following are needed:
Details on the input files can be found further down.
The outputs of gleam include:
To run gleam using the defaults, type in the terminal:
gleam
gleam has a number of optional command line arguments. For details type:
gleam --help
An example dataset is contained within the git repository. To download it, either use the download button or run the following in the terminal:
wget https://github.com/multiwavelength/gleam/raw/main/example.tar.gz
The input spectra should be in fits format, ideally with units in the headers. They should contain 3 columns: the observed wavelength, the flux and the error, as follows:
wl | flux | stdev |
---|---|---|
8972.34 | 0.1 | 0.01 |
⋮ | ⋮ | ⋮ |
Note: gleam assumes that the spectrum is given in the optical convention, i.e. wavelength (λ) versus flux density (Fλ). If the spectral axis is given in frequencies (ν), or if the y axis is given in λFλ, Fν, or νFν, gleam will not produce the intended results.
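If your spectra are stored as Fν against frequency, they need to be converted before running gleam. The sketch below uses the standard relations λ = c/ν and Fλ = Fν c/λ². The function names are hypothetical and not part of gleam, and the units (Angstrom, Hz, erg/s/cm²) are assumptions made for the example.

```python
# Speed of light in Angstrom per second.
C_ANGSTROM_PER_S = 2.99792458e18


def frequency_to_wavelength(frequency_hz):
    """Convert a frequency in Hz to a wavelength in Angstrom."""
    return C_ANGSTROM_PER_S / frequency_hz


def fnu_to_flambda(wavelength_angstrom, f_nu):
    """Convert F_nu (erg/s/cm^2/Hz) to F_lambda (erg/s/cm^2/Angstrom).

    Uses F_lambda = F_nu * c / lambda^2, with lambda in Angstrom.
    """
    return f_nu * C_ANGSTROM_PER_S / wavelength_angstrom**2


# Example: a flux density of 1e-26 erg/s/cm^2/Hz at 5000 Angstrom.
f_lambda = fnu_to_flambda(5000.0, 1e-26)
```

The error column has to be converted with the same factor as the flux.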
In order to identify sources across the spectra and the metadata files, a naming convention needs to be followed:
spec1d.Sample.Setup.Pointing.SourceNumber.fits
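As an illustration of the naming convention, the sketch below splits a spectrum file name into its four components. The helper and the SourceId type are hypothetical and not part of gleam. It also assumes that none of the components contains a dot, which the README does not state explicitly.

```python
from typing import NamedTuple


class SourceId(NamedTuple):
    """Components of a spectrum file name (hypothetical helper type)."""

    sample: str
    setup: str
    pointing: str
    source_number: str


def parse_spectrum_name(filename: str) -> SourceId:
    """Split 'spec1d.Sample.Setup.Pointing.SourceNumber.fits' into its parts."""
    prefix, suffix = "spec1d.", ".fits"
    if not (filename.startswith(prefix) and filename.endswith(suffix)):
        raise ValueError(f"not a spec1d fits file name: {filename!r}")
    # Strip the prefix and suffix, then split the remaining dotted name.
    parts = filename[len(prefix):-len(suffix)].split(".")
    if len(parts) != 4:
        raise ValueError(f"expected 4 dot-separated components, got {len(parts)}")
    return SourceId(*parts)


source = parse_spectrum_name("spec1d.Cosmos.Keck.P1.123.fits")
```

Here `source.setup` is "Keck" and `source.source_number` is "123", matching the example row of the metadata table.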
The metadata file contains information about individual sources in the project, such as the setup and pointing they were observed with, the source number to identify them and their redshift. The metadata file is used to pull information about each source. You can have a single metadata file or multiple ones, as long as sources are unique between them.
The metadata file can be in fits format or ASCII format (with commented header), but should contain the following columns:
Setup | Pointing | SourceNumber | Sample | Redshift |
---|---|---|---|---|
Keck | P1 | 123 | Cosmos | 1.2303 |
⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
Column descriptions:
Metadata files must start with "meta.":
meta.Sample.Setup.Pointing.fits
meta.Sample.Setup.Pointing.dat
meta.fits
The configuration file enables the user to customize the fitting at 4 levels: use the gleam defaults as much as possible, use a global set of parameters for an entire project, specify a number of telescope/instrument specific overrides or even specify overrides for individual sources.
A minimal working configuration example:
globals:
  line_table: line_lists/Main_optical_lines.fits
  resolution: 4.4 Angstrom
To report luminosities based on the fitted models, gleam uses the Hubble constant, Omega0 and the CMB temperature. While the default values of these parameters are reasonably accurate and up to date, your project may require slightly different values. You can use the cosmology section to override one or more of these parameters.
The same cosmology is used consistently across all the spectra within a project.
Here is another full example of a configuration file demonstrating how you could define custom cosmological parameters for your project (the values here happen to correspond to the defaults):
globals:
  line_table: line_lists/Main_optical_lines.fits
  resolution: 4.4 Angstrom
cosmology:
  H0: 75 km / (Mpc s)
  Om0: 0.3
  Tcmb0: 2.725 K
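To see how the cosmological parameters enter a luminosity, here is a rough sketch of the standard relation L = 4π d_L² F. It assumes a flat ΛCDM cosmology and neglects radiation, so Tcmb0 is not used. This is not gleam's implementation, and the function names are hypothetical. The default values mirror the configuration example above.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s
MPC_IN_CM = 3.0856775814913673e24  # one megaparsec in centimetres


def luminosity_distance_mpc(z, h0=75.0, om0=0.3, steps=1000):
    """Luminosity distance in Mpc for flat LambdaCDM (radiation neglected)."""

    def inv_e(zp):
        # 1 / E(z) for matter plus a cosmological constant.
        return 1.0 / math.sqrt(om0 * (1.0 + zp) ** 3 + (1.0 - om0))

    # Trapezoidal integration of 1/E(z) from 0 to z.
    dz = z / steps
    integral = sum(
        0.5 * (inv_e(i * dz) + inv_e((i + 1) * dz)) * dz for i in range(steps)
    )
    return (1.0 + z) * (C_KM_S / h0) * integral


def line_luminosity(flux, z, h0=75.0, om0=0.3):
    """Line luminosity in erg/s from a line flux in erg/s/cm^2."""
    d_cm = luminosity_distance_mpc(z, h0=h0, om0=om0) * MPC_IN_CM
    return 4.0 * math.pi * d_cm**2 * flux
```

A lower H0 gives larger distances and therefore larger luminosities for the same flux, which is why the same cosmology should be used for a whole project.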
The configuration has a few more fully customizable parameters, related to model fitting, line selection and sky absorption masking.
With gleam, you can analyze large numbers of spectra in a uniform manner, even with data taken in different conditions, with different instruments on different telescopes and for a wide variety of sources. To make it easy to capture the specifics of each spectrum, gleam offers you the possibility to specify parameters at three different levels.
The global level (globals) allows you to override the default configuration for all the spectra. This is the most coarse level of customization, while the next levels provide more fine-grained overrides.
The setup level (setups). Each spectrum belongs to exactly one setup. What setups mean is entirely up to you. In general, you would use this level to capture differences between telescopes or instruments, such as the spectral resolution. The configuration parameters specified at this level supersede the global configuration and the built-in defaults. Note that the setup name needs to match that in the corresponding sources.
The source level (sources) allows you to customize the parameters for each and every source. While this can be very helpful to account for some particularly troublesome cases, it should be used sparingly, both because of the associated typing burden and in the spirit of keeping the results comparable. The naming convention of the source should be in line with the input spectrum file, without the 'spec1d' and '.fits'. For example, the spectrum spec1d.Sample.Setup.Pointing.SourceNumber.fits corresponds to the source Sample.Setup.Pointing.SourceNumber.
The full structure of the configuration file is:
globals:
  <global overrides> ...
setups:
  <setup name>:
    <per-setup overrides> ...
  <setup name>:
    <per-setup overrides> ...
sources:
  <source locator>:
    <per-source overrides> ...
cosmology:
  H0: 75 km / (Mpc s)
  Om0: 0.3
  Tcmb0: 2.725 K
The parameters for each spectrum are computed by stacking the applicable overrides on top of the defaults, in order: first the global overrides, then the applicable per-setup overrides (if any) and finally the applicable per-source overrides.
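The stacking order can be sketched with plain dictionaries. This is an illustration of the described precedence and not gleam's own code. The helper names, the default values and the 3.0 Angstrom setup resolution are made up for the example. The recursive merge of nested fields such as fitting is an assumption.

```python
from copy import deepcopy


def merge(base, override):
    """Return a copy of base with override stacked on top (nested dicts merged)."""
    result = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = merge(result[key], value)
        else:
            result[key] = deepcopy(value)
    return result


def resolve_parameters(defaults, config, setup, source):
    """Stack global, per-setup and per-source overrides on the defaults."""
    params = merge(defaults, config.get("globals") or {})
    params = merge(params, (config.get("setups") or {}).get(setup) or {})
    params = merge(params, (config.get("sources") or {}).get(source) or {})
    return params


# Illustrative values only; these are not gleam's actual defaults.
defaults = {"mask_sky": False, "fitting": {"SN_limit": 2, "center": "free"}}
config = {
    "globals": {"resolution": "4.4 Angstrom"},
    "setups": {"Keck": {"resolution": "3.0 Angstrom"}},
    "sources": {
        "Cosmos.Keck.P1.123": {"mask_sky": True, "fitting": {"center": "fixed"}}
    },
}
params = resolve_parameters(defaults, config, "Keck", "Cosmos.Keck.P1.123")
```

For this source the setup resolution wins over the global one, mask_sky comes from the per-source override, and SN_limit falls through from the defaults.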
Here are the parameters that can be overridden at the global, setup or source level.
There are two parameters that you can use to control whether the fitting should ignore portions of the spectrum where sky bands may not have been reliably subtracted.
First is the path to a catalog which defines the wavelength intervals of sky bands. The file it points to must be in the fits file format. See below for exact details of how to create this file.
sky: line_lists/Sky_bands.fits
The second is a flag that enables or disables the masking of all the sky bands in the spectrum.
mask_sky: True
By default (i.e. if no sky or mask_sky overrides are applied to a source), there is no sky masking and the entire spectrum is used. In order for masking to take place, sky must be set appropriately and mask_sky must be set to True.
Note that the two overrides don't need to be specified at the same level. For example, you might want to specify sky at the global level and then just set mask_sky to True for individual sources (or setups) for which the sky subtraction is inadequate.
By default, gleam will fit all lines listed in the line table. Otherwise, only the lines specified under "lines" will be fitted. The line names must match those in the line table.
line_table: line_lists/Main_optical_lines.fits # line catalog
lines: # names of lines to select from line table
- Hb
- OIII4
- OIII5
- OII
- Ha
- NII1
- SII1
- SII2
There are a number of parameters that each affect the way the line models are fit to the data. To distinguish them visually in the configuration file, they are grouped under the fitting field. However, they can be individually overridden at any of the three levels (global, per-setup and per-source).
fitting:
  # Signal to noise limit for a detection
  SN_limit: 2
  # Tolerance for considering emission lines part of a group and thus fitting them together
  tolerance: 26.0 Angstrom
  # Range around either side of the Gaussian center to be probed
  w: 3.0 Angstrom
  # Wavelength range used to mask lines
  mask_width: 20.0 Angstrom
  # Range used for selecting continuum left and right of the source
  cont_width: 70.0 Angstrom
  # Constraints on the center of each gaussian. The options are:
  #  - free: (default) the center can be anywhere within the fitting range
  #  - constrained: the center must be within a distance of `w` from the expected
  #    position specified in the `line_table`
  #  - fixed: the center is fixed to the expected position of the corresponding
  #    line as specified in the `line_table`
  center: constrained
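To give a feel for the tolerance parameter, the sketch below groups lines whose neighbouring wavelengths lie within the tolerance of each other. This is only one plausible reading of the parameter: the chaining criterion, and whether gleam compares rest-frame or observed wavelengths, are assumptions. The Ha and NII1 wavelengths come from the line catalog example. The Hb vacuum wavelength is added here for illustration.

```python
def group_lines(lines, tolerance=26.0):
    """Group line names whose adjacent wavelengths differ by at most tolerance.

    lines maps a line name to its wavelength in Angstrom.
    """
    ordered = sorted(lines.items(), key=lambda item: item[1])
    groups = []
    for name, wavelength in ordered:
        # Extend the current group if this line is close to the previous one.
        if groups and wavelength - groups[-1][-1][1] <= tolerance:
            groups[-1].append((name, wavelength))
        else:
            groups.append([(name, wavelength)])
    return [[name for name, _ in group] for group in groups]


lines = {"Ha": 6564.614, "NII1": 6585.27, "Hb": 4862.68}
groups = group_lines(lines, tolerance=26.0)
```

With a tolerance of 26 Angstrom, Ha and NII1 (about 20.7 Angstrom apart) end up in one group and would be fit together, while Hb forms a group of its own.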
The line catalog contains a list of lines to draw from when fitting. It should be in the fits format (preferably with units) and contain columns for the line name, wavelength and (optionally) the LaTeX representation of the line name (which is only used when plotting).
line | wavelength | latex |
---|---|---|
Ha | 6564.614 | H$\boldsymbol{\alpha}$ |
NII1 | 6585.27 | [N{\sc ii}] |
⋮ | ⋮ | ⋮ |
A subset of the lines can be specified in the configuration file, otherwise the entire list of lines from the catalog will be used for fitting.
In cases where the sky correction is not done perfectly, your data may still be affected by sky absorption or emission. You can specify a list of bands (in observed wavelength units) to avoid. These bands will be masked and disregarded for fitting and treated as if no spectral coverage is available (i.e. no upper limits will be reported). Masking of the sky can be turned on and off in the config file.
The sky catalog must be in the fits format (preferably with units) and contain the following columns:
band | wavelength_min | wavelength_max |
---|---|---|
Aband | 7586.0 | 7658.0 |
Bband | 6864.0 | 6945.0 |
⋮ | ⋮ | ⋮ |
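The effect of sky masking can be sketched as a simple selection on observed wavelength. The function below is a hypothetical illustration and not gleam's code. It assumes that the band edges are inclusive. The bands are taken from the example table above.

```python
def sky_mask(wavelengths, bands, mask_sky=False):
    """Return a list of booleans, True where a pixel is kept for fitting.

    wavelengths are observed wavelengths; bands is a list of
    (wavelength_min, wavelength_max) pairs in the same units.
    """
    # Masking only happens if it is enabled and a band list is available.
    if not mask_sky or not bands:
        return [True] * len(wavelengths)
    return [
        not any(low <= wl <= high for low, high in bands) for wl in wavelengths
    ]


bands = [(7586.0, 7658.0), (6864.0, 6945.0)]  # Aband and Bband
wavelengths = [6800.0, 6900.0, 7600.0, 8972.34]
keep = sky_mask(wavelengths, bands, mask_sky=True)
```

In this example the pixels at 6900 and 7600 Angstrom fall inside the B and A bands and are dropped, while the other two are kept.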
For each of the sources in your sample, gleam will produce a table with all of the line fits and upper limits (if possible, with units derived from the input data). Each fitted line is represented in a separate row, with the corresponding line fit details contained in different columns. The table contains information ranging from the expected wavelength of the line and the redshift of the source to emission line fit parameters, line fluxes and equivalent widths.
All of the output files will start with "linefits" and follow the naming convention described above.
An example of the header of such a table and a description of the columns can be found below. Since there are many columns, they are listed here in multiple groups.
line | wavelength | latex | z | zline | zline_err |
---|---|---|---|---|---|
Ha | 6564.614 | H$\boldsymbol{\alpha}$ | 0.26284 | 0.2629235 | 3.3654374E-5 |
zoffset | zoffset_err | cont | cont_err | wl | wl_err |
---|---|---|---|---|---|
8.3521445E-5 | 3.3654374E-5 | 0.0012948853 | 1.11991896E-4 | 6565.162 | 0.22092798 |
height | height_err | sigma | sigma_err | amplitude | amplitude_err |
---|---|---|---|---|---|
0.017354544 | 0.001047185 | 2.9371586 | 0.2215329 | 0.12777047 | 0.008491282 |
flux | flux_err | luminosity | luminosity_err | EWrest | EWrest_err |
---|---|---|---|---|---|
0.12777047 | 0.008491282 | 0.27209833 | 0.018082924 | 98.673195 | 10.762495 |
FWHM | FWHM_err | v | v_err | detected | covered |
---|---|---|---|---|---|
6.9164796 | 0.5216701 | 118.18059 | 23.823605 | true | true |
A description of each column:
If the input spectrum has units, the line parameters will also be reported with units. When the spectral line is not covered by the spectrum, fit values and errors are omitted. If a line is not detected, gleam only reports an upper limit in the amplitude column and omits all other parameters. The FWHM and the velocity are only reported if the line is spectrally resolved.
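Several columns in the example row above are related through standard Gaussian identities. The sketch below recomputes them from the example values. The FWHM and area relations hold for any Gaussian. The equivalent width, zoffset and zline relations are inferred from the example row and are not confirmed as gleam's exact definitions. The function name is hypothetical.

```python
import math


def gaussian_relations(row):
    """Recompute derived columns from the basic Gaussian fit parameters."""
    return {
        # Full width at half maximum of a Gaussian with standard deviation sigma.
        "FWHM": 2.0 * math.sqrt(2.0 * math.log(2.0)) * row["sigma"],
        # Integrated area under a Gaussian of a given height and sigma.
        "amplitude": row["height"] * row["sigma"] * math.sqrt(2.0 * math.pi),
        # Equivalent width as line flux over continuum level (inferred).
        "EWrest": row["flux"] / row["cont"],
        # Redshift offset from the fitted versus expected wavelength (inferred).
        "zoffset": row["wl"] / row["wavelength"] - 1.0,
        # Line redshift as source redshift plus offset (inferred).
        "zline": row["z"] + row["zoffset"],
    }


# Values copied from the Ha example row above.
row = {
    "wavelength": 6564.614, "z": 0.26284, "zoffset": 8.3521445e-5,
    "cont": 0.0012948853, "wl": 6565.162, "height": 0.017354544,
    "sigma": 2.9371586, "flux": 0.12777047,
}
derived = gaussian_relations(row)
```

The recomputed values agree with the tabulated FWHM, amplitude, EWrest and zline to within the precision of the table, which makes this a convenient sanity check on an output file.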
If plotting is enabled, gleam produces two types of figures. The first shows the entire spectrum with zoom-ins on the emission line fits. The second focuses on each individual line fit. Areas masked by sky are shaded gray for clarity.
NOTE: Plotting high quality figures makes gleam very slow (a factor of at least 15 slower than without it). Matplotlib with LaTeX has some memory leak issues, which can cause gleam to slowly consume all the memory. I recommend avoiding batch processing more than 500 sources when also creating plots.
The recommended way to get gleam is from the Python Package Index (PyPI). The package is named astro-gleam on PyPI:
pip install astro-gleam
In rare cases, when you need to install a version that is not published on PyPI, you can install gleam directly from the source repository:
pip install git+https://github.com/multiwavelength/gleam
You can learn about what options are available when installing from source by reading the official documentation.
If you get an error that contains the following message, it means that your Linux system is missing the C development headers for your version of Python.
fatal error: Python.h: No such file or directory
4 | #include "Python.h"
| ^~~~~~~~~~
compilation terminated.
To fix the issue, you need to install the dev package for your version of Python. Different Linux distributions have slight differences in naming, but here is what the command would look like on a Debian-based distribution (e.g. Ubuntu, Mint) for Python 3.10:
sudo apt install libpython3.10-dev
After this, the command to install gleam should work.
If you use gleam in your published projects or as a dependency in your code, please include a citation to the companion paper in the Astronomical Journal, as well as a citation to the repository through Zenodo:
Citation: Andra Stroe and Victor-Nicolae Savu 2021 AJ 161 158