Closed jchiang87 closed 3 years ago
Checking the `varParamStr` column for all of the entries in our star db file, `/global/projecta/projectdirs/lsst/groups/SSim/DC2/dc2_stellar_healpixel.db`, it appears that there are 3 models for stellar variability that we are using, and entries where `varParamStr == None`, which I assume means a non-variable star. Here is the breakdown:
Model | Number of entries |
---|---|
'MLT' | 11296954 |
'kplr' | 8366862 |
'applyRRly' | 1196 |
'None' | 604961 |
Total | 20269973 |
The different models are implemented in https://github.com/lsst/sims_catUtils/blob/master/python/lsst/sims/catUtils/mixins/VariabilityMixin.py, including some we aren't using.
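A tally like the one in the table above could be reproduced along the following lines. This is only a sketch, not the code that was actually used: it assumes the table is named `stars`, that `varParamStr` holds a JSON string, and that the model name sits under the key `m` (or `varMethodName` in an older layout). The toy rows and their parameter values are made up for illustration.

```python
import json
import sqlite3
from collections import Counter


def count_variability_models(conn, table="stars"):
    """Tally the variability model names found in the varParamStr column."""
    counts = Counter()
    for (var_param_str,) in conn.execute(f"SELECT varParamStr FROM {table}"):
        # NULL or the literal string 'None' is treated as a non-variable star.
        if var_param_str is None or var_param_str == "None":
            counts["None"] += 1
            continue
        params = json.loads(var_param_str)
        # Assumed key names: 'm', falling back to 'varMethodName'.
        counts[params.get("m", params.get("varMethodName", "unknown"))] += 1
    return counts


# Toy in-memory table standing in for dc2_stellar_healpixel.db.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stars (simobjid INT, varParamStr TEXT)")
conn.executemany(
    "INSERT INTO stars VALUES (?, ?)",
    [
        (1, '{"m": "MLT", "p": {"lc": "toy_lc", "t0": 0.0}}'),
        (2, '{"m": "kplr", "p": {"lc": 123, "t0": 0.0}}'),
        (3, '{"m": "kplr", "p": {"lc": 456, "t0": 1.0}}'),
        (4, None),
    ],
)
model_counts = count_variability_models(conn)
print(dict(model_counts))
```

For the real file, the connection would point at the `.db` path instead of `:memory:`, and iterating row by row keeps memory use flat over the ~2e7 entries.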
This is useful. I was actually looking at `/global/cscratch1/sd/descim/star_truth/star_truth_summary.db`, and it has the following columns:
CREATE TABLE column_descriptions
(name text, description text, dtype text);
CREATE TABLE truth_summary
(id TEXT, host_galaxy BIGINT, ra DOUBLE, dec DOUBLE,
redshift FLOAT, is_variable INT, is_pointsource INT,
flux_u FLOAT, flux_g FLOAT, flux_r FLOAT,
flux_i FLOAT, flux_z FLOAT, flux_y FLOAT,
flux_u_noMW FLOAT, flux_g_noMW FLOAT, flux_r_noMW FLOAT,
flux_i_noMW FLOAT, flux_z_noMW FLOAT, flux_y_noMW FLOAT);
The table you are referring to has the following schema:
simobjid int, htmid_6 int, ra real, decl real,
gal_l real, gal_b real, magNorm real,
mura real, mudecl real, parallax real,
ebv real, radialVelocity real, varParamStr text,
sedFilename text,
umag real, gmag real, rmag real, imag real,
zmag real, ymag real, hpid int);
What is the relationship between them?
Are these linked by the `stars.simobjid==truth_summary.id`?
Or should I not use the first table?
> What is the relationship between them?

The tables in `star_truth_summary.db` are derived from the info in `dc2_stellar_healpixel.db`.

> Are these linked by the `stars.simobjid==truth_summary.id`?

Yes, that's right. Unfortunately, the instance catalog (and centroid file) ids have a further encoding: `uniqueId = stars.simobjid*1024 + 4`, presumably to mirror the `uniqueId` construction for the separate galaxy components. We were planning to update the `star_truth_summary.db` file with the instance catalog ids, but if it is more useful to match ids in the `dc2_stellar_healpixel.db` file, it's probably better to leave the `star_truth_summary.db` file as-is. Comments welcome on this!

> Or should I not use the first table?

I think it's ok to use either table.
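For matching instance catalog ids back to either table, the encoding quoted above can be inverted with integer arithmetic. The helper names below are hypothetical; only the `simobjid*1024 + 4` relation comes from this thread.

```python
ID_MULTIPLIER = 1024
STAR_OBJECT_TYPE = 4  # offset quoted in this thread for stars


def simobjid_to_unique_id(simobjid):
    """Encode a stars.simobjid as an instance catalog uniqueId."""
    return int(simobjid) * ID_MULTIPLIER + STAR_OBJECT_TYPE


def unique_id_to_simobjid(unique_id):
    """Decode an instance catalog uniqueId back to stars.simobjid."""
    simobjid, object_type = divmod(int(unique_id), ID_MULTIPLIER)
    if object_type != STAR_OBJECT_TYPE:
        raise ValueError(f"uniqueId {unique_id} does not carry the star offset")
    return simobjid


unique_id = simobjid_to_unique_id(1234567)
print(unique_id, unique_id_to_simobjid(unique_id))
```

Since `truth_summary.id` is declared as `TEXT`, the decoded value would need to be passed as `str(simobjid)` when used in a query against that table.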
I've created an sqlite3 db table with the means and standard deviations of the `delta_mag` values produced by the lsst_sims stellar variability code. The sqlite3 file at NERSC is `/global/cscratch1/sd/jchiang8/desc/Run2.2i/stellar_variability/merged_star_db/star_lc_stats.db` and it contains a table with this schema:
CREATE TABLE stellar_variability
(id TEXT, model TEXT, mean_u, mean_g, mean_r,
mean_i, mean_z, mean_y, stdev_u, stdev_g,
stdev_r, stdev_i, stdev_z, stdev_y);
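As a rough illustration of how one row of such a table could be filled, the sketch below computes per-band statistics from a set of `delta_mag` samples. It is not the code used to build `star_lc_stats.db`: the helper name and toy values are made up, and the use of the population standard deviation (the same convention as `numpy.std` with its default `ddof=0`) is an assumption.

```python
import sqlite3
import statistics

BANDS = "ugrizy"


def lc_stats_row(star_id, model, delta_mags):
    """Build one table row from a dict mapping band -> delta_mag samples."""
    means = [statistics.fmean(delta_mags[band]) for band in BANDS]
    # Population standard deviation; this choice is an assumption.
    stdevs = [statistics.pstdev(delta_mags[band]) for band in BANDS]
    return (str(star_id), model, *means, *stdevs)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stellar_variability "
    "(id TEXT, model TEXT, mean_u, mean_g, mean_r, mean_i, mean_z, mean_y, "
    "stdev_u, stdev_g, stdev_r, stdev_i, stdev_z, stdev_y)"
)

# Toy achromatic light curve: identical delta_mag samples in every band.
toy_delta_mags = {band: [0.0, 0.01, -0.01, 0.02] for band in BANDS}
row = lc_stats_row(42, "kplr", toy_delta_mags)
placeholders = ",".join("?" * len(row))
conn.execute(f"INSERT INTO stellar_variability VALUES ({placeholders})", row)
conn.commit()
```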
The `id` values are the same as in `/global/cscratch1/sd/descim/star_truth/star_truth_summary.db`. Here are hexbin plots of `mean(delta_mag_i)` (= `mean_i`) vs `std(delta_mag_i)` (= `stdev_i`) for each of the three non-constant models:
and randomly-selected example light curves for the `kplr` and `applyRRly` models:
Here is the code to produce the `stellar_variability` table.
@BrunoSanchez Let me know if this looks useful! Suggestions welcome.
The catalog that contains the stellar parameters that we've been using for DC2, `/global/projecta/projectdirs/lsst/groups/SSim/DC2/dc2_stellar_healpixel.db`, covers a much larger area than the DC2 300 sq deg region. Here's a hexbin plot of the `ra, decl` values in that file:
The dashed line is the DC2 boundary. Since our instance catalogs use a radius of 2.1 degrees, we generate data outside of the DC2 boundary by that amount, so I've defined the dotted region, whose boundary is at least 2.1 degrees outside of the DC2 region. There are 6883094 stars in that dotted region. For the catalogs I'll be pointing to later today, I've restricted the data to those objects.
Thanks so much Jim. I just saw this, sorry. I will check this out and let you know.
I've prepared two new files, both in `/global/homes/j/jchiang8/scratch/desc/Run2.2i/stellar_variability`:
- `star_lc_stats_trimmed.db` has the same schema and column values as the `star_lc_stats.db` file I mentioned in my previous comment, but down-selected to include only the 6883094 stars in the dotted region in the above figure.
- `star_variability_truth.db` has the per-visit delta flux values (in nJy) for the stars from `star_lc_stats_trimmed.db` that have `stdev` values greater than 1 mmag in any band. Applying that cut yields 755468 stars and 1209321593 (1.2e9) flux entries.

For the `stellar_variability_truth` table, the schema is
CREATE TABLE stellar_variability_truth
(id TEXT, obsHistID INTEGER, MJD FLOAT, bandpass TEXT,
delta_flux FLOAT);
I plan to update that file with indexing on `id` and `obsHistID`. I haven't tried to use it to assess access performance. We may wish to consider different formatting for the light curve data.
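The indexing mentioned above, and a per-star light curve lookup that would benefit from it, might look roughly like this. The index names, the helper function, and the toy rows are all hypothetical; only the table schema is taken from this comment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stellar_variability_truth "
    "(id TEXT, obsHistID INTEGER, MJD FLOAT, bandpass TEXT, delta_flux FLOAT)"
)
# Toy rows: (id, obsHistID, MJD, bandpass, delta_flux in nJy).
conn.executemany(
    "INSERT INTO stellar_variability_truth VALUES (?, ?, ?, ?, ?)",
    [
        ("1001", 20, 59583.2, "i", -3.0),
        ("1001", 10, 59580.1, "i", 12.5),
        ("1001", 11, 59580.2, "r", 9.0),
        ("2002", 10, 59580.1, "i", 0.5),
    ],
)

# One index per lookup column, as proposed in the comment above.
conn.execute(
    "CREATE INDEX IF NOT EXISTS idx_svt_id ON stellar_variability_truth (id)"
)
conn.execute(
    "CREATE INDEX IF NOT EXISTS idx_svt_obsHistID "
    "ON stellar_variability_truth (obsHistID)"
)


def light_curve(conn, star_id, bandpass):
    """Return (MJD, delta_flux) pairs for one star and band, ordered by time."""
    query = (
        "SELECT MJD, delta_flux FROM stellar_variability_truth "
        "WHERE id = ? AND bandpass = ? ORDER BY MJD"
    )
    # id is a TEXT column, so the key is passed as a string.
    return conn.execute(query, (str(star_id), bandpass)).fetchall()


lc = light_curve(conn, 1001, "i")
index_names = {
    name
    for (name,) in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'index'"
    )
}
```

With 1.2e9 rows, building the indexes once on the file is likely far cheaper than repeated full-table scans, though actual access performance would still need to be measured.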
Hi Jim, I have found that the star standard deviations are identical for the 4 bandpasses. This might be an error. Though it is not an urgent matter, it would be nice to have those values. Thanks.
They will be identical for the `kplr` models (see the light curve plot above). Is that also true for the `applyRRly` stars?
FWIW, here is the implementation for the `kplr` stars where the `delta_mag` values for all six bands are set identically: https://github.com/lsst/sims_catUtils/blob/master/python/lsst/sims/catUtils/mixins/VariabilityMixin.py#L1332
Ok, sorry. This is correct. If `kplr` stands for transits, then the variability should be achromatic; I forgot about this.
I know where the confusion is coming from. When I join the `stellar_variability` stats table with the `truth_summary` table, filtered by `is_variable==1`, using the column `id`, the only models that I get crossmatched are `MLT` and `kplr`.
Is there an obvious reason for this?
There should be `applyRRly` stars that are also matched. The code that sets the `is_variable` flag just looks for `varParamStr`, which does exist for the `applyRRly` objects. I'll look into it.
Well, I have counted only 371 RRLyr stars, and none of them falls in the box I am working on. So this explains it. Sorry for bothering you!
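For reference, the crossmatch discussed in this exchange can be sketched as a join on `id` with a box cut, which shows how a rare model can drop out of a small region. The toy rows, the reduced column lists, and the helper name are made up. Both tables are placed in one in-memory database here; with the real files, which are separate sqlite3 databases, one would likely need `ATTACH DATABASE` first. The simple `BETWEEN` cut on `ra` also ignores wrap-around at 0/360 degrees.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE truth_summary (id TEXT, ra DOUBLE, dec DOUBLE, is_variable INT);
    CREATE TABLE stellar_variability (id TEXT, model TEXT, stdev_i);
    INSERT INTO truth_summary VALUES
        ('1', 60.0, -35.0, 1),
        ('2', 60.5, -35.5, 1),
        ('3', 61.0, -36.0, 0),
        ('4', 70.0, -40.0, 1);
    INSERT INTO stellar_variability VALUES
        ('1', 'MLT', 0.02),
        ('2', 'kplr', 0.001),
        ('4', 'applyRRly', 0.3);
    """
)


def model_counts_in_box(conn, ra_min, ra_max, dec_min, dec_max):
    """Count variable stars per model inside an (ra, dec) box."""
    query = """
        SELECT sv.model, COUNT(*)
        FROM truth_summary AS ts
        JOIN stellar_variability AS sv ON sv.id = ts.id
        WHERE ts.is_variable = 1
          AND ts.ra BETWEEN ? AND ?
          AND ts.dec BETWEEN ? AND ?
        GROUP BY sv.model
    """
    return dict(conn.execute(query, (ra_min, ra_max, dec_min, dec_max)))


small_box = model_counts_in_box(conn, 59.0, 62.0, -37.0, -34.0)
full_area = model_counts_in_box(conn, 0.0, 360.0, -90.0, 90.0)
print(small_box, full_area)
```

In the toy data the single `applyRRly` star lies outside the small box, so it appears only in the full-area count.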
These tables have been generated and are available via Postgres at NERSC. An example notebook showing how to access those tables is available. That notebook is also linked to the DC2 Data Product Overview page.
We will have variability truth tables for stars that report the model fluxes in each visit, but it would be useful to have a table that summarizes the variability properties for the stars as well. This issue will be used to gather input on the table columns to provide and to track the implementation and delivery.