LSSTDESC / DC2-production

Configuration, production, validation specifications and tools for the DC2 Data Set.
BSD 3-Clause "New" or "Revised" License

What model of AGN variability will we use for DC2? #55

Closed jchiang87 closed 5 years ago

jchiang87 commented 6 years ago

In this comment, Eve notes that the galaxy catalogs produced by the CosmoSims code have black hole mass and accretion rate columns. In the subsequent comment, Scott says it will take some work to incorporate those parameters into a model of variability for the AGNs that CatSim would serve up. However, table 2 of the DC2 planning document (read-only) refers to a "damped random walk model" which must be something else.

@drphilmarshall Thoughts on what we should do for (presumably non-lensed) AGNs?

dannygoldstein commented 6 years ago

Also relevant for #59.

danielsf commented 6 years ago

It's an old joke that whenever I meet someone new at a meeting, they introduce themselves by saying "I have a better AGN variability model than you do"....

Currently, CatSim models AGN using the damped random walk model from MacLeod et al. (2010; ApJ 721, 1014), which Zeljko gave us (Zeljko being MacLeod's PhD advisor). This model is parametrized in terms of the structure function of the random walk*. We could certainly implement a different variability model that depends on the parameters in the DC2 catalog. We could also find a mapping between black hole accretion rate and the structure function. I am not an expert in any of the underlying physics, but someone in this collaboration must be.

*For those who, like me, did not know anything about damped random walks before having this conversation for the first time, the structure function is the asymptotic RMS variability of the AGN on long time scales.
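To make the footnote concrete, here is a minimal sketch (not CatSim's implementation) that simulates a damped random walk as an Ornstein-Uhlenbeck process and measures the RMS magnitude difference at long time lags. The function name and the parameter values are made up for illustration, and the sketch assumes the convention SF_inf = sqrt(2) * sigma, where sigma is the long-term standard deviation of the light curve about its mean.

```python
import numpy as np


def simulate_drw(times, tau, sf_inf, rng):
    """Simulate a zero-mean damped random walk (illustrative helper).

    times  : increasing observation times in days
    tau    : characteristic damping timescale in days
    sf_inf : structure function at infinite time lag, in magnitudes
    rng    : numpy random Generator
    """
    times = np.asarray(times, dtype=float)
    # Assumed convention: SF_inf = sqrt(2) * sigma, with sigma the stationary
    # standard deviation of the process.
    sigma = sf_inf / np.sqrt(2.0)
    noise = rng.normal(size=len(times))
    dmag = np.empty(len(times))
    dmag[0] = sigma * noise[0]
    for i in range(1, len(times)):
        # Exact update of an Ornstein-Uhlenbeck process over a step of
        # arbitrary length: decay toward zero plus a fresh Gaussian kick.
        decay = np.exp(-(times[i] - times[i - 1]) / tau)
        dmag[i] = dmag[i - 1] * decay + sigma * np.sqrt(1.0 - decay**2) * noise[i]
    return dmag


def rms_difference(light_curve, lag_steps):
    """RMS magnitude difference between points separated by lag_steps samples."""
    diff = light_curve[lag_steps:] - light_curve[:-lag_steps]
    return np.sqrt(np.mean(diff**2))


rng = np.random.default_rng(42)
t = np.arange(0.0, 200000.0, 1.0)  # one sample per day
lc = simulate_drw(t, tau=100.0, sf_inf=0.2, rng=rng)
# At lags much longer than tau, the RMS difference should approach sf_inf.
print(rms_difference(lc, lag_steps=1000))
```

At lags much shorter than tau the same measurement gives a much smaller value, which is the sense in which SF_inf is the long-time-scale limit.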

jchiang87 commented 6 years ago

In my previous life, I wrote a bunch of papers on AGN variability. I don't claim to have a better model, but I can help sort out something that makes sense for DC2.

cdfassnacht commented 6 years ago

One thing that we may want to consider is that, if we have some model that is not a random walk but is one that we think is OK, we may want to stick in some sources with variability generated using that model. I have no idea if this is feasible, and it may make the complexity of the process too high. However, I worry when we are using the same model to generate things (in this case light curves, eventually) that we might use to evaluate them. Maybe this isn't a big deal here, but the underlying unease that I have is that if we tune our analysis code to produce the best results from these simulations we may be losing some ability once we are confronted with real data. As I say, it's not clear to me that this is actually a worry, but it would be nice to see if our codes deal well with data generated by multiple models.


-- Chris Fassnacht Professor, Physics Dept. UC Davis 1 Shields Ave. Davis, CA 95616 +1-530-554-2600

rmandelb commented 6 years ago

Could we invert the problem? That is, stick with data generated with one model, but then evaluate how well analysis codes that assume multiple models work? Will that teach us as much as the process you described? (or at least will it teach us enough for what we wanted to learn in DC2?)

cdfassnacht commented 6 years ago

Hi Rachel,

I'm not sure if your suggestion would address what worries me, since presumably the codes that did better at handling the data generated by the damped random walk model would be evaluated as "better". However, all of this may be moot for DC2. Maybe it is fine to get something that does well with damped random walk light curves now and then worry about whether that paints us into a corner later on.

Chris


rmandelb commented 6 years ago

> I'm not sure if your suggestion would address what worries me, since presumably the codes that did better at handling the data generated by the damped random walk model would be evaluated as "better".

That's true, they would be. But that's not quite my point. Consider two scenarios:

  1. The codes that model the light curves as damped random walk do drastically better than the codes that model the light curves in some other way that we think is at least reasonably well-motivated.

  2. The codes that model the light curves as damped random walk do only slightly better than the codes that model the light curves in some other way that we think is at least reasonably well-motivated.

In the first case, what we've learned is that our analyses depend sensitively on the match between the real and assumed AGN variability model, and so for DC3 we should spend some time/effort in getting the latest and greatest in the sims (and we also need to improve our modeling codes to avoid this sensitivity and marginalize over any unknown features in AGN variability). In the second case, what we've learned is that multiple models that we think are physically plausible do nearly as well as each other, so we're not too sensitive to the AGN variability model in the sims, and we can focus on other things, both in DC3 sims and in developing modeling codes.

I admit I may be missing something, as I'm not an expert, but I feel this could still be informative.

cdfassnacht commented 6 years ago

Hi Rachel,

I hadn't thought about it that way. What you say makes sense, so we probably don't have to worry about multiple generative models at this point.

Ciao,

Chris


drphilmarshall commented 6 years ago

Happy New Year, all!

@danielsf, I guess the Simplest Possible Thing we could do for DC2 is just apply the MacLeod model to the DC2 AGN hosts. What galaxy properties (if any) would be needed to make these assignments? Average i-band magnitude? Redshift? None at all?

jchiang87 commented 6 years ago

Does the "MacLeod model" assume the emission seen by LSST is ultimately from the central BH accretion disk? If so, then to first order, the luminosity should scale as the accretion rate, and the rest frame variability time scales should scale as the central BH mass. If that model is physically motivated at all, then surely those are two of the most relevant parameters.

danielsf commented 6 years ago

The model was actually constructed before my time at UW, I just maintain it now, so I can't answer this question without doing some research.

The reference in question is ApJ 721, 1014, if anyone wants to read along with me. I will try to have something intelligent to say by Monday.

danielsf commented 6 years ago

The MacLeod et al. model is based on two parameters: the characteristic timescale of the random walk (\tau) and the structure function at infinite time lag (SF_inf).

Their Equation (7) and Table 1 give expressions that allow us to solve for these parameters given the black hole mass, the absolute magnitude of the AGN in the i band, and the rest frame wavelength (so that each band [ugrizy] will have a different \tau and SF_inf). Presumably, we will want to introduce some scatter about this relationship. I'm not 100% sure how to accomplish that. Table 1 gives 1-sigma errors on the various coefficients in Equation (7). We could just use random draws from those distributions for the actual coefficients in our code, so that \tau and SF_inf don't perfectly align with the Equation (7) expression.

Note: in the past I have said that the model as currently implemented in CatSim implements an independent random walk for each band. I was mistaken. There is a single random walk that is rescaled by the different SF_inf values for each band. Do we want to fix that before DC2? My reading of MacLeod et al. is that the six bands should, indeed, each be represented by an independent random walk.
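The difference between the two schemes can be sketched as follows. This is an illustration only, not the CatSim code: the function names and SF_inf values are hypothetical, the sampling is on a regular grid, and a single tau is used for all bands for simplicity (the model itself gives each band its own tau).

```python
import numpy as np


def unit_drw(n_steps, dt, tau, rng):
    """Unit-variance damped random walk on a regular grid (an AR(1) process)."""
    decay = np.exp(-dt / tau)
    kick = np.sqrt(1.0 - decay**2)
    noise = rng.normal(size=n_steps)
    walk = np.empty(n_steps)
    walk[0] = noise[0]
    for i in range(1, n_steps):
        walk[i] = decay * walk[i - 1] + kick * noise[i]
    return walk


def shared_walk(sf_inf_by_band, n_steps, dt, tau, rng):
    """One random walk, rescaled by each band's SF_inf."""
    base = unit_drw(n_steps, dt, tau, rng)
    return {band: sf / np.sqrt(2.0) * base for band, sf in sf_inf_by_band.items()}


def independent_walks(sf_inf_by_band, n_steps, dt, tau, rng):
    """A separate random walk realization for every band."""
    return {band: sf / np.sqrt(2.0) * unit_drw(n_steps, dt, tau, rng)
            for band, sf in sf_inf_by_band.items()}


rng = np.random.default_rng(0)
sf_inf = {"g": 0.3, "i": 0.2}  # hypothetical values, in magnitudes
shared = shared_walk(sf_inf, 50000, 1.0, 20.0, rng)
indep = independent_walks(sf_inf, 50000, 1.0, 20.0, rng)
# Shared walk: bands are perfectly correlated. Independent walks: they are not.
print(np.corrcoef(shared["g"], shared["i"])[0, 1])
print(np.corrcoef(indep["g"], indep["i"])[0, 1])
```

With a shared walk the colour of the AGN varies only through the fixed ratio of SF_inf values, whereas independent walks let the bands drift apart; which behaviour is wanted is exactly the question raised above.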

danielsf commented 6 years ago

In order to apply the MacLeod et al. model to the AGN in protoDC2, we need to assign a characteristic timescale and a structure function at infinite time lag to the AGN. MacLeod et al. equation (7) and table 1 give an expression for these as a function of rest frame wavelength and absolute i-band magnitude. Unfortunately, the only parameters we have for our AGN are mass and accretion rate. I propose the following scheme for using these to get absolute i-band magnitude (rest frame wavelength is easy given the AGN's redshift):

The left panel of Figure 15 in MacLeod et al. shows that, given the mass of the black hole (which we have) and the ratio of the AGN's luminosity to its Eddington luminosity, one can read off the absolute i-band magnitude. If we follow the expressions in this lecture

http://www-astro.physics.ox.ac.uk/~garret/teaching/lecture7-2012.pdf

we can approximate the AGN's luminosity as

L = epsilon * (accretion rate) * c^2

taking epsilon=0.1. The Eddington luminosity is just a function of the black hole mass and some physical constants. I will therefore use these parameters and MacLeod et al. Figure 15 to approximate the absolute i-band magnitude, and then use MacLeod et al. equation 7 to get the needed AGN variability parameters.
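As a rough sketch of the first half of this scheme, the Eddington ratio can be computed as below. The function name is hypothetical, the accretion rate is assumed to be supplied in solar masses per year (the catalog's actual units should be checked), and the Eddington luminosity uses the standard value of about 1.26e38 erg/s per solar mass for ionized hydrogen.

```python
import numpy as np

# Physical constants in CGS units
C_LIGHT = 2.998e10        # speed of light [cm/s]
MSUN_G = 1.989e33         # solar mass [g]
SEC_PER_YR = 3.156e7      # seconds per year
L_EDD_PER_MSUN = 1.26e38  # Eddington luminosity per solar mass [erg/s]


def eddington_ratio(mbh_msun, mdot_msun_per_yr, epsilon=0.1):
    """Return L / L_Eddington for L = epsilon * (accretion rate) * c^2.

    mbh_msun         : black hole mass in solar masses
    mdot_msun_per_yr : accretion rate in solar masses per year (assumed units)
    epsilon          : radiative efficiency
    """
    mbh_msun = np.asarray(mbh_msun, dtype=float)
    mdot_cgs = np.asarray(mdot_msun_per_yr, dtype=float) * MSUN_G / SEC_PER_YR
    luminosity = epsilon * mdot_cgs * C_LIGHT**2  # [erg/s]
    l_eddington = L_EDD_PER_MSUN * mbh_msun       # [erg/s]
    return luminosity / l_eddington


# A 1e8 solar mass black hole accreting 1 solar mass per year
print(eddington_ratio(1.0e8, 1.0))
```

Because the ratio scales as (accretion rate) / (mass), low-mass black holes with even modest accretion rates end up at high L/L_Eddington.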

Any objections? (@drphilmarshall, @jchiang87 since y'all are actual AGN astronomers)

drphilmarshall commented 6 years ago

This sounds like it might work: I'm interested to see whether you can reproduce MacLeod's figures for various 2D distributions of AGN properties with this scheme applied to protoDC2. You'll need to include some scatter, to get a good match, I think. Certainly I'd want to see i-band magnitude, and the two DRW parameters, having distributions that matched the published ones. Good luck, Scott!

jchiang87 commented 6 years ago

@danielsf Yes, that proposal sounds perfectly reasonable.

cdfassnacht commented 6 years ago

Would you be introducing some scatter in that relationship (from the Cotter lecture) as well?


danielsf commented 6 years ago

I don't know. I suspect I will just iterate over different fitting relationships with different amounts of scatter introduced at different steps until we come up with a set of plots analogous to those in MacLeod et al. that everyone believes.

danielsf commented 6 years ago

Actually, that's a good point. Given that M_i is our only handle on the quiescent magnitude of the AGN, we probably should introduce some scatter in the values of \epsilon so that there isn't a straight linear relationship between accretion rate and luminosity.

danielsf commented 6 years ago

@drphilmarshall @jchiang87 @cdfassnacht (I don't know who else in our collaboration actually studies AGN)

I am about to summarize what happens if I naively use black hole mass and accretion rate to find the observed magnitude of AGNs in protoDC2. The executive summary is that we get a lot of AGNs with anomalously low black hole masses and anomalously high L/L_Eddington. We probably need to impose some cutoff in black hole mass. This needs to be informed by:

How many AGN do we expect in this 5-degree-by-5-degree field?

Here is the plot from MacLeod et al. 2010 that I am using to drive everything I am about to show you. This figure is based on 9,000 spectroscopically-confirmed quasars from SDSS stripe 82

[Image: screenshot of the MacLeod et al. (2010) plot]

Based on this, my plan was to

1) Convert the reported black hole accretion rates into an L/L_Eddington by taking L=0.1*(accretion rate)*c^2

2) Fit a simple M_i = A*log10(Mbh) + B*log10(L/L_Eddington) + C relationship to the MacLeod et al. figure, and use it to get absolute i-band magnitudes for the protoDC2 galaxies.

Here is a plot that shows how the M_i fitting relationship looks (this plot does not include any data; it is just showing how M_i works out on a grid of Mbh, L/L_Edd)

[Image: macleod_15_analogy]

With M_i and the redshift of the galaxies, I can find the observed i-band magnitude m_i of the protoDC2 AGN. Here is the cumulative distribution of AGNs with respect to m_i (note the logarithmic vertical axis). The blue curve applies no mass cut. The green curve shows only sources with Mbh>=10^7 solar masses

[Image: obs_mag_distribution]
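The step from M_i and redshift to m_i can be sketched as a distance-modulus calculation. This is a simplified illustration, not the code used for the plots: it ignores K-corrections and dust, the function names are hypothetical, and the flat LambdaCDM parameters (H0 = 71 km/s/Mpc, Omega_m = 0.265) are placeholder values that should be replaced with the catalog's cosmology.

```python
import numpy as np

C_KM_S = 299792.458  # speed of light [km/s]


def luminosity_distance_mpc(z, h0=71.0, omega_m=0.265, n_steps=2001):
    """Luminosity distance in Mpc for a flat LambdaCDM cosmology (scalar z > 0)."""
    zz = np.linspace(0.0, z, n_steps)
    inv_e = 1.0 / np.sqrt(omega_m * (1.0 + zz) ** 3 + (1.0 - omega_m))
    # Trapezoidal integration of dz / E(z) gives the comoving distance
    # in units of the Hubble distance c / H0.
    integral = np.sum(0.5 * (inv_e[1:] + inv_e[:-1]) * np.diff(zz))
    return (1.0 + z) * (C_KM_S / h0) * integral


def apparent_magnitude(abs_mag, z, **cosmology):
    """m = M + 5 log10(d_L / 10 pc), with no K-correction applied."""
    d_l_pc = luminosity_distance_mpc(z, **cosmology) * 1.0e6
    return abs_mag + 5.0 * np.log10(d_l_pc / 10.0)


# An AGN with M_i = -23 at z = 1
print(apparent_magnitude(-23.0, 1.0))
```

Under these assumptions the distance modulus at z = 1 is roughly 44 magnitudes, so an m_i <= 24.0 cut at the far edge of the catalog keeps only AGN with M_i brighter than about -20.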

If I take any AGN with m_i<=24.0 as 'observable', I get about 40,000 observable AGN. Plotting the densities of these AGN in various combinations of M_i, m_i, Mbh, and L/L_Eddington space, I get the following plots

[Images: actual_sources_24 0, observable_agn_24 0, observable_agn_obs_mag_24 0]

As you can see, we get a lot of low mass (Mbh<10^7 Msun), high L/L_Eddington AGNs. If I demand that only galaxies with Mbh>=10^7 Msun have AGNs, the number of AGN goes down to about 15,000 AGN, but the distributions in parameter space look more sensible

[Images: actual_sources_24 0_mass_cut, observable_agn_24 0_mass_cut, observable_agn_obs_mag_24 0_mass_cut]

Does 15,000 AGN seem reasonable? If it is too many, we can obviously impose more stringent cuts in parameter space on what constitutes an AGN. If it is too few, we could try imposing a cut on L/L_Eddington to trim out the anomalous sources.

What do people think?

jchiang87 commented 6 years ago

I think the plot of L/Ledd vs Mbh/Msun looks wrong. AGNs with Mbh/Msun ~ 1e7 are usually Seyfert galaxies and they typically have lower values of L/Ledd ~ 0.01, while AGNs with larger BH masses are expected to have higher values of L/Ledd, i.e., the sense of the correlation is the opposite from what has been observed. @evevkovacs How are the disk accretion rates determined for these objects?

jchiang87 commented 6 years ago

Sorry, I take back the assertion about the presence of any observed correlation in L/Ledd vs Mbh/Msun. Selection effects make that hard to nail down, but I do think it is true that we see more 10^8 Msun AGNs with L/Ledd ~ 0.1 than these simulations would imply.

@danielsf If we were to use the existing CatSim model of AGNs, how many would we see in this 5x5 deg field? I'm curious what the total number and distribution of M_i's would look like compared to what we're seeing here.

danielsf commented 6 years ago

Here's an interesting result: the DC1 and DC2 AGN populations have similar total numbers of objects and apparent magnitude (m_i) distributions, but very different absolute magnitude (M_i) and redshift distributions.

[Images: dc1_v_dc2_agn_m_i, dc1_v_dc2_agn_abs_m_i, dc1_v_dc2_agn_z]

drphilmarshall commented 6 years ago

What happens if you now relax the M_BH constraint again? I guess the AGN you are missing at high z (relative to DC1) are also the bright absolute magnitude ones.

danielsf commented 6 years ago

Actually.... not

[Images: dc1_v_dc2_agn_m_i_no_mass_cut, dc1_v_dc2_agn_abs_m_i_no_mass_cut, dc1_v_dc2_agn_z_no_mass_cut]

*ignore the parenthesis in the legend on the z distribution; no mass cut has been applied to DC2 in these plots

The maximum value of redshift_true in the protoDC2 catalog is 0.998

>>> import GCRCatalogs
>>> cat=GCRCatalogs.load_catalog('proto-dc2_v2.1.2')
>>> vals = cat.get_quantities(['redshift_true', 'blackHoleMass', 'blackHoleAccretionRate'])
>>> vals['redshift_true'].max()
0.99801195

danielsf commented 6 years ago

It does seem like we are missing the brightest AGN from our simulation. MacLeod et al. say that the median M_i for their sample is -25.0. Even when I do not impose any cuts on mass or m_i, M_i=-25 is far into the tail of our distribution. Independent of the accretion rate question, there just aren't a lot of high mass black holes in this simulation.

@evevkovacs Where did the black hole masses and accretion rates in protoDC2 come from?

evevkovacs commented 6 years ago

Black hole masses and accretion rates in protoDC2 come from Galacticus. @abensonca can comment on the details of that model and how well it has agreed with the data in the past. But I think you need to be careful with the DC1 and protoDC2 comparisons. Are you imposing a redshift cut on DC1? protoDC2 only goes out to z~1. What is the z distribution of bright AGNs in DC1?

danielsf commented 6 years ago

The z-distributions of DC1 vs DC2 are plotted above. I did not impose any redshift cut in DC1. It is good to know that protoDC2 only goes out to z=1 by design.

Here are the DC1 vs DC2 comparisons with DC1 limited to z<=1

[Images: dc1_v_dc2_agn_m_i_z1, dc1_v_dc2_agn_abs_m_i_z1, dc1_v_dc2_agn_z_z1]

I have again imposed the Mbh>=10**7 Msun limit on DC2. The distributions now have similar shapes, but there are 1/2 as many sources in DC1 as in DC2. I don't know how alarmed we should be by that.

We should probably not expect the distributions of AGN parameters to reproduce the MacLeod et al. results terribly well, then. They have a significant number of AGN at z>=2 (see their Figure 12 reproduced below).

[Image: screenshot of MacLeod et al. (2010) Figure 12]

I will forge ahead with implementing the AGN infrastructure now and we can revisit the question of validating the distribution of AGN parameters when we have the full DC2 catalog in hand.

Does that make sense?

jchiang87 commented 6 years ago

Sounds ok to me, but I'd like some clarity on what will be in the Run1.1 data: The Mbh >= 10**7 Msun cut will be applied, resulting in about twice as many AGNs (per sq deg) with m_i < 24.0 as in DC1.

Will there also be many more faint AGNs with m_i > 24.0?

Also, AGN SEDs are a lot bluer than stars, so the "observable" number will be relatively higher in the bluer bands. Should we also be concerned about what fraction of galaxies will be hosting AGNs that may affect the measured properties of those galaxies?

abensonca commented 6 years ago

Accretion rates are determined from a very simple model of Bondi accretion. I don't consider the physical details of the model to be very accurate; more important is that the model can be calibrated to reproduce observables. Right now, it's only moderately successful at matching those, so it needs improvement.

Masses are found by simply integrating the accretion rates over time (modulo radiative and jet losses), along with the assumption that SMBHs merge instantaneously when their host galaxies merge.

The model does attempt to distinguish different accretion modes (radiatively efficient vs. inefficient), but we currently don't output any details of this (it's a simple model based on accretion rate relative to Eddington, so it is easy to reconstruct though).

I think the important thing here is to identify what validation tests we'd like the AGN to pass. Then we can work on tuning the model to achieve those.

danielsf commented 6 years ago

@jchiang87

My previous plots had been imposing a hard cut at m_i<=24.0. If I only keep the mass cut, it looks like, at fainter magnitudes, protoDC2 has between 3 and 4 times as many AGN as DC1.

This plot shows the normalized distribution of AGN as a function of m_i. The text lists how many sources are in each catalog at different m_i cutoffs. Ignore the "M_i<=4000.0" at the top. I had given myself the option of imposing a cut in absolute magnitude, but no M_i cut seems to alleviate the factor of 4 excess of AGN at faint magnitudes.

[Image: different_m_i_dist]

danielsf commented 6 years ago

I know I said I was going to forsake validation until we had the full DC2 in hand. I lied.

The following plots apply a Mbh>=10**7 Msun cut to the protoDC2 AGN and a z<=1.0 cut to the DC1 AGN. I then divide the samples into bins of observed i-band magnitude and compare the distributions of the AGN parameters (tau, and the structure function in all six bands)

[Images: agn_tau_dist, agn_sf_dist_22 0_21 0, agn_sf_dist_23 0_22 0, agn_sf_dist_25 0_23 0, agn_sf_dist_27 0_25 0]

The structure function distributions look similar, except in the brightest bin. The tau distributions have the right mean, but the widths leave something to be desired. This could be due to the lack of intrinsically bright (low absolute i-band magnitude) AGN in the protoDC2 sample.

In all but the brightest bin, protoDC2 has 3 times as many sources as DC1 (number of sources are tallied at the top of each set of structure function distributions)

evevkovacs commented 6 years ago

@danielsf @jchiang @yymao I think we need to capture this work into a validation test. AGN properties have not appeared on our list yet, probably because we didn't ask the right people. A couple of questions: 1) You are comparing with DC1. Does that mean that DC1 was tuned to have acceptable AGN properties and if so, what was used to tune it? 2) Who are the right people to loop in to set AGN validation criteria?

danielsf commented 6 years ago

I do not know how DC1 AGN were assigned. @SimonKrughoff did that before I was a part of the project (so... 2010-ish). I suspect that the answer to "how was DC1 tuned?" has been lost to time.

I am comparing protoDC2 to DC1 because we seem to have been happy with DC1 in the past, and I have no other catalog that I can down-select to z<=1.

I do not know who we should loop in to validate our AGN parameter distributions. I always just wander down the hall and ask Zeljko when I have an AGN question.

jchiang87 commented 6 years ago

> Who are the right people to loop in to set AGN validation criteria?

For DESC purposes, we should include the SL and Twinkles groups since the AGN properties affect them most directly. We probably should have someone else (e.g., Zeljko) assess the AGN properties purely from an AGN science perspective. It might be worth reaching out to the LSST AGN group for advice (there's a slack channel #agn). For DC1 and the default catsim data more generally, I had assumed that the AGN properties were constrained by papers that considered observations like the MacLeod et al. work.

danielsf commented 6 years ago

Looking back at some of the old simulation validation papers I have stuffed away in a folder, the distribution of AGNs in DC1 may be based on this paper

A. Bongiorno, A. Merloni, M. Brusa, B. Magnelli, M. Salvato, M. Mignoli, G. Zamorani, F. Fiore, D. Rosario, V. Mainieri, H. Hao, A. Comastri, C. Vignali, I. Balestra, S. Bardelli, S. Berta, F. Civano, P. Kampczyk, E. Le Floc’h, E. Lusso, D. Lutz, L. Pozzetti, F. Pozzi, L. Riguccini, F. Shankar, and J. Silverman. Accreting supermassive black holes in the COSMOS field and the connection to their host galaxies. Monthly Notices of the Royal Astronomical Society, 427:3103–3133, December 2012. doi: 10.1111/j.1365-2966.2012.22089.x.

Though I think this just addressed the distribution of AGNs and their colors/brightnesses. It does not address variability parameters

ivezic commented 6 years ago

I've been kibitzing this thread for a while but am not sure any more if AGN and quasar mean the same thing to you. Bongiorno et al. talk about quasars, as do MacLeod et al. Quasars should be more luminous than M_B ~ -23 and mostly unresolved. Which population are you after?

yymao commented 6 years ago

Just to clarify (and sorry for my ignorance if this is something obvious) --- at the extragalactic catalog level, when we say AGNs here we actually just mean black hole masses and accretion rates, right?

Should we then test the extragalactic catalog to see if it has some reasonable BH mass -- stellar mass relation or BH accretion rate -- stellar mass relation? Or are there some other more relevant tests?

jchiang87 commented 6 years ago

I think we are ultimately after all accreting supermassive black holes that would produce emission observable in LSST bands. There seems to be an underlying assumption that this is all disk emission (possibly reprocessed by BLR material, etc), and does not include jet emission. At least that's what I conclude from the MacLeod et al. model. A quick scan of Bongiorno et al. (e.g., the SED, the types of objects they are considering) also indicates primarily disk-based emission. Using the terminology strictly, I've always understood quasars to be radio loud, and QSOs to be their radio quiet cousins. Both generally are more luminous than lower BH mass galaxies like Seyfert galaxies, which are considered by MacLeod et al. and Bongiorno et al., in addition to more luminous objects. In any case, AGN has always been the catch-all for accreting supermassive BH systems.

evevkovacs commented 6 years ago

@yymao Yes, that's correct: there will need to be some tests of the black hole masses and accretion rates. If I understand correctly, one can invert the MacLeod et al results and convert them into constraints on the distributions of BH mass and accretion rates.

SimonKrughoff commented 6 years ago

@danielsf To muddy the waters even more, I'm pretty sure the AGN were distributed in the catalog by Rob Gibson (who no longer works for the project). I seem to remember a paper that provided the redshift distribution (maybe by Luis Ho). I think @ivezic will remember.

All I did was implement the variability model based on the parameters Rob provided.

aphearin commented 6 years ago

The v4 release of protoDC2 has very significant changes to the modeling of black hole mass and accretion rate. @danielsf pointed me to this Issue so that everyone who will be doing scientific analysis impacted by these changes will be aware of the change.

In v3 and before, we were using the native Galacticus model for M_bh and dM_bh/dt. We are now using empirical methods for both quantities.

  1. For black hole mass, we use a log-normal distribution centered at the power law taken from Kormendy & Ho (2013). Differences between v3 and v4 are shown in the plot below (v4 will look exactly like Kormendy & Ho (2013), by construction):

[Image: black_hole_mass_vs_bulge_mass_v3]

  2. For black hole accretion rate, we are using an empirical model built upon Aird, Coil, et al. 2017. The native quantity in these measurements is the Eddington ratio, which will now be included as a column as requested by @danielsf . Summary plots appear below:

The first plot shows the mass-dependence of the accretion rate at z=0 in v3:

[Image: black_hole_accretion_rates_v3]

All other plots below show scaling relations for the v4 empirical model:

[Images: black_hole_accretion_rates_v4_mstar_dependence, black_hole_accretion_rates_v4_redshift_evolution, black_hole_accretion_rates_v4_sfr_dependence, black_hole_eddington_ratios_redshift_evolution]
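The black hole mass assignment in item 1 can be sketched as a power law in bulge mass with log-normal scatter. This is an illustration, not the protoDC2 code: the function name is hypothetical, and the default normalization, slope, and scatter are my recollection of the Kormendy & Ho (2013) relation, which should be checked against the paper (and against whatever values protoDC2 v4 actually uses) before being relied on.

```python
import numpy as np


def sample_black_hole_mass(bulge_mass, rng, norm=0.49e9, pivot=1.0e11,
                           slope=1.17, scatter_dex=0.28):
    """Draw black hole masses (solar masses) from a log-normal about a power law.

    bulge_mass  : bulge stellar mass in solar masses
    norm        : black hole mass at the pivot bulge mass (assumed default)
    pivot       : pivot bulge mass in solar masses (assumed default)
    slope       : power-law index (assumed default)
    scatter_dex : log-normal scatter in dex (assumed default)
    """
    bulge_mass = np.asarray(bulge_mass, dtype=float)
    log_mean = np.log10(norm) + slope * np.log10(bulge_mass / pivot)
    log_mbh = log_mean + rng.normal(0.0, scatter_dex, size=bulge_mass.shape)
    return 10.0 ** log_mbh


rng = np.random.default_rng(1)
bulge = np.array([1.0e9, 1.0e10, 1.0e11])
print(sample_black_hole_mass(bulge, rng))
```

Because the scatter is applied in log space, the median black hole mass follows the power law while the mean is biased slightly high.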

CC @drphilmarshall @cdfassnacht @jchiang87

danielsf commented 6 years ago

As discussed on Slack starting here

https://lsstc.slack.com/archives/C77DDKZHR/p1527991352000121

my somewhat naive application of the fitting formulae from MacLeod et al has led to an AGN that varies from a mean magnitude of about 28 all the way up to magnitude 4.5. Probably we just need to impose a hard cut on the maximum allowable value of the structure function for the random walk driving AGN variability. Does 4 magnitudes seem reasonable? Summarizing from Slack, the mean, median, and standard deviations of the structure functions produced by the current code are

# band  mean          median        stdev
u       1.317624e+00  1.183000e+00  6.879020e-01
g       1.159129e+00  1.041000e+00  6.057803e-01
r       1.024631e+00  9.198000e-01  5.369232e-01
i       9.325369e-01  8.375000e-01  4.874273e-01
z       8.717454e-01  7.825000e-01  4.563290e-01
y       8.265397e-01  7.417000e-01  4.336567e-01

drphilmarshall commented 6 years ago

How many sigma away from the relevant band mean structure function was this object? Given that we are only modeling the distribution approximately, seems something like a 3-sigma clip could be justified?

danielsf commented 6 years ago

The offending object had structure functions on the order of 10 to 20, so: 10 to 20 sigma

drphilmarshall commented 6 years ago

That is quite the random draw. We might think of permitting 5-sigma outliers, just to make things interesting!

danielsf commented 6 years ago

It wasn't a random draw that caused the structure functions to be so large. The phenomenological fit that we are applying is only valid in a certain regime of parameter space and we applied it outside that regime. More specifically, the structure function is calculated by this code

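    # Excerpt from a function body; assumes `import numbers` and
    # `import numpy as np` at module level.
    # Coefficients of the phenomenological fit for log10(structure function)
    AA = -0.56
    BB = -0.479
    CC = 0.111
    DD = 0.11

    # Optionally perturb each coefficient per object to add scatter about the fit
    if rng is not None:
        if isinstance(redshift, numbers.Number):
            n_obj = 1
        else:
            n_obj = len(redshift)
        AA += rng.normal(0.0, 0.01, size=n_obj)
        BB += rng.normal(0.0, 0.005, size=n_obj)
        CC += rng.normal(0.0, 0.005, size=n_obj)
        DD += rng.normal(0.0, 0.02, size=n_obj)

    # Rest-frame effective wavelength of the band
    eff_wavelen_rest = eff_wavelen/(1.0+redshift)

    # Fit pivots: wavelength of 4000, M_i = -23, and log10(mbh) = 9
    log_sf = AA + BB*np.log10(eff_wavelen_rest/4000.0)
    log_sf += CC*(M_i+23.0) + DD*(np.log10(mbh)-9.0)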

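The offending AGN had an M_i (absolute i-band magnitude) of about -3. That is so faint relative to the M_i = -23 pivot of the fit that the CC*(M_i+23.0) term alone contributes about +2.2 to log_sf, which made log_sf very positive.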
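To see the size of the effect, the fit can be evaluated without scatter at a typical quasar magnitude and at the offending one. This is a sketch using the coefficients from the excerpt above; the function name, the effective wavelength of 7500 (assumed to be in Angstroms, roughly the i band), the redshift, the black hole mass, and the use of a 4 magnitude cap (the hard cut floated earlier in this thread) are assumptions for illustration.

```python
import numpy as np


def log10_sf_inf(eff_wavelen, redshift, abs_mag_i, mbh,
                 aa=-0.56, bb=-0.479, cc=0.111, dd=0.11):
    """log10 of the structure function from the fit, with no scatter applied.

    eff_wavelen : observed-frame effective wavelength (assumed Angstroms)
    abs_mag_i   : absolute i-band magnitude M_i
    mbh         : black hole mass (assumed solar masses)
    """
    eff_wavelen_rest = eff_wavelen / (1.0 + redshift)
    return (aa + bb * np.log10(eff_wavelen_rest / 4000.0)
            + cc * (abs_mag_i + 23.0) + dd * (np.log10(mbh) - 9.0))


SF_MAX = 4.0  # hard cap on the structure function, in magnitudes

for abs_mag in (-23.0, -3.0):
    sf = 10.0 ** log10_sf_inf(7500.0, 0.5, abs_mag, 1.0e7)
    print(abs_mag, sf, min(sf, SF_MAX))
```

The 20 magnitude offset from the pivot multiplies the structure function by about 10**2.2, which is why a cap (or a validity cut on M_i) is needed when the fit is applied to very faint AGN.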

evevkovacs commented 6 years ago

@danielsf So a readiness test on the instance catalog would have caught this issue. Sorry, I haven't kept track, but since we have an instance catalog reader and the DESCQA readiness test is quite generic, I think it could be adapted fairly easily for these kinds of checks.
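A range check of the kind described here might look like the sketch below. This is not the DESCQA readiness test or its API; the function name, the column name, and the allowed range are hypothetical.

```python
import numpy as np


def check_range(name, values, lower, upper):
    """Return a list of failure messages for a column (empty if it passes)."""
    values = np.asarray(values, dtype=float)
    failures = []
    finite = np.isfinite(values)
    n_not_finite = np.count_nonzero(~finite)
    if n_not_finite > 0:
        failures.append(f"{name}: {n_not_finite} non-finite value(s)")
    in_range = (values[finite] >= lower) & (values[finite] <= upper)
    n_out = np.count_nonzero(~in_range)
    if n_out > 0:
        failures.append(f"{name}: {n_out} value(s) outside [{lower}, {upper}]")
    return failures


# Hypothetical i-band structure functions, one of them unphysically large
print(check_range("agn_sf_i", [0.5, 1.2, 15.0], 0.0, 4.0))
```

Running such a check on every variability parameter before image generation would flag objects like the one discussed above.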

danielsf commented 6 years ago

I have run the script assigning AGN magnitudes and variability parameters to the 0.1 version of cosmoDC2. Naively doing what we did for run 1.2 (only accepting galaxies with black hole masses > 10**7 solar masses; throwing out AGN with apparent i-band magnitudes greater than 30; limiting structure functions to 4 magnitudes), I get the following distributions:

1-D distributions of tau (timescale of variation in days) and i-band structure function. Different colors correspond to different cuts in observed r-band magnitude

[Image: r_mag_distribution]

2-D densities of observed r-band magnitude versus redshift, black hole mass, and Eddington ratio

[Image: obs_r_densities]

2-D densities of absolute i-band magnitude versus redshift, black hole mass, and Eddington ratio

[Image: abs_i_densities]

The distribution of tau and SF_i found in the MacLeod et al reference that our model is based on looks like this:

[Image: screenshot of the tau and SF_i distributions from MacLeod et al.]

This is taken from SDSS. The white histogram is all sources out to observed r-band of 22.5; the filled histogram is sources with absolute i-band magnitude between -26 and -27. MacLeod et al considered 290 square degrees and found 9,275 spectroscopically confirmed AGN. Obviously there are orders of magnitude more than that in cosmoDC2. I do not know if that is because of selection effects in MacLeod et al, or if we need to thin out our simulation.

How do AGN experts feel about these distributions?

aphearin commented 6 years ago

For reference, the underlying black hole model that generates this mock catalog has black hole masses determined by the M_bulge-M_bh power law taken from Kormendy & Ho 2013, and AGN accretion rate distributions taken from Aird, Coil & Georgakakis (2017).

katrinheitmann commented 5 years ago

@danielsf @jchiang87 Obviously we did make a final decision in DC2 :-) Two questions: (i) Is the write-up in the current DC2 paper draft (https://www.overleaf.com/8574729384jttdkyjcbgyq) sufficient to capture what was done? (ii) Is something different needed for the future? If so, we should capture that somewhere. I think then we can close this issue? Thanks!