danielapai / bioverse

A simulation framework to assess the statistical power of future biosignature surveys
MIT License

giant planets for updated occurrence rates module #19

Open matiscke opened 1 year ago

matiscke commented 1 year ago

The occurrence rates in Bergsten et al. 2022 don't include giant planets, but Bioverse should produce giant planets, too. Before making the Bergsten-et-al planet creator function the default, we should have it produce giant planets based on updated occurrence rates in the literature.

matiscke commented 11 months ago

This can be implemented later if needed. AFAIK, this functionality is not relevant for any current projects.

kevinkhu commented 11 months ago

I want to test if outer giant planets suppress inner small planet formation, so I'll need giant planet occurrence rates. I am open to suggestions for giant occurrence rates to add, and how best to implement this alongside the Bergsten et al. 2022 occurrence rates.

gbergsten commented 10 months ago

[Attached image: bergsten2022_Fig9]

The create_planets_bergsten() function was only fit up to 3.5 Earth radii. It technically used a power law with zero slope (in log radius) to determine the period-marginalized radius distribution, shown in the attached Figure 9 from Bergsten+2022. There is a period-dependent factor determining the fraction of that marginal occurrence to be used at a given period, but for outer planets (beyond 100 days) it is roughly constant.

So left as is, create_planets_bergsten() would assume a flat radius distribution for anything above 3.5 Re and beyond 100 days, which may or may not be realistic. In Kunimoto & Matthews (2020), their Figure 9 shows a somewhat-flat distribution for planets above 4 Re between 50-400 days, such that the current create_planets_bergsten() may be appropriate to a factor of a few.
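To make the "zero slope in log radius" assumption concrete, here is a minimal sketch of what a flat distribution in log radius implies if it is simply extended above 3.5 Re. This is not Bioverse code: the function name and the 20 Re upper cutoff are illustrative assumptions, and only the zero-slope form comes from the description above.

```python
import numpy as np

def sample_flat_log_radius(n, r_min=3.5, r_max=20.0, seed=None):
    """Draw n radii (in Earth radii) with constant occurrence per unit log radius.

    Hypothetical helper, not part of Bioverse. r_max is an arbitrary cutoff
    chosen for illustration only.
    """
    rng = np.random.default_rng(seed)
    # A power law with zero slope in log R is a uniform distribution in log R.
    log_r = rng.uniform(np.log(r_min), np.log(r_max), size=n)
    return np.exp(log_r)

radii = sample_flat_log_radius(100_000, seed=0)

# Equal-width bins in log radius should hold roughly equal numbers of planets.
counts, _ = np.histogram(np.log(radii), bins=4,
                         range=(np.log(3.5), np.log(20.0)))
```

Under this assumption, the 3.5-7 Re range would receive as many planets as the 7-14 Re range, since both span a factor of two in radius. Whether that is realistic is exactly the open question.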

I'm not aware of parametric models for (outer) giant planet occurrence that are similarly comprehensive or up-to-date, so nothing immediately comes to mind for other functions to supplement Bergsten+2022...

gbergsten commented 9 months ago

While I still don't know of any recent (up-to-date, comprehensive) parameterizations for the giant planet occurrence distributions, there have been lots of grid-based approaches that I trust. "Grid-based" here means having a grid of period and radius bins, with an occurrence rate in each grid cell. It would be pretty straightforward to implement these in Bioverse (at least in my head) -- there are three steps that work very similarly to how Bioverse usually generates planets:

  1. Provide Bioverse with a 2D array that has the occurrence rate in each period/radius bin.
    • This type of 2D occurrence grid is what Bioverse generates from any parametric occurrence function, but usually at higher resolution (e.g., 1000 bins on either axis, instead of the ~10 that grid-based studies often use).
  2. When assigning planets to a system, determine the period/radius bin that will be used in proportion to the bin's occurrence (dN).
    • This is also what happens with parametric functions, just at a much higher resolution (so the bins are essentially points).
  3. To determine a specific planet's period & radius, "smooth" the values using a uniform distribution (a) centered on the center of the bin and (b) spanning the width of the bin.
    • Bioverse already does exactly this to prevent aliasing on the high-resolution grid values. We can apply exactly the same method while sampling over a larger bin, and most grid-based occurrence studies assume uniform distributions within a given bin anyways.
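The three steps above could look roughly like the sketch below. It is a standalone illustration and does not use Bioverse's actual internals: the function name `sample_from_grid`, its arguments, and the example grid values are all made up for this comment (the occurrence numbers are placeholders, not taken from any paper). It also assumes the smoothing is uniform in linear period/radius; a study with log-spaced bins might call for uniform sampling in log space instead.

```python
import numpy as np

def sample_from_grid(occurrence, period_edges, radius_edges, n, seed=None):
    """Draw n (period, radius) pairs from a binned occurrence grid.

    Hypothetical helper, not part of Bioverse.
    occurrence   : 2D array, shape (n_period_bins, n_radius_bins)   [step 1]
    period_edges : 1D array of bin edges, length n_period_bins + 1
    radius_edges : 1D array of bin edges, length n_radius_bins + 1
    """
    rng = np.random.default_rng(seed)
    occurrence = np.asarray(occurrence, dtype=float)
    period_edges = np.asarray(period_edges, dtype=float)
    radius_edges = np.asarray(radius_edges, dtype=float)

    # Step 2: choose a bin with probability proportional to its occurrence.
    p = occurrence.ravel() / occurrence.sum()
    flat_idx = rng.choice(occurrence.size, size=n, p=p)
    i, j = np.unravel_index(flat_idx, occurrence.shape)

    # Step 3: smooth with a uniform draw spanning the chosen bin.
    period = rng.uniform(period_edges[i], period_edges[i + 1])
    radius = rng.uniform(radius_edges[j], radius_edges[j + 1])
    return period, radius

# Placeholder grid: 3 period bins (days) x 2 radius bins (Earth radii).
period_edges = np.array([50.0, 100.0, 200.0, 400.0])
radius_edges = np.array([4.0, 8.0, 16.0])
occurrence = np.array([[0.010, 0.005],
                       [0.020, 0.010],
                       [0.030, 0.000]])

period, radius = sample_from_grid(occurrence, period_edges, radius_edges,
                                  n=10_000, seed=1)
```

The sketch only distributes a fixed number of planets across the grid. Deciding how many planets each system gets (for example from the summed occurrence) would be a separate step.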

These steps are already there in the code and just need a little re-flavoring. In fact, adopting results from a grid-based study actually saves you a step, since you don't need to calculate the occurrence grid from a parametric function to begin with. In terms of which grids to use, I would recommend (in approximate decreasing order of up-to-date-ness): Dattilo et al. (2023), Kunimoto & Matthews (2020), or Hsu et al. (2019).

matiscke commented 9 months ago

I like this idea. To me the open questions are:

  1. which occurrence rate source to use
  2. how to extrapolate beyond the constrained parameter range
  3. are there potential efficiency issues? For the hypothesis tests, the planet generation step is repeated many times.
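On the efficiency question, one possible approach is to build the cumulative distribution over grid cells once and reuse it for every repetition, so that each planet generation step reduces to a random draw plus a binary search. The sketch below is an assumption about how this could be structured, not a description or measurement of Bioverse's current code; the `GridSampler` class and the toy grid are hypothetical.

```python
import numpy as np

class GridSampler:
    """Hypothetical helper: caches the cumulative distribution of a binned grid."""

    def __init__(self, occurrence):
        occurrence = np.asarray(occurrence, dtype=float)
        self.shape = occurrence.shape
        cdf = np.cumsum(occurrence.ravel())
        self.cdf = cdf / cdf[-1]  # normalize so the last entry is exactly 1

    def draw_bins(self, n, rng):
        """Return (period_bin, radius_bin) index arrays for n planets."""
        u = rng.random(n)  # uniform in [0, 1)
        # side="right" skips cells with zero occurrence (flat CDF segments).
        flat_idx = np.searchsorted(self.cdf, u, side="right")
        return np.unravel_index(flat_idx, self.shape)

rng = np.random.default_rng(2)
sampler = GridSampler([[1.0, 0.0],
                       [2.0, 1.0]])  # toy weights, not real occurrence rates

# The CDF is built once; the repeated step is only the draw itself.
for _ in range(100):
    i, j = sampler.draw_bins(1000, rng)
```

Whether this matters in practice would need profiling against the existing planet generation step, since a coarse grid has few cells to begin with.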