HeloiseS / hoki

Bridging the gap between observation and theory
https://heloises.github.io/hoki/intro.html
BSD 3-Clause "New" or "Revised" License
47 stars 8 forks source link

[BPASS] Generate isochrone from BPASS #81

Closed pzyao closed 2 years ago

pzyao commented 2 years ago

Hi,

I am currently working on a project that needs to simulate a stellar population by assuming a star formation history/rate. I have been doing this with the fsps code where we can generate an isochrone containing the luminosity, effective temperature, and weights of stars with different masses at each age, then I take those and sample based on the desired sfr. However, unlike for MIST isochrones, fsps cannot do this for BPASS, since individual stars are unavailable here. I was wondering if something along this line could be done with Hoki.

The end goal is to obtain a composite stellar population with H-R diagrams and CMDs. All we will need is a list of stars at each age of the population, with weights, luminosity, and effective temperature included (or magnitudes at different bands for CMD). It would be quite helpful if assuming an IMF is possible, e.g. Salpeter.

I have noticed the hr_diagram.stack function in Hoki, and was wondering if it is possible to assign weights to each age bin when stacking, and if a similar function is available in hoki.cmd to achieve a similar purpose. For this purpose, I was also wondering if there can be a color bar that indicates the approximate value of each square in the HR diagram/CMD Hoki makes, or a method to extract each value of the squares.

Thanks in advance!

HeloiseS commented 2 years ago

hello!

So at the moment there is no way to do these mass cuts on HR Diagrams or on CMDs in hoki so I made you a custom class (let's call this a hoki-hack hahaha) so you can make HRDiagrams with custom mass cuts! Note: it's a bit slow, so I strongly recommend you run it to make your grids with your chosen mass cuts and then you save those grids so you don't have to run them again!

Let me know how your get one :)

Summary of the code you'll be able to run

Creating the HRD

myhrd =  ManualHRDMaker()
myhrd.make(M_lo = lowmasscutoff, M_hi = highmasscutoff)

Plotting:

myhrd.plot(log_age=6.8) #or whatever log age between 6 and 11

:warning: NOTE :warning: When you make mass cuts you'll end up with parts of the HRD with no weights at all, in this case you'll get the following error: ValueError: min() arg is an empty sequence. This is something I expect I just haven't created a nice error message for it since it's not core hoki.

Retrieving the grid of weights:

myhrd.grid

The shape is (51,100,100) and the dimensions are: age, L, T

Now say you want to sample weights for luminosities between log(L)=5 and 6 at an age of 5 Myrs, you need to know which of the 100 indices fit in your bin. That's easy! You can access the [time, luminosity, temperature] coordinate system with myhrd.t, myhrd.L_coord and myhrd.T_coord, so what you would do is:

age_mask = (hrd.t==6.7) #that's the log10 of 5Myrs
L_mask = (hrd.L_coord>5) &(hrd.L_coord<6)
hrd.grid[age_mask, L_mask, :] # assuming you want all the temperatures

Saving your data

As you can see you need the grid and the coordinate system to make cuts and crop your data to your area of interest. You could save your myhrd object as a pickle file but if you are going to have a lot of different [M_lo, M_hi] cuts I wouldn't recommend it - you'd be saving to disk a tonne of extraneous data.

Instad I would save the 3D grids myhrd.grid of different mass cuts into .npy files because it'll make it super easy to access with numpy. As for the coordinate system, you can save them in a separate file to be called anytime you want to make a new mask.

Things you need to run the new code

The import statements

You're going to need hoki, which I assume you already have. Then you need a few utility imports to support the new class I've made for you. So long as you have hoki, everything else is standard python.

from hoki import load
import matplotlib.pyplot as plt
from hoki.constants import *
import numpy as np
import matplotlib.cm as cm
from hoki.utils.exceptions import HokiFatalError, HokiUserWarning, HokiFormatError, HokiKeyError
from hoki.utils.hoki_object import HokiObject
import os

The data

The other thing you're going to need is the full set of stellar models - the OG ones that come straight out of the FORTRAN. In case you don't have those yet here is how you get them and set it up to work with your new code:

The new code: ManualHRDMaker

Now here is the new class. You should be able to paste that at the beginning of a jupyter notebook (after your import statements) or any script that you are making and then you can run the code that I gave you in the first section. Like I said - this is slow, it's a hacked version of the CMD class (not the HRD class, for reasons I won't go into unless you're keen to know).

class ManualHRDMaker(HokiObject):
    """
    *acoustic* HRDiagram maker for BPASS
    """
    # NOTE: dummy is the name of the big array returned by the BPASS models
    # in the hoki code I use it as a "proper noun" - not a random variable name

    # dummy_col_number=len(dummy_dict) I think this line is no longer useful

    def __init__(self, file,
                 bpass_version=DEFAULT_BPASS_VERSION,
                 models_path=MODELS_PATH):
        """
         Initialisation of the Colour Magnitude Diagram object

         Parameters
         ----------
         file : str
             Location of the file containing the model inputs
         """

        self.bpass_input = load.model_input(file)
        self._file_does_not_exist = []
        self.dummy_dict=dummy_dicts[bpass_version]

        # Setting up the grid'dc resolution
        self.T_coord = np.arange(0.1, 10.1, 0.1)
        self.L_coord = np.arange(-2.9, 7.1, 0.1)

        self.grid = None
        self.path = models_path
        self._my_data = None
        self._time_bins = None
        self._log_ages = None
        self._ages = None

    def make(self, M_lo=0, M_hi=1000):
        """
        Make the HRD - a.k.a fill the grid

        Notes
        ------
            - This may take a few seconds to a minute to run.

        Returns
        -------
        None
        """
        self.M_lo, self.M_hi = M_lo, M_hi

        # add another dimension for mass?
        self.grid = np.zeros((len(BPASS_TIME_BINS), len(self.L_coord), len(self.T_coord)))

        assert os.path.isdir(MODELS_PATH), f"DEBUGGING ASSISTANT: Directory MODELS_PATH = {MODELS_PATH} not found.\n " \
                                           f"You can change MODELS_PATH by hoki.constants.set_models_path([MYPATH])." \
                                           f"If you've just done that and it looks like it didn't work, try to restart " \
                                           f"your notebook or terminal ;). "

        self.path = MODELS_PATH

        # FIND THE KEYS TO THE COLUMNS OF INTEREST IN DUMMY

        col_keys = ['timestep', 'age', 'log(T1)', 'M1', 'log(R1)', 'log(L1)']

        try:
            cols = tuple([self.dummy_dict[key] for key in col_keys])
        except KeyError as e:
            err_m='Python said: '+str(e)+'\nDEBUGGING ASSISTANT: \nOne or both of the chosen filters do not correspond ' \
                                         'to a valid filter key. Here is a list of valid filters - ' \
                                         'input them as string:\n'+str(list(self.dummy_dict.keys())[49:-23])
            raise HokiKeyError(err_m)

        # LOOPING OVER EACH LINE IN THE INPUT FILE
        for filename,  model_imf, mixed_imf, mixed_age, model_type in zip(self.bpass_input.filenames,
                                                                          self.bpass_input.model_imf,
                                                                          self.bpass_input.mixed_imf,
                                                                          self.bpass_input.mixed_age,
                                                                          self.bpass_input.types):

            # PRE PROCESSING THE MODEL INPUTS

            # The model_imf and mixed_imf have different precisions because of the FORTRAN code
            # model_imf is double precision but mixed_imf is double precision. We round to take care of that.
            model_imf = round(model_imf, 6)
            mixed_imf = round(mixed_imf, 6)

            # some mixed ages come out negative - it should be taken into account by setting them to 0
            if mixed_age == np.inf:
                mixed_age = 0

            # LOADING THE DATA FILE
            # Making sure it exists - If not keep the name in a list
            try:
                self._my_data = load.dummy_to_dataframe(self.path + filename)
            except (FileNotFoundError, OSError):
                self._file_does_not_exist.append(filename)
                continue

            L = self._my_data['log(L1)']
            T = self._my_data['log(T1)']
            # LIST WHICH BINS IN THE GRID EACH COLOUR AND MAGNITUDE BELONGS TO
            self._T_bins = [np.abs((self.T_coord - _T)).argmin()
                              if self.T_coord[np.abs((self.T_coord - _T)).argmin()] <= _T
                              else np.abs((self.T_coord - _T)).argmin() - 1
                              for _T in T]

            self._L_bins = [np.abs((self.L_coord - _L)).argmin()
                              if self.L_coord[np.abs((self.L_coord - _L)).argmin()] <= _L
                              else np.abs((self.L_coord - _L)).argmin() - 1
                              for _L in L]

            # MIXED AGE = 0.0 OR NAN CASE (i.e. no rejuvination)
            if np.isnan(mixed_age) or float(mixed_age) == 0.0:
                self._ages = self._my_data['age']
                # first line is always zero and will mess up the log so we take care of that
                self._log_ages = np.concatenate((np.array([0]), np.log10(self._my_data['age'][1:])))
                self._log_ages = [age if age >= 6.0 else 6.0 for age in self._log_ages]
                self._fill_grid_with(model_imf, model_type)

            # MIXED AGE NON ZERO CASE (i.e. rejuvination has occured)
            else:
                # MODEL IMF = MIXED IMF (These models only occur after rejuvination)
                if np.isclose(model_imf,mixed_imf, atol=1e-05):
                    self._ages = self._my_data['age'] + mixed_age
                    self._log_ages = np.log10(self._my_data['age'] + mixed_age)
                    self._fill_grid_with(mixed_imf, model_type)

                #  MODEL INF != MIXED IMF (These can occur with or without rejuvination)
                else:
                    # NON REJUVINATED MODELS
                    self._ages = self._my_data['age']
                    self._log_ages = np.concatenate((np.array([0]), np.log10(self._my_data['age'][1:])))
                    self._log_ages = [age if age >= 6.0 else 6.0 for age in self._log_ages]
                    self._fill_grid_with(model_imf-mixed_imf, model_type)

                    # REJUVINATED MODELS
                    self._ages = self._my_data['age'] + mixed_age
                    self._log_ages = np.log10(self._my_data['age'] + mixed_age)
                    self._fill_grid_with(mixed_imf, model_type)

    def _fill_grid_with(self, imf, model_type):

        for i, M, R, L in zip(range(len(self._ages)), 
                              self._my_data['M1'], 
                              self._my_data['log(R1)'], 
                              self._my_data['log(L1)']):
            if M<self.M_lo or M>self.M_hi:
                continue
            # Weird stuff happens to the primary models when they because WD and they can get counted twice
            # Here we helpout BPASS a tad by checking when a primary (type 1) has got the gravity, mass and luminosity
            # of a WD and we jsut remove it from this round so they don't linger on the MS.
            if round(model_type, 1) == 1:
                log_g = np.log10( (6.67259*10**(-8)) * (1.989*10**33) * M /
                                  (((10**R) *6.9598*(10**10))**2) )

                try:
                    if log_g > 6.9 and L < -1 and M < 1.5:
                        # In this case our primary has become a white dwarf and
                        # we need to take it out of the simulations
                        return
                except ValueError as e:
                    print(e)
                    return

            # NEED SPECIAL CASES FOR i = 0
            if i == 0:
                # First line isn't really a bin
                # self.grid[0, self._mag_bins[0], self._col_bins[0]] += imf * self._ages[0]
                continue

            try:
                N_i_m1 = np.abs(BPASS_TIME_BINS - self._log_ages[i-1]).argmin()
                N_i = np.abs(BPASS_TIME_BINS - self._log_ages[i]).argmin()
            except IndexError:
                print("This should not happen")

            # If the time step within one time bin
            if N_i_m1 == N_i:
                dt_i = self._ages[i] - self._ages[i-1]
                # Some time bins can be negative in BPASS and should be ignored
                if dt_i <0: continue

                self.grid[N_i, self._L_bins[i], self._T_bins[i]] += imf * dt_i

            # If the time step spans multiple time bins
            else:
                N_list = np.arange(N_i_m1, N_i+1)

                # First bin
                weight = 10**(BPASS_TIME_BINS[N_list[0]]+0.05) - self._ages[i-1]
                # Negative time bins should be ignored
                if weight < 0: continue
                self.grid[N_list[0], self._L_bins[i], self._T_bins[i]] += imf * weight

                # Last bin
                weight = self._ages[i] - 10**(BPASS_TIME_BINS[N_list[-1]]-0.05)
                # Negative time bins should be ignored.
                if weight <0: continue
                self.grid[N_list[-1], self._L_bins[i], self._T_bins[i]] += imf * weight

                # Bins in between, if any
                if len(N_list)>2:
                    for N in N_list[1:-1]:
                        weight = BPASS_TIME_INTERVALS[N]
                        self.grid[N, self._L_bins[i], self._T_bins[i]] += imf * weight

    def plot(self, log_age=6.8, loc=111, cmap='Greys', **kwargs):
        """
        Plots the CMD grid at a particular age

        Parameters
        ----------
        log_age : float
            Must be a valid BPASS time bin
        loc : 3 integers, optional
            Location of the subplot. Default is 111.
        cmap : str, optional
            Colour map for the contours. Default is 'Greys'
         **kwargs : matplotlib keyword arguments, optional

        Returns
        -------
        matplotlib.axes._subplots.AxesSubplot :
            The plot created is returned, so you can add stuff to it, like text or extra data.

        """
        cm_diagram = plt.subplot(loc)

        #  THIS IS VERY SIMILAR TO THE PLOTTING FUNCTION IN HOKI.HRDIAGRAMS.

        #  Now we define our default levels
        index = np.where(np.round(BPASS_TIME_BINS,1) == log_age)[0]

        if log_age < 6.0 or log_age > 11.0:
            raise HokiFatalError("Valid values of log age should be between 6.0 and 11.1 (inclusive)")

        single_cmd_grid = self.grid[int(index)]

        # This step has been approved by JJ :o)
        infinities = np.where(single_cmd_grid == np.inf)
        for i in infinities: single_cmd_grid[i] = 0.0

        np.nan_to_num(single_cmd_grid, copy=False)

        single_cmd_grid[single_cmd_grid == 0] = min(single_cmd_grid[single_cmd_grid != 0]) - \
                                                0.1*min(single_cmd_grid[single_cmd_grid != 0])

        top_level = single_cmd_grid.max()
        min_level = single_cmd_grid.min()

        # we want our levels to be fractions of 10 of our maximum value
        # and yes it didn't need to be written this way, but isn't it gorgeous?
        possible_levels = [top_level*0.0000000000001,
                           top_level*0.000000000001,
                           top_level*0.00000000001,
                           top_level*0.0000000001,
                           top_level*0.000000001,
                           top_level*0.00000001,
                           top_level*0.0000001,
                           top_level*0.000001,
                           top_level*0.00001,
                           top_level*0.0001,
                           top_level*0.001,
                           top_level*0.01,
                           top_level*0.1,
                           top_level]

        # to make sure the colourmap is sensible we want to ensure the minimum level == minimum value
        levels = [min_level] + [level for level in possible_levels if level > min_level]

        colMap = cm.get_cmap(cmap)

        colMap.set_under(color='white')

        cm_diagram.contourf(self.T_coord, self.L_coord, np.log10(single_cmd_grid), np.log10(levels).tolist(),
                            cmap=cmap, **kwargs)

        cm_diagram.invert_xaxis()

        cm_diagram.set_ylabel('log L')
        cm_diagram.set_xlabel('log T')

        return cm_diagram

    def at_log_age(self, log_age):
        """
        Returns the HR diagrams at a specific age.

        Parameters
        ----------
        log_age : int or float
            The log(age) of choice.

        Returns
        -------
        The CMD grid : np.ndarray 240x100

        """

        if log_age < 6.0 or log_age > 11.0:
            raise HokiFatalError("Valid values of log age should be between 6.0 and 11.1 (inclusive)")

        bin_i = int(np.round(10*(log_age-6)))
        return self.grid[bin_i]

    def __getitem__(self, item):
        return self.grid[item]
pzyao commented 2 years ago

This looks awesome! Thanks so much for the "hoki-hack". I will start working with it right away, and will let you know if any more question pops up along the way.

pzyao commented 2 years ago

Hi,

I have been using this custom code with Hoki CMD for a little bit and it is functioning pretty well. But I have some additional questions related to this matter.

  1. First, what does the weights mean in the age x L x T grid? I assumed it is the linear relative probability among all elements of the grid (i.e. Prob = weight/sum(weight)).
  2. Second, I noticed that there is weighting among age bins, so is this assuming a constant SFR? If so, are the weights relative to the whole log(age)=6-11 populations combined, or just one single age?
  3. I noticed that I can increase the resolution of the grid and I just want to make sure this won't interfere with anything else:
        # Setting up the grid'dc resolution
        self.T_coord = np.arange(0.1, 10.1, 0.01)
        self.L_coord = np.arange(-2.9, 7.1, 0.01)
  4. If I want to make plots in color-magnitude space, would you just assume black body and calculate the magnitude in each wavelength based on Teff and L? Or have you written a built-in function to make CMD plots instead of HR-diagrams?

Thanks again!!!

HeloiseS commented 2 years ago
  1. So the weights is the IMF weighting it's taken from the modelimf weight of each model which is the number of each system you'd expect in a single 1 million solar mass population and yes it's normalised for each 2D grid (not across all ages) - it's the thign that gives the grey scale color map.

  2. The SFH assumed is a single star burst the weighting by age is because the bins are logarithmic and so if you don't weigh them you have more stars in the bigger bins that's it.

  3. I would not recommend touching the boundaries because they are the bvoundaries covered by our models, if you want to change 0.01 to 0.001 or something else knock yourself out and tell me what you get :D Although note that if you increase the resolution too much you wont' necessarily have anythign interesting to see since there's 'only' 250,000 models total

  4. If you want to make CMDs you can use the CMD stuff out of the box in hoki! we have dozens of filters pre-calculated in the stellar models :D

HeloiseS commented 2 years ago

@pzyao just checking in - how did this work out? can we close the issue?

pzyao commented 2 years ago

Yes, it worked out perfectly. Thanks so much!

But I have a side question: is it possible to obtain the collective appearance of both the primary and secondary star in a binary system? Or is it possible to know which primary star corresponds to a secondary star in the same system (like a unique system identification number or something…) ? So far it seems I am just looking at the temperature and luminosity of the primary (L1) or secondary (if I switch to L2)

Thanks in advance!

On Thu, Oct 6, 2022 at 04:34 Heloise @.***> wrote:

@pzyao https://github.com/pzyao just checking in - how did this work out? can we close the issue?

— Reply to this email directly, view it on GitHub https://github.com/HeloiseS/hoki/issues/81#issuecomment-1269586773, or unsubscribe https://github.com/notifications/unsubscribe-auth/AYRVN65PCFUV6DKKV2CKBRLWB2FLDANCNFSM5XSMZVZQ . You are receiving this because you were mentioned.Message ID: @.***>

HeloiseS commented 2 years ago

At the moment there isn't. For the most part it shouldn't be an issue because most often the primary is so bright it will outshine the secondary. If you want to combine them it istricky to get an "observed" combined L and T because realistically you'd need to combine the spectra from stellar atmospheres (because the stars are not perfect blackbodies) and then estimate what an observer would take that as but as the recent Arcavi paper showed hot stars have large uncertainties on their temperatures.