tomasstolker / species

Toolkit for atmospheric characterization of directly imaged exoplanets
https://species.readthedocs.io
MIT License

Calculating the reddened bolometric luminosity #105

Open gabrielastro opened 2 weeks ago

gabrielastro commented 2 weeks ago

When fitting atmospheric models with added extinction, would it be possible to report (in the corner plot, to screen, etc.) the reddened bolometric luminosity of the photospheric part only? Currently, species reports the intrinsic L_bol, which is physical and practical, but for comparison with other analyses that may or may not include extinction, it would also be practical to have the "observed" L_bol, excluding the extra blackbody component if there is one. Of course, the user could do this "by hand" by integrating the reddened atmospheric models, but that is a bit involved…

Thanks if this is possible and/or easy!

tomasstolker commented 2 weeks ago

There is a function for that :-). You can use integrate_spectrum of ReadModel and provide a dictionary with model parameters as input. For creating the posterior of the reddened luminosity, you can use get_samples and then run a for loop over all samples and call integrate_spectrum for each sample.

gabrielastro commented 2 weeks ago

Ooh! Sorry and thanks. (To be complete, I guess it would be good if the docstring of integrate_spectrum() mentioned that if extinction parameters are present, the bolometric luminosity will be the extincted one; it currently says that it is the same as from Teff and radius "in principle", but that is an obscure way of referring to extinction :wink:. Also, the mention "unless the atmospheric model had not properly converged" is meaningful only in some contexts.)

Once I have a for loop calculating the reddened Lbol of every sample, are there specific species tools to calculate the median, ±1 sigma region, etc.?

tomasstolker commented 2 weeks ago

The percentiles can be calculated as np.percentile(luminosity_list, q=[16, 50, 84]) (see docs)
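A minimal, self-contained illustration of that call, using a made-up array of log-luminosities in place of the actual posterior samples:

```python
import numpy as np

# hypothetical posterior of log(L/Lsun) values (placeholder data)
luminosity_list = np.array([-3.70, -3.66, -3.65, -3.63, -3.60])

low, median, high = np.percentile(luminosity_list, q=[16, 50, 84])

# report as median with +1 sigma / -1 sigma offsets
print(f'logLbol = {median:.2f} +{high - median:.2f} -{median - low:.2f} dex')
```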

gabrielastro commented 2 weeks ago

Thanks! Something like the following?

import numpy as np
from copy import deepcopy

samples = database.get_mcmc_spectra(tag='mytag', random=123, wavel_range=None, spec_res=300)

# set up the arrays: extincted, intrinsic
logLbolext = np.zeros(len(samples))
logLbolint = np.zeros(len(samples))

for i in range(len(samples)):
    logLbolext[i] = read_model.integrate_spectrum(samples[i].parameters)

    # use "deepcopy" to avoid changing the samples
    pardict = deepcopy(samples[i].parameters)

    # use "….pop(…, None)" to avoid errors if the parameters were not present
    pardict.pop('ism_ext', None)
    pardict.pop('ism_red', None)
    logLbolint[i] = read_model.integrate_spectrum(pardict)

# get the median and the +-1 sigma range
prot = np.percentile(logLbolext, q=[16, 50, 84])
pint = np.percentile(logLbolint, q=[16, 50, 84])

print(' logLbol,extincted = %.2f +%.2f -%.2f dex' % (prot[1], prot[2] - prot[1], prot[1] - prot[0]))
print(' logLbol,intrinsic = %.2f +%.2f -%.2f dex' % (pint[1], pint[2] - pint[1], pint[1] - pint[0]))

For the benefit of others, feel free to correct this code for style or speed… And ultimately, maybe an extra row and column in the corner plot could be a sensible option :wink:, especially since one could see the correlations. It would also have the advantage of directly using the full sampling of the posterior; I called this code twice with 1234 samples (more than one might normally use for plotting, for instance) and got:

 logLbol,extincted = -3.66 +0.03 -0.03 dex
 logLbol,intrinsic = -3.43 +0.04 -0.06 dex

and

 logLbol,extincted = -3.67 +0.03 -0.02 dex
 logLbol,intrinsic = -3.43 +0.05 -0.06 dex

i.e., I-need-to-go-through-the-paper-again-and-make-numbers-match kind of differences… :slightly_smiling_face:.

tomasstolker commented 2 weeks ago

Looks good! You have become familiar with Python I see :-).

The spectra are interpolated twice (i.e. with get_mcmc_spectra and integrate_spectrum), but with ~1000 spectra that might still be fine.

Instead of using get_mcmc_spectra, you could also use get_samples and set random=1234, which simply returns an array with the parameters.

gabrielastro commented 2 weeks ago

Thanks for the feedback… Well, it did take me long-ish to come up with that and it does not quite feel elegant.

get_samples() sounds better! However, one has to juggle a bit to build a dictionary with the parameter names and their values for every call of the interpolation function, right? (I will try it a bit later.)

Indeed already with 1000 spectra it was taking some time, and I am worried about the variance… Are those not good reasons for a built-in feature :wink:?

tomasstolker commented 2 weeks ago

The get_samples() function returns a SamplesBox, which also contains a list with the parameter names, so it should hopefully be straightforward to create parameter dictionaries from that. The for loop should then go over all the rows in the array with samples.

gabrielastro commented 2 weeks ago

Hmm, now this is getting advanced for me… This is as far as I got, and playing around did not help:

samplesbox = database.get_samples(tag='mytag', random=1234)

# save array of parameter names once
pars = samplesbox.parameters

for i in range(len(samplesbox.samples[:, 0])):
    # The following DOES NOT WORK because it leads
    #  e.g. to {'teff': (2017.5539000965173,), …}, i.e., values that are one-element tuples
    pardict = dict(zip(pars, zip(samplesbox.samples[i, :])))

P.s.: I renamed pars to pardict in the little code segment above.
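For what it is worth, the one-element tuples come from the inner zip: calling zip on a single iterable yields 1-tuples. Dropping the inner zip pairs the names directly with the row values. A small self-contained demonstration (with a made-up two-column samples array, not the actual posterior):

```python
import numpy as np

pars = ['teff', 'radius']
samples = np.array([[2017.55, 1.3],
                    [1980.10, 1.4]])

# the inner zip wraps each value in a 1-tuple:
bad = dict(zip(pars, zip(samples[0, :])))
# bad == {'teff': (2017.55,), 'radius': (1.3,)}

# zipping the names directly with the row gives scalar values:
good = dict(zip(pars, samples[0, :]))
# good == {'teff': 2017.55, 'radius': 1.3}
```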

tomasstolker commented 2 weeks ago

Perhaps something like this? I didn't test it:

samplesbox = database.get_samples(tag='mytag', random=1234)
pars = samplesbox.parameters

for i in range(samplesbox.samples.shape[0]):
    pardict = {}
    for j, par in enumerate(pars):
        pardict[par] = samplesbox.samples[i, j]
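Putting the pieces of this thread together, the full pattern is: loop over the rows of the samples array, build a parameter dictionary per row, integrate each spectrum, and take percentiles at the end. A self-contained sketch with toy stand-ins for the SamplesBox contents and for integrate_spectrum (the real objects come from species; the formula below is only a placeholder that dims the luminosity by the extinction):

```python
import numpy as np

rng = np.random.default_rng(1)
n_samples = 100

# toy stand-ins for samplesbox.parameters and samplesbox.samples
pars = ['teff', 'radius', 'ism_ext']
samples = np.column_stack([rng.normal(2000.0, 50.0, n_samples),  # teff (K)
                           rng.normal(1.3, 0.1, n_samples),      # radius (Rjup)
                           rng.normal(2.0, 0.3, n_samples)])     # A_V (mag)

def integrate_spectrum(pardict):
    # toy stand-in, NOT the species function: a log-luminosity-like
    # number, dimmed by 0.4 dex per magnitude of extinction
    logl = np.log10(pardict['radius'] ** 2 * (pardict['teff'] / 5772.0) ** 4)
    return logl - 0.4 * pardict['ism_ext']

logLbol = np.zeros(samples.shape[0])

for i in range(samples.shape[0]):
    # build the parameter dictionary for this sample
    pardict = {par: samples[i, j] for j, par in enumerate(pars)}
    logLbol[i] = integrate_spectrum(pardict)

low, median, high = np.percentile(logLbol, q=[16, 50, 84])
print(f'logLbol = {median:.2f} +{high - median:.2f} -{median - low:.2f} dex')
```

The dictionary comprehension is equivalent to the explicit inner loop above; with the real SamplesBox, the same loop body would call read_model.integrate_spectrum(pardict) instead of the toy function.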