yuanzunli / kdeLF

A Flexible Method for Estimating Luminosity Functions via Kernel Density Estimation
MIT License

How to get an analytic function of the final LF shape #2

Closed alessandropeca closed 2 years ago

alessandropeca commented 2 years ago

Dear developers,

Your code kdeLF can fit a dataset, and with the MCMC method it is also possible to derive the bandwidth parameters hs and the sensitivity parameter $\beta$.

Is there any way to retrieve an analytic form, i.e., an equation, of the best-fit LF? Having that, it would be possible to compare the obtained LF with other LFs available in the literature.

yuanzunli commented 2 years ago

You can try:

from kdeLF import kdeLF 
lf2 = kdeLF.KdeLF(sample_file='data.txt', solid_angle=omega, zbin=[Z1,Z2], f_lim=f_lim, 
                  H0=H0, Om0=Om0, small_sample=False, adaptive=True)
lf2.get_optimal_h() 

You may obtain something like the following:

z & L data loaded
Maximum likelihood estimation by scipy.optimize 'Powell' method,
redshift bin: ( 0.01 , 4.9 )
sample size of this bin: 7872
bandwidths for 2d estimator,
    Initial h1 & h2:      [0.15, 0.15]
    bounds for h1 & h2:   [(0.001, 1.0), (0.001, 1.0)]
    Optimal h1 & h2:
         0.3019 0.1059
pilot bandwidths:
    h1p & h2p: 0.3019 0.1059 

global bandwidths and beta for adaptive 2d estimator,
    Initial h10, h20 & beta:      [0.15 0.15 0.3 ]
    bounds for h10, h20 & beta:   [(0.001, 1.0), (0.001, 1.0), (0.01, 1.0)]
    Optimal h10, h20 & beta:
         0.1151 0.0606 0.2505 

Cost total time: 21.92  second

Out:   array([0.1151027 , 0.06057506, 0.25046887])

The array [0.1151027, 0.06057506, 0.25046887] contains the optimal bandwidths and beta for the adaptive KDE, given by maximum likelihood estimation. You can then evaluate the logarithmic LF at any point (z, L) within the domain of your survey by

lf2.log10phi(z, L, theta=[0.1151027, 0.06057506, 0.25046887])

Of course, if you want to use the result of the MCMC fit, just replace the theta array with the one given by the MCMC.
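Since KDE is nonparametric, there is no closed-form equation to write down; a practical substitute (a sketch, not part of kdeLF's API) is to tabulate the logarithmic LF on a (z, log L) grid and wrap the table in an interpolator, which then acts as a cheap callable for comparison with literature LFs. Here `tabulated_log10phi` is a hypothetical stand-in for the fitted `lf2.log10phi`:

```python
import numpy as np
from scipy.interpolate import RegularGridInterpolator

# Hypothetical stand-in for lf2.log10phi(z, L, theta=...);
# replace it with the real method once the KdeLF object is fitted.
def tabulated_log10phi(z, logL):
    return -3.0 - 0.5 * (logL - 45.0) ** 2 + 0.2 * z  # toy surface

# Tabulate on a grid covering the survey domain.
z_grid = np.linspace(0.01, 4.9, 50)
logL_grid = np.linspace(42.0, 47.0, 60)
Z, LL = np.meshgrid(z_grid, logL_grid, indexing="ij")
table = tabulated_log10phi(Z, LL)

# The interpolator behaves like an "analytic" callable log10 Phi(z, logL).
log10phi_interp = RegularGridInterpolator((z_grid, logL_grid), table)

print(log10phi_interp([[1.0, 45.0]])[0])
```

The grid resolution is a trade-off: a finer grid costs more calls to `log10phi` but makes the interpolated surface smoother for plotting and residual comparisons.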

alessandropeca commented 2 years ago


Hi, another quick question: let's say I have objects up to redshift 4. I can still build the kdeLF object like

lf2 = kdeLF.KdeLF(sample_file='data.txt', solid_angle=omega, zbin=[0., 6], f_lim=f_lim, 
                  H0=H0, Om0=Om0, small_sample=False, adaptive=True)

Now I have an estimate up to redshift 6. How reliable is it? Do you recommend this?

yuanzunli commented 2 years ago

@alessandropeca We generally set Z2 slightly larger than the maximum redshift of the sample, and Z1 slightly smaller than the minimum redshift. In your case, you should set Z2=4.0. If you use Z2=6, you can indeed obtain an estimate up to redshift 6, but it is not reliable. Remember that KDE is a nonparametric method: any estimate far from where the data are concentrated is an extrapolation.
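This caveat can be illustrated with a generic 1D KDE (scipy's `gaussian_kde`, not kdeLF itself): fit a sample confined to z < 4, and the density estimated at z = 6 is just the far tail of the outermost kernels, carrying no information about the true distribution there.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
# Toy "redshift" sample confined to 0 < z < 4.
z_sample = rng.uniform(0.0, 4.0, size=5000)

kde = gaussian_kde(z_sample)

# Inside the data range the estimate tracks the true density (~0.25);
# at z = 6 it is pure extrapolation from the Gaussian kernel tails.
print(kde(2.0)[0])  # interior point
print(kde(6.0)[0])  # extrapolated point, vanishingly small
```

The same logic applies in 2D: kdeLF's estimate is trustworthy only where the (z, L) plane is actually populated by the sample.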