Using BXA chains for error analysis

pboorm commented 1 year ago

BXA version: 4.0.5
UltraNest version: 3.3.3
Python version: 3.8.12
Xspec version: 12.12.1
Operating System: macOS Monterey 12.6

Description

When using the BXA-generated chain files in Xspec after running a BXA fit, Xspec cannot estimate uncertainties from the chain with e.g., "flux 2. 10. err 1000". The issue seems to be that the BXA chain file column headings are incompatible with the model used.

What I Did

The model fit was:

Model constant<1>*TBabs<2>(zTBabs<3>*cabs<4>*zpowerlw<5> + zgauss<6>) Source No.: 1   Active/On
Model Model Component  Parameter  Unit     Value
 par  comp
   1    1   constant   factor              1.00000      frozen
   2    2   TBabs      nH         10^22    4.00000E-02  frozen
   3    3   zTBabs     nH         10^22    37.1630      +/-  0.0          
   4    3   zTBabs     Redshift            5.00000E-02  frozen
   5    4   cabs       nH         10^22    37.1630      = p3
   6    5   zpowerlw   PhoIndex            1.88462      +/-  0.0          
   7    5   zpowerlw   Redshift            5.00000E-02  frozen
   8    5   zpowerlw   norm                1.37403E-03  +/-  0.0          
   9    6   zgauss     LineE      keV      6.41373      +/-  0.0          
  10    6   zgauss     Sigma      keV      8.41334E-02  +/-  0.0          
  11    6   zgauss     Redshift            5.00000E-02  frozen
  12    6   zgauss     norm                1.58220E-05  +/-  0.0

And the free parameter numbers in the BXA fit were: 3, 6, 8, 9, 10, 12. The corresponding chain.fits file generated by BXA had column headings (note the nH prior was added out of order of the other parameters):

PhoIndex__6, log(norm)__20, log(nH)__27, LineE__45, log(Sigma)__58, log(norm)__72, FIT_STATISTIC

The problem was fixed by manually changing the BXA chain file column headings, column orders and parameter units to match an equivalent Xspec-generated MCMC chain file:

Columns = nH__3, PhoIndex__6, norm__8, LineE__9, Sigma__10, norm__12, FIT_STATISTIC
Units = 10^22, , , keV, keV, , C-Statistic

JohannesBuchner commented 1 year ago

I don't quite understand how the numbers work.

Here is the current code: https://github.com/JohannesBuchner/BXA/blob/master/bxa/xspec/solver.py#L64

Previous issues: https://github.com/JohannesBuchner/BXA/issues/12 https://github.com/JohannesBuchner/BXA/issues/10

There is probably a way to get the number out from the model and parameter in the transformation list.

pboorm commented 1 year ago

Thanks! I had a go at tweaking the store_chain() function. A version I wrote that appears to work for models with one data group is:

def store_chain_universal(chainfilename, transformations, posterior, fit_statistic):
    """Writes a MCMC chain file in the same format as the Xspec chain command."""
    import astropy.io.fits as pyfits

    group_index = 1
    old_model = transformations[0]['model']

    indices = [t["index"] for t in transformations]
    names = []
    for t in transformations:
        if t['model'] != old_model:
            group_index += 1
        old_model = t['model']
        original_parname = AllModels(1)(t["index"]).name
        names.append('%s__%d' % (original_parname, t['index']))# + (group_index - 1) * old_model.nParameters))

    columns = [pyfits.Column(
        name=name, format='D', unit=AllModels(1)(transformations[i]["index"]).unit, array=t['aftertransform'](posterior[:, i]))
        for i, name in enumerate(names)]
    columns = list(np.array(columns)[np.argsort(indices)])

    columns.append(pyfits.Column(name='FIT_STATISTIC', format='D', array=fit_statistic))
    table = pyfits.ColDefs(columns)
    header = pyfits.Header()
    header.add_comment("""Created by BXA (Bayesian X-ray spectal Analysis) for Xspec""")
    header.add_comment("""refer to https://github.com/JohannesBuchner/""")
    header['TEMPR001'] = 1.
    header['STROW001'] = 1
    header['EXTNAME'] = 'CHAIN'
    tbhdu = pyfits.BinTableHDU.from_columns(table, header=header)
    tbhdu.writeto(chainfilename, overwrite=True)

This saves the column names as the original parameter names (e.g., before adding extensions around the parameter name that depend on the prior) as well as the units stored in the model parameters. However, the code is using "AllModels(1)", so will not work for models with multiple datagroups/models. I've left the corresponding code commented that I think was being used to account for this in the original version (i.e. here: https://github.com/JohannesBuchner/BXA/blob/631e9f40f101aba781f89f1d1e53ffcfe9262519/bxa/xspec/solver.py#L68).

When I load the chain generated with this function, the error on flux can be computed from the chain.

JohannesBuchner / BXA

Using BXA chains for error analysis #42

Description

What I Did