ckoerber / lsqfit-gui

Graphical user interface for performing Bayesian Inference (Bayesian fits).
https://lsqfitgui.readthedocs.io
MIT License

Save prior #2

Closed ckoerber closed 3 years ago

ckoerber commented 3 years ago

Add the ability to export the prior specified in the sidebar in a form that can be pasted directly into another Python script, for example a JSON string.

This was suggested by @walkloud.

Suggestions for the output format

I see three options:

Pure JSON as tuples

{"key1": [0.5, 0.2]}

This is the plainest and most accurate representation, but it requires some parsing on the Python import side.

JSON as strings

{"key1": "0.5(0.2)"}

Some loss of precision, but easier to parse directly.

Python-like strings

import gvar as gv
prior = {"key1": gv.gvar(0.5, 0.2)}
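To make the comparison concrete, here is a minimal sketch of how the three formats could be generated from the same data. It assumes the prior is available as a plain dict of (mean, sdev) pairs; the function names are hypothetical and not part of lsqfit-gui, and the "mean(sdev)" formatting is deliberately naive (it is not gvar's own compact notation and may not round-trip for values printed in scientific notation).

```python
import json


def to_json_tuples(prior):
    """Option 1: each entry becomes a [mean, sdev] pair."""
    return json.dumps({key: [mean, sdev] for key, (mean, sdev) in prior.items()})


def to_json_strings(prior):
    """Option 2: each entry becomes a naive "mean(sdev)" string."""
    return json.dumps({key: f"{mean}({sdev})" for key, (mean, sdev) in prior.items()})


def to_python_snippet(prior):
    """Option 3: source text that rebuilds the prior with gvar when pasted."""
    entries = ", ".join(
        f"{json.dumps(key)}: gv.gvar({mean!r}, {sdev!r})"
        for key, (mean, sdev) in prior.items()
    )
    return f"import gvar as gv\nprior = {{{entries}}}"


prior = {"key1": (0.5, 0.2)}
print(to_json_tuples(prior))
print(to_json_strings(prior))
print(to_python_snippet(prior))
```

Only the third option yields text that can be pasted without any parsing step; the first two still need a small loader on the receiving side.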

What would you prefer?

Further thoughts

millernb commented 3 years ago

My preference would be YAML. E.g., thinking ahead to correlator fits:

meta_config:
   n_states: 2
   t_start: 7
   t_end: 18
prior:
   E0: 0.7(0.1)
   dE: [0.2(0.1), 0.2(0.2)]
   A_ss: [3(3)e-5, 3(3)e-5, 3(3)e-5]
   A_ps: [0(3)e-5, 0(3)e-5, 0(3)e-5]

Assuming the file is saved as fit_args.yaml, you can load meta/prior with

import yaml
import gvar as gv

with open('fit_args.yaml') as f:
    yaml_file = yaml.safe_load(f)
    meta_config = yaml_file['meta_config']
    prior = gv.gvar(yaml_file['prior'])

The downside is that pyyaml/gvar prefers string formatting when creating a gvar from this file. For example, if you instead save meta/prior like this

meta_config:
   n_states: 2
   t_start: 7
   t_end: 20
prior:
   E0: [0.7, 0.1]
   dE: [[0.2, 0.1], [0.2, 0.2]]
   A_ss: [[3.0e-5, 3.0e-5], [3.0e-5, 3.0e-5], [3.0e-5, 3.0e-5]]
   A_ps: [[0, 3.0e-5], [0, 3.0e-5], [0, 3.0e-5]]

then you would instead need to run the following to load the prior.

import yaml
import gvar as gv

with open('fit_args.yaml') as f:
    yaml_file = yaml.safe_load(f)
    meta_config = yaml_file['meta_config']

    # extra step: build the prior key by key
    prior = gv.BufferDict()
    for key in yaml_file['prior']:
        prior[key] = gv.gvar(yaml_file['prior'][key])

(Also, pyyaml is a bit finicky when it comes to scientific notation -- 3.0e-5 is treated as a float, but 3e-5 is treated as a string; obviously this issue is circumvented if you use string-formatted gvars instead.)

However, we'd probably need to do the latter if we want to also save the prior covariance matrix.
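The scientific-notation quirk mentioned above is easy to reproduce. The sketch below shows it and one blunt workaround: a hypothetical helper, `coerce_floats`, that converts any string Python's `float()` accepts. It assumes PyYAML is installed; the key names are made up for the illustration.

```python
import yaml

doc = """
A_ok: 3.0e-5
A_str: 3e-5
"""

loaded = yaml.safe_load(doc)
print(type(loaded["A_ok"]).__name__)   # float
print(type(loaded["A_str"]).__name__)  # str


def coerce_floats(value):
    """Recursively turn strings that float() accepts into floats."""
    if isinstance(value, dict):
        return {key: coerce_floats(item) for key, item in value.items()}
    if isinstance(value, list):
        return [coerce_floats(item) for item in value]
    if isinstance(value, str):
        try:
            return float(value)
        except ValueError:
            return value  # e.g. "0.7(0.1)" is left untouched
    return value


fixed = coerce_floats(loaded)
```

Note that this also converts strings such as "nan" or "inf", so it is only suitable where every string entry is meant to be numeric or a gvar-style string.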

ckoerber commented 3 years ago

As there appear to be even more choices than initially thought, it seems the interface should eventually let the user select the format 😅

ckoerber commented 3 years ago

I have started to work on the export feature on the export-prior-branch.

The infrastructure is set up, and it should be relatively straightforward to implement different formats. However, the yaml version I am using seems to print a lot of overhead for the respective classes. Do you already have a good encoder for this, @millernb?
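For context, the overhead is most likely the `!!python/object` tags that PyYAML's default dumper emits for arbitrary Python objects. One possible way around it, sketched here under assumptions, is a custom representer on a `SafeDumper` subclass. `Gaussian` is a hypothetical stand-in class, not `gvar.GVar`, and whether the same approach works unchanged for the real gvar classes is untested.

```python
from dataclasses import dataclass

import yaml


@dataclass
class Gaussian:
    """Hypothetical stand-in for a value with mean and standard deviation."""
    mean: float
    sdev: float


# Default dumper: the object is written with a python/object tag.
verbose = yaml.dump(Gaussian(0.7, 0.1))


class PriorDumper(yaml.SafeDumper):
    """Dumper that knows how to write Gaussian as a plain list."""


def represent_gaussian(dumper, value):
    # Emit a flow-style [mean, sdev] sequence without any class tag.
    return dumper.represent_sequence(
        "tag:yaml.org,2002:seq", [value.mean, value.sdev], flow_style=True
    )


PriorDumper.add_representer(Gaussian, represent_gaussian)

compact = yaml.dump({"E0": Gaussian(0.7, 0.1)}, Dumper=PriorDumper)
print(verbose)
print(compact)
```

The alternative is to convert everything to plain strings, floats, and lists before calling `yaml.dump`, which avoids custom dumpers entirely.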

See also the relevant file: https://github.com/ckoerber/lsqfit-gui/blob/export-prior-widget/lsqfitgui/frontend/widgets/export_prior.py

millernb commented 3 years ago

> Do you already have a good encoder for this, @millernb?

I think so, somewhere. I'll dig it out in a bit.

millernb commented 3 years ago

Example prior:

import gvar as gv
import numpy as np

prior = gv.BufferDict()
prior['dE'] = gv.gvar(np.arange(1, 5), np.arange(1, 5))
prior['E0'] =  gv.gvar(1, 1)

Save prior as string/load-in

import yaml
import gvar as gv

# save prior
def save_prior(prior):
    def gv_dict_to_str(gv_dict):
        output = {}
        for key in gv_dict:
            if hasattr(gv_dict[key], '__len__'):
                output[key] = [str(g) for g in gv_dict[key]]
            else:
                output[key] = str(gv_dict[key])
        return output

    output = gv_dict_to_str(prior)

    with open('./prior.yaml', 'w') as file:
        yaml.dump(output, file, default_flow_style=None, sort_keys=False)

    return None

# load prior
def load_prior():
    with open('./prior.yaml') as f:
        yaml_file = yaml.safe_load(f)
        prior = gv.gvar(yaml_file)

    return prior

This has the desired yaml output

dE: [1.0(1.0), 2.0(2.0), 3.0(3.0), 4.0(4.0)]
E0: 1.0(1.0)

Save prior as floats

import yaml
import gvar as gv

# save prior
def save_prior(prior):
    def gv_dict_to_float(gv_dict):
        output = {}
        for key in gv_dict:
            if hasattr(gv_dict[key], '__len__'):
                output[key] = [[g.mean, g.sdev] for g in gv_dict[key]]
            else:
                output[key] = [gv_dict[key].mean, gv_dict[key].sdev]
        return output

    output = gv_dict_to_float(prior)

    with open('./prior.yaml', 'w') as file:
        yaml.dump(output, file, default_flow_style=None, sort_keys=False)

    return None

# load prior
def load_prior():
    with open('prior.yaml') as f:
        yaml_file = yaml.safe_load(f)

        prior = gv.BufferDict() # extra stuff below
        for key in yaml_file:
            prior[key] = gv.gvar(yaml_file[key])
        return prior

In this case, the formatting is slightly different from what I described previously, but it's still a valid yaml file.

dE:
- [1.0, 1.0]
- [2.0, 2.0]
- [3.0, 3.0]
- [4.0, 4.0]
E0: [1.0, 1.0]
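A quick check that the two layouts are equivalent: the block-style output above and the nested flow-style form load to the same Python data. This sketch assumes PyYAML is installed and uses only `yaml.safe_load`.

```python
import yaml

# Layout produced by yaml.dump(..., default_flow_style=None) above.
block_style = """
dE:
- [1.0, 1.0]
- [2.0, 2.0]
- [3.0, 3.0]
- [4.0, 4.0]
E0: [1.0, 1.0]
"""

# Fully nested flow-style layout, as in the earlier fit_args.yaml example.
flow_style = """
dE: [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [4.0, 4.0]]
E0: [1.0, 1.0]
"""

block_data = yaml.safe_load(block_style)
flow_data = yaml.safe_load(flow_style)
print(block_data == flow_data)
```

Since both parse to the same dict of lists, the same `load_prior` function can read either layout.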