rhayes777 / PyAutoFit

PyAutoFit: Classy Probabilistic Programming
https://pyautofit.readthedocs.io/
MIT License

Database support for CombinedAnalysis #705

Closed Jammy2211 closed 1 year ago

Jammy2211 commented 1 year ago

The following example script shows that general database functionality works for a normal Analysis object:

https://github.com/Jammy2211/autofit_workspace_test/blob/main/database/directory/general.py

The following script shows that for a combined analysis, it does not work.

https://github.com/Jammy2211/autofit_workspace_test/blob/main/database/directory/general.py

save_attributes_for_aggregator

The function save_attributes_for_aggregator is not called for a CombinedAnalysis and therefore the custom .pickle files are not output (e.g. data.pickle).
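One way the delegation could look, as a minimal self-contained sketch: the `Analysis` and `CombinedAnalysis` classes below are simplified stand-ins (not the real PyAutoFit classes), illustrating how each child analysis could write its pickles into its own numbered sub-directory.

```python
import os
import pickle
import tempfile

class Analysis:
    """Stand-in for an af.Analysis holding a dataset (illustration only)."""
    def __init__(self, data):
        self.data = data

    def save_attributes_for_aggregator(self, paths):
        # Output the dataset as data.pickle so the aggregator can reload it.
        with open(os.path.join(paths, "data.pickle"), "wb") as f:
            pickle.dump(self.data, f)

class CombinedAnalysis:
    """Sketch: delegate saving to each child analysis in its own sub-directory."""
    def __init__(self, *analyses):
        self.analyses = list(analyses)

    def save_attributes_for_aggregator(self, paths):
        for i, analysis in enumerate(self.analyses):
            sub_path = os.path.join(paths, f"analysis_{i}")
            os.makedirs(sub_path, exist_ok=True)
            analysis.save_attributes_for_aggregator(sub_path)

output_path = tempfile.mkdtemp()
combined = CombinedAnalysis(Analysis([1, 2, 3]), Analysis([4, 5, 6]))
combined.save_attributes_for_aggregator(output_path)
print(sorted(os.listdir(output_path)))  # ['analysis_0', 'analysis_1']
```

The sub-directory naming (`analysis_0`, `analysis_1`, ...) is an assumption for illustration, not the scheme PyAutoFit uses.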

The main functionality that needs to be implemented is therefore associated with this comment at the end of the script:

"""
__Data__

Loading data via the aggregator, to ensure it is output by the model-fit in pickle files and loadable.

In the `general.py` example this provides a list of `data` objects for every model-fit performed (in this example a 
list with a single entry for the single model-fit performed).

For a `CombinedAnalysis` this should provide a list of `data` objects for every model-fit performed, but where
every entry in the list is a list of `data` objects corresponding to each `data` used in each individual `Analysis`.

There is an example in the comment below.
"""
def _data_from(fit: af.Fit):
    """Return the `data` pickle associated with a single fit."""
    return fit.value(name="data")

data_gen = agg.map(func=_data_from)

print("Data via Data Gen:")
print([data for data in data_gen])

# If 5 model-fits are performed using CombinedAnalysis, each with 3 datasets, this should return the second data
# of the 4th model-fit.

# print(data_gen[3][1])
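To make the expected nesting concrete, here is a sketch in pure Python, with placeholder strings standing in for real `data` objects, of the structure the aggregator should return for 5 combined fits of 3 datasets each. Note that `agg.map` returns a generator, which would need to be materialised into a list before it can be indexed:

```python
# Simulated aggregator output: 5 model-fits, each a CombinedAnalysis over
# 3 datasets. Each inner entry stands in for one `data` object.
data_list = [
    [f"fit_{fit}_data_{d}" for d in range(3)] for fit in range(5)
]

# In the real script this would be: data_list = list(data_gen)
second_data_of_fourth_fit = data_list[3][1]
print(second_data_of_fourth_fit)  # fit_3_data_1
```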
Jammy2211 commented 1 year ago

Okay, so the function save_attributes_for_aggregator is actually called and the appropriate .pickle files are output.

It is simply the loading of these files via the aggregator that needs to be implemented.

rhayes777 commented 1 year ago

So, digging into this: it's because an aggregator aggregates searches, not analyses. We can add additional metadata to each analysis to allow it to aggregate on those instead.

It looks like the aggregator expects:

- .metadata
- .complete
- pickles/model.pickle

probably more...
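A sketch of how a per-analysis layout might look on disk, assuming (hypothetically) that the files above are mirrored into one sub-directory per analysis so the aggregator can treat each analysis like its own search:

```python
import os
import tempfile

root = tempfile.mkdtemp()

# Files the aggregator looks for at the search level (per the list above).
search_level = [".metadata", ".complete", "pickles/model.pickle"]

# Hypothetical mirroring: duplicate the aggregator-facing files into one
# directory per analysis of the CombinedAnalysis.
for name in ["analysis_0", "analysis_1"]:
    for rel in search_level:
        path = os.path.join(root, name, rel)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        open(path, "w").close()

# A directory qualifies for aggregation once all expected files are present.
def is_aggregatable(directory):
    return all(
        os.path.exists(os.path.join(directory, rel)) for rel in search_level
    )

found = sorted(
    d for d in os.listdir(root) if is_aggregatable(os.path.join(root, d))
)
print(found)  # ['analysis_0', 'analysis_1']
```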

Jammy2211 commented 1 year ago

It is worth checking whether .metadata is still used for anything, or is a remnant of an old implementation.

rhayes777 commented 1 year ago

It's for the aggregator basically

Jammy2211 commented 1 year ago

The .complete file records whether the non-linear search is complete, so it should not be on a per-analysis basis but for the overall search.

The pickles folder requires more thought: some pickles are probably at a per-analysis level (e.g. data.pickle is the data fitted by each Analysis). Something like model.pickle is probably the same for all Analysis objects, though. Or does it change depending on things like the with_free_parameters mechanism we use with Analysis objects?
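A toy illustration of why with_free_parameters matters here, using a plain dict as a stand-in for a real `af.Model` (the `with_free_parameter` function below is hypothetical, not the PyAutoFit API): once a parameter is freed per analysis, the models diverge, so a single shared model.pickle would no longer describe every Analysis.

```python
import copy

# Stand-in model: parameter names mapped to placeholder prior labels.
# A real af.Model is far richer; this only illustrates the point.
shared_model = {"centre": "shared_prior", "sigma": "shared_prior"}

# Sketch of what freeing a parameter effectively does: each analysis gets
# a copy of the model where the named parameter has its own prior.
def with_free_parameter(model, name, analysis_index):
    per_analysis = copy.deepcopy(model)
    per_analysis[name] = f"free_prior_for_analysis_{analysis_index}"
    return per_analysis

models = [with_free_parameter(shared_model, "centre", i) for i in range(3)]

# The models now differ per analysis, so one model.pickle cannot
# capture them all.
print(models[0] == models[1])  # False
```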