rhayes777 / PyAutoFit

PyAutoFit: Classy Probabilistic Programming
https://pyautofit.readthedocs.io/
MIT License

Desired Behaviour of Unfinished SearchGridSearch 2 #397

Closed Jammy2211 closed 2 years ago

Jammy2211 commented 2 years ago

I am running the script linked below, pressing Ctrl+C halfway through the grid search, and then commenting out the following line:

grid_search_result = grid_search.fit(
    model=model, analysis=analysis, grid_priors=[model.gaussian.centre], parent=parent
)

https://github.com/Jammy2211/autofit_workspace_test/blob/main/database/directory/grid_search.py

This puts me in a position where I am attempting to scrape a grid search result which is unfinished.

If completed_only=False and we attempt to build a half-finished grid search, what is the desired behaviour?

Ideally, we would have all results accessible to us via the aggregator. I think this might actually be the case already, although I can't check yet because of the problem described in the other issue on this topic.

I think the more difficult thing we need to think about is the GridSearchResult, which is a pickle file output separately. From what I can tell, it is output and updated at the end of every search in the grid. So, if 2 of the 4 searches have finished, it will hold results such as the log_likelihood for only those 2 searches.

Ideally, if a GridSearch is unfinished, we would still have access to all of the GridSearchResult's attributes. For example, the GridSearchResult property log_likelihoods_native would still be available, but it would contain None wherever a search is unfinished.

The reason for this is that we have long supercomputer runs using grid searches, where on-the-fly analysis is hugely beneficial.
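
A minimal sketch of the None-filled behaviour described above, written in plain Python. The function name `to_native` and the example values are hypothetical and are not part of PyAutoFit's API. The sketch only assumes that the per-search values arrive as a flat, row-major list with None in place of each unfinished search.

```python
from math import prod
from typing import Any, List, Optional, Sequence


def to_native(values: Sequence[Optional[float]], shape: Sequence[int]) -> List[Any]:
    """Reshape a flat, row-major list into nested lists of the given shape.

    None entries (unfinished searches) are kept as None in the output.
    """
    if len(values) != prod(shape):
        raise ValueError(f"expected {prod(shape)} values, got {len(values)}")
    if len(shape) == 1:
        return list(values)
    # Number of flat entries that belong to each slice along the first axis.
    step = prod(shape[1:])
    return [
        to_native(values[i * step:(i + 1) * step], shape[1:])
        for i in range(shape[0])
    ]


# Hypothetical 2x2 grid in which only the first two searches have finished.
log_likelihoods = [-12.5, -10.0, None, None]
native = to_native(log_likelihoods, shape=(2, 2))
print(native)  # [[-12.5, -10.0], [None, None]]
```

Keeping None rather than a numeric placeholder means an unfinished cell cannot be mistaken for a real value, at the cost of callers having to check for None before doing arithmetic.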

Jammy2211 commented 2 years ago

The first issue I have hit occurs when I load a set of unfinished grid search results (e.g. with completed_only=False). The initial aggregator works as expected: if I switch to completed_only=True, the length of the aggregator decreases by 1, because it no longer contains the grid search result.

However, the following line which I use to filter out all results except the grid search returns an aggregator with a length of 0:

agg_grid_searches = agg.grid_searches()

This works fine when the grid search is complete.

Jammy2211 commented 2 years ago

grid_search_result = list(agg_grid_searches)[0]['result']
print(grid_search_result.best_result)

Gives the following error:

  File "database/directory/grid_search.py", line 157, in <module>
    print(grid_search_result.best_result)
  File "/mnt/c/Users/Jammy/Code/PyAuto/PyAutoFit/autofit/non_linear/grid/grid_search/result.py", line 157, in best_result
    or result.log_likelihood > best_result.log_likelihood
TypeError: '<' not supported between instances of 'float' and 'NoneType'
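
The traceback indicates that one of the compared log_likelihood values is None, presumably from an unfinished search. A minimal sketch of a selection that skips such entries is shown here. The function name `best_finished_index` is hypothetical and this is not the code in PyAutoFit's result.py; it only illustrates one way to tolerate missing values.

```python
from typing import Optional, Sequence


def best_finished_index(log_likelihoods: Sequence[Optional[float]]) -> Optional[int]:
    """Return the index of the highest log likelihood, ignoring None entries.

    Returns None when no search has finished yet.
    """
    finished = [
        (index, value)
        for index, value in enumerate(log_likelihoods)
        if value is not None
    ]
    if not finished:
        return None
    # max() compares only the log likelihoods; the first maximum wins on ties.
    return max(finished, key=lambda pair: pair[1])[0]


# Hypothetical values: two finished searches, two unfinished.
print(best_finished_index([-12.5, -10.0, None, None]))  # 1
print(best_finished_index([None, None]))  # None
```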

Jammy2211 commented 2 years ago

As does the following line, which surprises me given that it does not directly use the best_result:

print(grid_search_result.log_evidences_native)

rhayes777 commented 2 years ago

How are you getting this? I can't seem to replicate it.