Added HPD calculation to ensemble

shoepfl commented 1 month ago

The ensemble from_sample method now has an additional rel_cutoff argument. rel_cutoff allows the posterior to be cut to the alpha % highest density (here, non-normalized posterior probability).

I checked the code with my workflow, and it worked. Anyway, I would be happy to receive suggestions for improvement.

Adresses #1117

codecov-commenter commented 1 month ago

:warning: Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

Attention: Patch coverage is 90.90909% with 2 lines in your changes missing coverage. Please review.

Project coverage is 83.44%. Comparing base (bd4b81b) to head (ca2bb38).

Files with missing lines	Patch %	Lines
pypesto/ensemble/ensemble.py	90.90%	2 Missing :warning:

:exclamation: Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files

```diff @@ Coverage Diff @@ ## develop #1431 +/- ## =========================================== + Coverage 83.41% 83.44% +0.02% =========================================== Files 161 161 Lines 13460 13482 +22 =========================================== + Hits 11228 11250 +22 Misses 2232 2232 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

dilpath commented 1 month ago

@shoepfl Thanks very much!

@PaulJonasJost any thoughts on how to consolidate the two different cutoff methods? With this PR, using rel_cutoff with Ensemble.from_optimization_* will give chi2-based cutoffs, but Ensemble.from_sample will give hpd-based cutoffs. Ideally, the user would be able to choose. Should we create some Cutoff class that can be used like so?

class Cutoff:
    ...
    def __call__(self, xs, fs): # or rather pass a list of vectors and their fvals?
        return self.apply_cutoff(xs=xs, fs=fs)
    @abstractmethod
    def apply_cutoff(self, xs=xs, fs=fs):
        ...  # specific rel/chi2/hpd cutoff method here
class RelCutoff(Cutoff): ...
class Chi2Cutoff(Cutoff): ...
class HpdCutoff(Cutoff): ...

cutoff = Chi2Cutoff(percentile=...)

ensemble = Ensemble.from_sample(result=result, cutoff=cutoff)

and inside Ensemble.from_sample:

def from_sample(..., cutoff: Cutoff):
    ...
    x_vectors = cutoff(xs=result.sample_result.trace_x[0, burn_in:, :], fs=result.sample_result.trace_fval[0, burn_in:])
    ...

We could also integrate things like remove_burn_in and chain_slice (see Ensemble.from_sample) into Cutoff.

PaulJonasJost commented 1 month ago

@shoepfl Thanks very much!

@PaulJonasJost any thoughts on how to consolidate the two different cutoff methods? With this PR, using rel_cutoff with Ensemble.from_optimization_* will give chi2-based cutoffs, but Ensemble.from_sample will give hpd-based cutoffs. Ideally, the user would be able to choose. Should we create some Cutoff class that can be used like so?
class Cutoff:
    ...
    def __call__(self, xs, fs): # or rather pass a list of vectors and their fvals?
        return self.apply_cutoff(xs=xs, fs=fs)
    @abstractmethod
    def apply_cutoff(self, xs=xs, fs=fs):
        ...  # specific rel/chi2/hpd cutoff method here
class RelCutoff(Cutoff): ...
class Chi2Cutoff(Cutoff): ...
class HpdCutoff(Cutoff): ...

cutoff = Chi2Cutoff(percentile=...)

ensemble = Ensemble.from_sample(result=result, cutoff=cutoff)
and inside Ensemble.from_sample:
def from_sample(..., cutoff: Cutoff):
    ...
    x_vectors = cutoff(xs=result.sample_result.trace_x[0, burn_in:, :], fs=result.sample_result.trace_fval[0, burn_in:])
    ...
We could also integrate things like remove_burn_in and chain_slice (see Ensemble.from_sample) into Cutoff.

I think as it is currently used rel_cutoff here might be renamed. If i understand your suggestion correctly, the main advantage would be a reduction in direct parameters when calling Ensemble.from_...with a potential easier overview? Removing burn in and chain slice are quite sample specific, i am not entirely sure whether from optimization could straightforward be used as equivalents. But the idea itself sounds good.

shoepfl commented 1 month ago

@PaulJonasJost, a test sounds good. Indeed, it would be nice if you could assist here, as there may already be similar test cases in the toolbox that we could add to.

PaulJonasJost commented 1 month ago

@PaulJonasJost, a test sounds good. Indeed, it would be nice if you could assist here, as there may already be similar test cases in the toolbox that we could add to.

I will look into it and write a test if you are fine with me editing the branch 👍🏼

shoepfl commented 1 month ago

yes of course, and if I can help you can just assign me some tasks

PaulJonasJost commented 4 weeks ago

added the test and renamed. Now i guess someone else has to review (@dilpath)?

ICB-DCM / pyPESTO

Added HPD calculation to ensemble #1431

Codecov Report