Open williamjameshandley opened 4 years ago
is it possible to use a new type of object inside the pandas dataframe? Gives something smaller than any float, including -np.inf
class PriorZero(object):
def __eq__(self, other):
return isinstance(other, PriorZero)
def __lt__(self, other):
return True
def __gt__(self, other):
return False
def __le__(self, other):
return True
def __ge__(self, other):
return self == other
If not possible, similar but inherit from np.float64
class PriorZero(np.float64):
def __eq__(self, other):
return isinstance(other, PriorZero)
def __lt__(self, other):
return True
def __gt__(self, other):
return False
def __le__(self, other):
return True
def __ge__(self, other):
return self == other
not sure these are neccessarily the best way forward, just the only other possibilities I could imagine beyond the -np.inf and np.nan.
Hi @andrewfowlie. This is definitely possible and would achieve the same idea but with more robust treatment than nan. My concern is that even if derived from np.float, the pandas logL_birthcolumn would still have dtype object, and we would lose all the fast vectorised operations which make Python a reasonable choice for anesthetic.
I see. I’m not sure if my suggestion is the best way. But if performance is the only concern, might be worth quickly checking that the cost is as big as you fear it might be.
In nested sampling, there is a concept called 'logzero' which is important, but neither well-defined, nor consistently implemented.
In general, at runtime if a likelihood returns 'logzero' (which in MultiNest and PolyChord is user-configurable, defaulting a large floating point number
-1e30
), then nested sampling interprets this as 'excluded at the prior level', which has a non-trivial effect on the evidence. Small likelihoods larger than this e.g.-1e29
effectively do not participate in the posterior, but don't act to truncate the prior volume.There are two concepts here
As MultiNest and PolyChord are written in FORTRAN, which does not have a platform-independent implementation of 'infinity' or 'nan', the best they can do in that language is to have this slight hack of defining a threshold float below which values are 'prior zero' and have no exact notion of 'likelihood zero', but get away with small values of the loglikelihood a little above the threshold.
We can't do anything to change FORTRAN, but anesthetic can probably do better than this to avoid gotchas, and support a better notion with languages that have nans and infs. At the moment anesthetic assume np.inf represents zero prior, and still has the slight hack of 'very low likelihood' being equivalent to 'zero likelihood'.
One proposition which I would like to discuss is to define
This potentially has issues associated with the fact that nan is greater, less and equal to nothing (including itself), which would mean that any mask on logL_birth (for which the initial live points would have NaN in that column) would have to be carefully defined, but in my view this is unlikely to affect users.
@andrewfowlie, @lukashergt, @appetrosyan, thoughts and suggestions are welcome.
This has come up in a host of past issues #70, #71, #120, #122, Cobaya#77. Feel free to tag more that I have undoubtedly missed, and anybody else who might be interested in this issue.