dfm / emcee

The Python ensemble sampling toolkit for affine-invariant MCMC
https://emcee.readthedocs.io
MIT License
1.47k stars 430 forks source link

Documentation for integrated_time still specifies discontinued ``axis`` keyword argument #337

Open ljschumacher opened 4 years ago

ljschumacher commented 4 years ago

General information:

Problem description: The docstring for the integrated_time function still mentions the argument 'axis', which does not seem to be implemented anymore. It would be great if the axis argument could be specified, to avoid having to swap axes before passing to this function.

See lines 47 onwards in autocorr.py

    """Estimate the integrated autocorrelation time of a time series.

    This estimate uses the iterative procedure described on page 16 of
    `Sokal's notes <http://www.stat.unc.edu/faculty/cji/Sokal.pdf>`_ to
    determine a reasonable window size.

    Args:
        x: The time series. If multidimensional, set the time axis using the
            ``axis`` keyword argument and the function will be computed for
            every other axis.
warrickball commented 3 years ago

I just ran into this myself. I was poking around in the hope of possibly open a PR but I realised that I don't understand the autocorrelation well enough to implement something myself. I'm happy to just remove any reference to the axis argument. I don't mind looping over dimensions and walkers myself.

On a related note, though, I noticed that integrated_time seems to expect the time series in a different order to the default chain layout. Specifically, line 85 of autocorr.py unpacks the dimensions as

    n_t, n_w, n_d = x.shape

Assuming n_t, n_w and n_d are the numbers of steps, walkers and dimensions, respectively, then the default chain structure I get is (n_w, n_t, n_d). I think this is the axis-swapping that @ljschumacher referred to above.

dfm commented 3 years ago

The order is actually correct if you use the (preferred) get_chain method instead of the chain property. In the back end, the chain is stored as n_t, n_w, n_d as expected, but the chain property has the "wrong" shape for backwards compatibility:

https://github.com/dfm/emcee/blob/f0489d74250e2034d76c3a812b9afae74c9ca191/src/emcee/ensemble.py#L489

I did at one point try re-implementing the axis feature, but ran into some bookkeeping issues that I don't totally remember... I'd be happy to review a PR that implements this feature, but perhaps in the meantime I should just fix that docstring :D