bayesmix-dev / bayesmix

Flexible Bayesian nonparametric mixture models in C++
https://bayesmix.rtfd.io
BSD 3-Clause "New" or "Revised" License
22 stars 18 forks source link

Improve run_mcmc output in bayesmixpy #103

Closed TeoGiane closed 2 years ago

TeoGiane commented 2 years ago

According to run_mcmc documentation given in bayesmixpy, the return type consists of four different objects, even though it admits None in case, for instance, dens_grid=None or via specific False flags, as shown in the notebooks.

However, at this stage you always need to specify all four outputs, even though you expect one to be None. Thus, an error arises when, for instance, you call:

numcluschain, cluschain, bestclus = run_mcmc("NNIG", "DP", data, hier_params, mix_params, algo_params, dens_grid=None) (example from gaussian_mix_uni.ipynb).

The expected outcome, for me, is to get all not-None outputs generated during run_mcmc, if possible. Clearly, is up to the end user give proper names to the variables it will get. A possible solution is to define a class which manages the sampling output and return a single object of that class (thus admitting empty members to manage None values).

brunoguindani commented 2 years ago

This is intended behavior. In fact, it used to work like that, then we agreed to switch to the current version (see here in the discussion on python/bayesmixpy/run.py). The user can simply assign the None return values to the throwaway variable _:

numcluschain, cluschain, bestclus, _ = run_mcmc("NNIG", "DP", data, hier_params, mix_params,
                                                algo_params, dens_grid=None)

Alternatively, you can store all return values in a tuple and access the appropriate position:

out = run_mcmc("NNIG", "DP", data, hier_params, mix_params, algo_params, dens_grid=None)
numcluschain = out[0]

IMO, it's confusing to have a variable amount of return values, as I argued in the thread I linked, but I'm down for discussion.

(We still need to update the notebook if we decide to stick to the current implementation.)

The idea of returning a simple container class is not too bad either, although arguably it's an overcomplication:

def run_mcmc():
  bestclus = 3
  class MCMCResults:
    pass
  out = MCMCResults()
  if bestclus:
    out.bestclus = bestclus
  # else: out.bestclus = None  # ?
  return out

run = run_mcmc()
print(run.bestclus)
TeoGiane commented 2 years ago

The user can simply assign the None return values to the throwaway variable _:

numcluschain, cluschain, bestclus, _ = run_mcmc("NNIG", "DP", data, hier_params, mix_params,
                                                algo_params, dens_grid=None)

Alternatively, you can store all return values in a tuple and access the appropriate position:

out = run_mcmc("NNIG", "DP", data, hier_params, mix_params, algo_params, dens_grid=None)
numcluschain = out[0]

Many thanks @brunoguindani! I was worried that this was not an intended behavior, since the examples you refer to actually doesn't work. This answer is perfect for me, I think we just need to update the tutorial notebooks to be consistent with this behavior.