ocbe-uio / BayesMallows

R-package for Bayesian preference learning with the Mallows rank model.
https://ocbe-uio.github.io/BayesMallows/
GNU General Public License v3.0
21 stars 9 forks source link

DRY SMC functions #114

Closed wleoncio closed 1 year ago

wleoncio commented 3 years ago

There are several repeated function bits that would be better off merged into their own functions. Look for functions with similar names, like smc_mallows_new_users_* and metropolis_hastings_aug_*.

anjastein commented 2 years ago

I agree, I think this can be resolved using several conditional statements, we could incorporate the following input variables into some functions:

osorensen commented 2 years ago

Some more DRY issues based on feedback from a user (!!):

This is confusing. My preferred solution would be that the SMC-Mallows functions returned objects of class SMCMallows or something similar, whereupon we could call plot() in the same way as we do for BayesMallows objects. I need to dive a bit deeper into this to understand how we best would package the output from the SMC functions.

Something like this:

wleoncio commented 2 years ago

Sounds like a good idea! Maybe plot.SMCMallows has a parameter argument to select between "rho" and "alpha"? Which of these (if any) should be the default?

osorensen commented 2 years ago

Perhaps alpha could be the default, since it's the default for plot.BayesMallows, and also the easiest thing to plot. However, I think we need to first create a function that returns SMCMallows objects, and this probably requires a bit of work. As of now, the user has to provide the correct data manually, but I envision that we instead could create an R function that returns SMCMallows objects containing the data required to do various plots and other outputs.

Currently, with compute_mallows we can do this

m <- compute_mallows(potato_visual)
plot(m, parameter = "rho", burnin = 100)

whereas with SMC-Mallows we have to do stuff like this (pasting from the vignette)

smc_test <- smc_mallows_new_users_complete(
  R_obs = data, n_items = n_items,
  metric = metric, leap_size = leap_size,
  N = N, Time = Time,
  logz_estimate = logz_estimate,
  mcmc_kernel_app = 5,
  num_new_obs = 5,
  alpha_prop_sd = 0.5,
  lambda = 0.15,
  alpha_max = 1e6
)
test_sample_rho <- smc_test$rho_samples[, , Time + 1]
plot_rho_posterior(
  output = test_sample_rho,
  nmc = N, burnin = 0, C = 1,
  colnames = colnames(sushi_rankings)
)
wleoncio commented 2 years ago

I agree that the processes for creating BayesMallows objects and SMCMallows objects should be more similar. Does a wrapper compute_smc_mallows() function for all the smc_mallows_new_X() functions (where X = {new_users_complete, new_users_partial, new_users_partial_alpha_fixed, new_item_rank, new_item_rank_alpha_fixed}) make sense methodologically?

Just for the record, the SMCMallows class is created by the internal function smc_processing(), which is called by the following exported functions: compute_posterior_intervals_rho(), compute_rho_consensus(), plor_rho_posterior().

osorensen commented 2 years ago

@anjastein, what do you think about @wleoncio's last comment? It sounds feasible to me, but I don't understand the SMC algorithm very well, so it would be good to hear what you think.

For example, compute_smc_mallows, could have a type argument or similar, which takes values in X = {new_users_complete, new_users_partial, new_users_partial_alpha_fixed, new_item_rank, new_item_rank_alpha_fixed}, and then return an object of class SMCMallows. If mod is the result of a call to compute_smc_mallows, this would let us type plot(mod), with a parameter argument taking values in {alpha, rho}, and eventually other parameters.

anjastein commented 2 years ago

Hi @osorensen and @wleoncio!

Sorry for my late reply, I was thinking about this over the past week and forgot to respond.

Yes, I agree with what you are both saying. It would be ideal if when estimating models using SMC-Mallows, we could use the plot.BayesMallows functions instead of my plot functions, such as plot_rho_posterior() and plot_alpha_posterior(), which, as @osorensen has said, do the same job. When I was working on my SMC project, I had copied and edited the plot functions from the BayesMallows R package because the object class issue meant that I could not call the original plot functions directly with success.

So, going forward, we probably need to set default values for the input arguments in the SMC-Mallows functions and modify the functions to return an object containing a list of various information, just like in compute_mallows. That way, we can call the existing plot functions in the BayesMallows package. Creating the type argument would definitely avoid duplicating more code!

@wleoncio, just to warn you now rather than later: I think the functions containing the words new_item_rank may need to be altered so that they either accept an MCMC output or a new_users_partial output as the initial particles that start the algorithm as it seems to work much more effectively, but this would cause more code duplication so the wrapper compute_smc_mallows would help a lot. I might need to flag this as a separate PR...

osorensen commented 2 years ago

Thanks @anjastein, then we're on the same page. Seems like we'll have work to do in the next year as well :-)

osorensen commented 1 year ago

Closed by #269.