ropensci / babette

babette is an R package to work with BEAST2
https://docs.ropensci.org/babette
GNU General Public License v3.0
44 stars 6 forks source link

freqParameter distribution request #96

Closed GaryNapier closed 3 years ago

GaryNapier commented 3 years ago

Is your feature request related to a problem? Please describe. In the Beauti Priors there is a parameter freqParameter, which is something to do with the Site Model (when I change the site model to JC69, the freqParameter goes away).

There doesn't seem to be any options to change this in Babette, but I have tried the code below (which I'm not sure is pertaining to the same thing).

Describe the solution you'd like Once the GTR model is selected, the default in Beauti is a Uniform distribution on the freqParameter prior, but there are options for other distributions and their corresponding parameters.

Screen Shot 2021-03-24 at 13 10 32

Describe alternatives you've considered I was hoping to be able to change this in Babette, but there seems to be no option to do this. The closest I have found is:

gamma_site_distr <- create_log_normal_distr(value = 0.25, lower = 0.0, upper = 1.0, m = 1, s = 1.25)
create_gamma_site_model(gamma_cat_count = 4, gamma_shape_prior_distr = gamma_site_distr)
site_model <- create_gtr_site_model(gamma_site_model)

However when I run this I get the error Error: beautier::is_id(id) is not TRUE.

I'm not sure at this point if I'm even on the right lines in terms of changing the right parameter in the right way (for example I'm not too sure what "gamma_cat_count" is or what I should change it to - I just know it has to be greater than 2, so I presume 4 is the number of nucleotides).

Here is my full code:

rm(list=ls())

remotes::install_github("ropensci/beautier")

library(babette)
library(seqinr)

remove_tail <- function(x, sep = "_", del = 1){
  sapply(strsplit(x, split = sep, fixed = TRUE),
         function(i) paste(head(i, -del), collapse = sep))
}

# DIRECTORIES

setwd("~/Documents/transmission/")

fasta_dir <- "fasta/"
metadata_local_dir <- "metadata/"

# FILES
fasta_file <- paste0(fasta_dir, "THAILAND_TEST.clust_1.dated.fa")
fasta_id_date_df_outfile <- paste0(metadata_local_dir, remove_tail(basename(fasta_file), sep = "."), ".txt")

# READ IN FILES
fasta <- read.fasta(file = fasta_file, forceDNAtolower = F)

# Get fasta sample names and parse into name and date
fasta_names <- names(fasta)

fasta_id_date_df <- do.call(rbind, lapply(strsplit(fasta_names ,"_"), function(x){
  data.frame(id = paste(x[1:(length(x))], collapse = "_"), year = x[length(x)])
  }))

# Save df of ids and year as csv for model
write.table(fasta_id_date_df, file = fasta_id_date_df_outfile, quote = F, row.names = F, col.names = F, sep = "\t")

# SITE MODEL
gamma_site_distr <- create_log_normal_distr(value = 0.25, lower = 0.0, upper = 1.0, m = 1, s = 1.25)
gamma_site_model <- create_gamma_site_model(gamma_cat_count = 4,
                                            gamma_shape_prior_distr = gamma_site_distr)

site_model <- create_gtr_site_model(gamma_site_model)

# CLOCK MODEL
clock_rate <- 0.0000001
clock_model <- create_strict_clock_model(clock_rate_param = create_clock_rate_param(value = clock_rate),
                                         clock_rate_distr = create_log_normal_distr(value = clock_rate, m = 1, s = 1.25))

# MCMC
every <- 10000

mcmc <- create_mcmc(
  chain_length = 1e+08,
  tracelog = beautier::create_tracelog(log_every = every),
  screenlog = beautier::create_screenlog(log_every = every),
  treelog = beautier::create_treelog(log_every = every)
)

# TREE PRIOR
tree_prior <- create_ccp_tree_prior(
  pop_size_distr = create_log_normal_distr(
    m = 1,
    s = 1.25,
    value = 100.0,
    lower = 0.0,
    upper = 200.0
  ))

# MRCA PRIOR
mrca_prior <- create_mrca_prior(
  is_monophyletic = TRUE, 
  mrca_distr = create_laplace_distr(mu = 1990), 
  name = "TreePrior")

# MAKE XML
create_beast2_input_file(
  fasta_file,
  "beast_xml/THAILAND_TEST.babette.xml",
  site_model = site_model,
  clock_model = clock_model,
  tree_prior = tree_prior,
  mrca_prior = mrca_prior,
  mcmc = mcmc,
  tipdates_filename = fasta_id_date_df_outfile, 
  beauti_options = beautier::create_beauti_options(nucleotides_uppercase = T)
)

Additional context

Please let me know if I've overlooked how to do this with Babette!

As with my other issues, absolutely no rush for this!

Apologies if this is actually a bug rather than a feature request.

Many thanks

richelbilderbeek commented 3 years ago

Hi @GaryNapier, thanks again for another Issue!

The error Error: beautier::is_id(id) is not TRUE. appears to be an overprotective assert statement somewhere to me.

I'll take a look this weekend, then notify you of a new version to test :grin:

GaryNapier commented 3 years ago

Great, thanks again Richel!

richelbilderbeek commented 3 years ago

The error was is

site_model <- create_gtr_site_model(gamma_site_model)

this implicitly calls:

site_model <- create_gtr_site_model(id = gamma_site_model)

Now there is a proper error message.

GaryNapier commented 3 years ago

Ah yes, silly mistake by me!

Ok - I've been testing my code on the latest version and everything seems to work.

Thanks again for all your help!

richelbilderbeek commented 3 years ago

Thanks for testing and making me add a more helpful error message :+1: