eriqande / rubias

identifying and reducing bias in hierarchical GSI

error in using pi_prior #33

Open AlbertoAbreu opened 1 year ago

AlbertoAbreu commented 1 year ago

I am running rubias on a mitochondrial DNA dataset and have coded it as instructed to specify a haploid system. The program runs perfectly with a reference set of 23 collections (for this trial I am treating each collection as its own repunit) when run against a mixture file containing 20 collections, to evaluate the contribution of each of the 23 reference populations to every mixture.

I would now like to run rubias with a pi_prior that reflects the different population sizes of the reference collections. I assumed this could be accomplished by setting up a tibble with two columns: collection (the names of my 23 collections as used in my data file) and pi_param (the abundance value for each collection):

tibble [23 x 2] (S3: tbl_df/tbl/data.frame)
 $ collection: chr [1:23] "AG" "BBL" "BBW" "BRB" ...
 $ pi_param  : int [1:23] 203 1504 150 1345 1222 20 25 130 100 15 ...

However, when I try to run infer_mixture, I get this error:

Scoring locus CR as haploid
Scoring locus CR as haploid
Collating data; compiling reference allele frequencies, etc. time: 0.11 seconds
Computing reference locus specific means and variances for computing mixture z-scores time: 0.01 seconds
Working on mixture collection: ASCfg with 22 individuals
Joining, by = "collection"
Error in vec_assign():
! Can't convert from replace$pi_param to data$pi_param due to loss of precision.

I would be grateful for any help in solving my issue.

best, Alberto

eriqande commented 1 year ago

Hello Alberto,

I have tested this a little, and the error appears to come from tidyr::replace_na(), which refuses to replace numeric values with integer ones.

In the long term, I will add a line of code to the relevant function to coerce any integer inputs to numeric. In the short term, you should be able to solve your problem by making sure that the object you pass to the pi_prior option is a tibble or data frame whose pi_param column is numeric rather than integer. For example:

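library(dplyr)  # provides %>% and mutate()
# coerce the integer pi_param column to numeric (double)
my_priors <- my_priors %>%
    mutate(pi_param = as.numeric(pi_param))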
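To check that the conversion took effect before calling infer_mixture again, something like the following minimal sketch may help. It rebuilds only the first four rows shown in the str() output above (the real table has 23), and the object names in the final commented call (my_reference, my_mixture, and the gen_start_col value) are hypothetical placeholders, not taken from the report.

```r
library(tibble)
library(dplyr)

# Illustrative subset: first four collections and abundances from the
# str() output in the report. The full table has 23 rows.
my_priors <- tibble(
  collection = c("AG", "BBL", "BBW", "BRB"),
  pi_param   = c(203L, 1504L, 150L, 1345L)  # stored as integer, as reported
)

# Coerce the integer column to double, as suggested above.
my_priors <- my_priors %>%
  mutate(pi_param = as.numeric(pi_param))

# Confirm the storage type before passing the table to rubias.
stopifnot(is.double(my_priors$pi_param))

# Hypothetical call; my_reference, my_mixture, and the gen_start_col value
# are placeholders for your own data:
# res <- infer_mixture(reference = my_reference, mixture = my_mixture,
#                      gen_start_col = 5, pi_prior = my_priors)
```

If the stopifnot() line passes silently, the pi_param column is stored as double.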

Cheers,

eric