thierrygosselin / assigner

Population assignment analysis using R
http://thierrygosselin.github.io/assigner
GNU General Public License v3.0

Error: object "MARKERS" not found #11

Closed mfisher5 closed 6 years ago

mfisher5 commented 6 years ago

I am trying to re-run some analyses that I did a few months ago, using assigner with gsi_sim in RStudio on a Linux machine. After installing assigner and gsi_sim from within RStudio, I tried to run the function assignment_ngs and received this error message:

Error in mutate_impl(.data, dots) : Evaluation error: object 'MARKERS' not found.

I have tried removing and re-installing assigner and gsi_sim, as well as each required package. I have checked my genepop file to make sure that the error is not in my input file, and all seems fine. I'm at a bit of a loss as to what the error message means; any help is much appreciated! I attached the code that I am working with for reference.

Assigner_wGSIsim_linux.txt

thierrygosselin commented 6 years ago

Thanks for reporting the potential bug. I will have a look at the code tomorrow.

mfisher5 commented 6 years ago

What was the last version of R that was used to test assigner? Wondering if my recent upgrade has something to do with the error.

thierrygosselin commented 6 years ago

I checked your code:

##-- Load all necessary libraries after installation:
library(devtools)
library(assigner)
library(stackr) # may have to install from github
library(assigner)
library(reshape2)
library(ggplot2)
library(stringr)
library(stringi)
library(plyr)
library(dplyr) # load this package after plyr to work properly
library(tidyr)
library(readr)
library(adegenet)
library(randomForestSRC)
library(doParallel)
library(foreach)
library(purrr)
library(utils)
library(iterators)
############################################################################

# ------------------- Assignment Test: THL (Training, Hold-out, Leave One Out) Method w/ assignment_ngs ALL 9 POPULATIONS ---------------------- #
#
##-- set working directory
setwd("/mnt/hgfs/PCod-Korea-repo/analyses/Assignment")

##-- load in genepop file
all <- read.genepop("../../stacks_b8_verif/batch_8_filteredMAF_filteredIndivids30_filteredLoci_filteredHWE_filteredCR.gen")
summary(all)

##-- run assigner: ngs
system.time(assign.all.9pops <- assignment_ngs(data = all, 
                                               assignment.analysis = "gsi_sim", 
                                               sampling.method = "ranked",
                                               thl = 0.5, 
                                               iteration.method = 10,      #default = 10
                                               subsample = NULL,           #randomly select X individs from each pop to use for entire simulation (can be used to smooth out sample sizes)
                                               iteration.subsample = 1,    #default = 1
                                               marker.number = c(10,50,100,200,500,1000,2000,5000,"all"),
                                               common.markers = TRUE,
                                               #pop.levels = NULL,          #default = NULL, alphabetical / numerical. ordering of pops
                                               #pop.labels = NULL,          #default = NULL. to rename or combine your pops
                                               #pop.select = NULL,          #default = NULL. to select a subset of pops
                                               verbose = TRUE,             #default = false. get more info when f(x) is running
                                               folder = "verif_byreg_thl0-5", 
                                               filename = "assignment_all_byreg_thl0-5.txt",
                                               keep.gsi.files = TRUE,
                                               random.seed= NULL, 
                                               parallel.core = 5))

A couple of things to make this work smoothly:

Load assigner

library(assigner) # no need to load all the libraries

Set working directory

setwd("/mnt/hgfs/PCod-Korea-repo/analyses/Assignment")

If you read the documentation for assigner::assignment_ngs, you'll see which file formats are supported. A genepop file is fine, but not the way you imported it into R. I guess you used adegenet::read.genepop, which returns a genind object rather than the genepop file itself.
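To tell the two kinds of input apart before starting a long run, a quick check along these lines may help. This is only a sketch: is_file_path is a hypothetical helper written for this example, not part of assigner, and the temporary file is a stand-in for a real genepop file.

```r
# Hypothetical helper (not part of assigner): TRUE only when `data` is a
# single character string pointing to a file that exists on disk.
is_file_path <- function(data) {
  is.character(data) && length(data) == 1L && !is.na(data) && file.exists(data)
}

# Stand-in for a real genepop file, created only for this example.
tmp <- tempfile(fileext = ".gen")
writeLines("placeholder", tmp)

is_file_path(tmp)          # a path to an existing file
is_file_path(list(a = 1))  # an in-memory object, like the result of an import function
```

If the check fails for the value you pass as data, you are probably handing the function an object already imported into R instead of the file path.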

Instead, pass your genepop file directly as the input and use the strata argument to correctly name your 9 populations or sampling sites. How do you make a strata file? Check the function documentation, but in short it is a file with 2 columns, INDIVIDUALS and STRATA, where STRATA is any grouping you would like to test: sampling sites, discovered populations, etc. INDIVIDUALS must match what's in your genepop file.
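As a rough illustration of the two-column layout described above, a strata file could be written from R like this. The individual IDs, site names, and output file name are hypothetical, and the tab delimiter is an assumption, so check the function documentation for the exact format expected.

```r
# Hypothetical individual IDs and site names; replace them with the
# sample names used in your own genepop file.
strata <- data.frame(
  INDIVIDUALS = c("IND_001", "IND_002", "IND_003", "IND_004"),
  STRATA = c("SITE_A", "SITE_A", "SITE_B", "SITE_B"),
  stringsAsFactors = FALSE
)

# Write a delimited text file with a header row (tab delimiter assumed).
strata.file <- file.path(tempdir(), "strata.tsv")
write.table(strata, file = strata.file, sep = "\t",
            quote = FALSE, row.names = FALSE)

# Read it back to confirm the header and the number of individuals.
check <- read.delim(strata.file, stringsAsFactors = FALSE)
names(check)
nrow(check)
```

The path stored in strata.file is what you would then give to the strata argument.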

gen.file <- "../../stacks_b8_verif/batch_8_filteredMAF_filteredIndivids30_filteredLoci_filteredHWE_filteredCR.gen"

system.time(
  assign.all.9pops <- assigner::assignment_ngs(
    data = gen.file,
    strata = "PUT YOUR STRATA FILE HERE",
    assignment.analysis = "gsi_sim", 
    sampling.method = "ranked",
    thl = 0.5, iteration.method = 10, #default = 10
    subsample = NULL,                 #randomly select X individs from each pop to use for entire simulation (can be used to smooth out sample sizes)
    iteration.subsample = 1,          #default = 1
    marker.number = 200, # first test
    # marker.number = c(10,50,100,200,500,1000,2000,5000,"all"),
    common.markers = TRUE,
    verbose = TRUE,                    #default = false. get more info when f(x) is running
    folder = "verif_byreg_thl0-5", 
    filename = "assignment_all_byreg_thl0-5.txt",
    keep.gsi.files = TRUE,
    random.seed= NULL, 
    parallel.core = 5))

Let me know how it works, and re-open the issue if you still have a problem.

Thierry

mfisher5 commented 6 years ago

It works when I read the genepop file directly into the function, rather than as an adegenet genind object. Thank you!!