Closed Tom-Jenkins closed 6 years ago
Hi Tom, it's very difficult to reproduce without the dataset... can you send me your genind object?
Best Thierry
Hi Thierry,
It's the code below using the file I sent you before:
vcf = read.vcfR("stacks_snpsaurus_seafan_relaxed.vcf") vcf
seafan = vcfR2genind(vcf)
indNames(seafan) = unlist(strsplit(indNames(seafan), split="_sorted.bam")) indNames(seafan)
seafan$pop = as.factor(substr(indNames(seafan), 1, 3)) seafan$pop
convert = genomic_converter(seafan, output=c("genepop","bayescan"))
####################################################################### ##################### radiator::genomic_converter ##################### ####################################################################### Function arguments and values: Working directory: C:/Users/tj248/Dropbox/R nextRAD Radiator Analysis Input file: from global environment Strata: strata_labels.tsv Population levels: no Population labels: no Output format(s): tidy, genepop, bayescan Filename prefix: no Filters: Blacklist of individuals: no Blacklist of genotypes: no Whitelist of markers: no monomorphic.out: TRUE snp.ld: no common.markers: TRUE max.marker: no pop.select: no maf.thresholds: no
Imputations options: imputation.method: no
parallel.core: 3
#######################################################################
Importing data
Error in mutate_impl(.data, dots) : Evaluation error: object 'MARKERS' not found.
Best, Tom
From: Thierry Gosselin notifications@github.com Sent: 24 January 2018 16:31 To: thierrygosselin/radiator Cc: Jenkins, Tom; Author Subject: Re: [thierrygosselin/radiator] Importing genind object using genomic_converter (#16)
Hi Tom, it's very difficult to reproduce without the dataset... can you send me your genind object?
Best Thierry
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHubhttps://github.com/thierrygosselin/radiator/issues/16#issuecomment-360191947, or mute the threadhttps://github.com/notifications/unsubscribe-auth/AUA6s409mrisGpKs2UgwYgYuiOXXjm6fks5tN1rOgaJpZM4Rpm9L.
Oh ok I didn't realize it was this one. On it.
Hi Tom,
Found the problem. Part of it is mine and is fixed in the latest commit, but there is more to it.
bug found
I was not accounting correctly on how the genind object was made via vcfR::vcfR2genind
.
I'm a big fan of vcfR
, and I'm actually using it inside my functions to read the vcf files when the metadata are needed (e.g. GL, PL, read depth info, etc). However, I'm using pegas
when no metadata are needed, because pegas
is a lot faster than vcfR
(but no metadata can be read...).
Other problem
The way you are using vcfR
to generate the genind object generates some downstream problems inside radiator
.
The most obvious one I find is that the markers inside the genind object is building only with the ID
column of the vcf file. This is problematic because you should see those as locus info and they have duplicates (when more than 1 SNP is found on the read). So those are not unique numbers and using this directly inside radiator generates genepop and bayescan files with 2547 markers instead of 6386.
What you want is to combine the CHROM
+ POS
and ID
info in the vcf file to generate the unique marker info. This is what I actually do if you were to use this command to generate your output files:
test <- radiator::genomic_converter(data = "stacks_snpsaurus_seafan_relaxed.vcf", strata = "strata_labels.txt", output = c("genind", "genepop", "bayescan"))
Not convince ?
Try this seafan@loc.fac
and see what you get.
Then try this test$genind.no.imputation@loc.fac
.
FYI: The ID
column in vcf files produced by stacks
since v.1.44 have the LOCUS info with the position of the SNPs on the read/locus separated by _
. The position info is the COL
column in the sumstats file produced by the populations module.
Hope this help Thierry
Hi Thierry,
I've done some analysis in Adegenet. My Genind object contains 93 individuals, 9 populations and 6,386 loci:
/// GENIND OBJECT /////////
// 93 individuals; 6,386 loci; 12,772 alleles; size: 7.4 Mb
// Basic content @tab: 93 x 12772 matrix of allele counts @loc.n.all: number of alleles per locus (range: 2-2) @loc.fac: locus factor for the 12772 columns of @tab @all.names: list of allele names for each locus @ploidy: ploidy of each individual (range: 2-2) @type: codom @call: adegenet::df2genind(X = t(x), sep = sep)
// Optional content @pop: population of each individual (group size range: 8-18)
I want to use the
genomic_converter
function to convert and export the Genind object into genepop and bayescan files. My understanding of the function is that it accepts Genind data from the global environment, however, I keep getting the error below?convert = genomic_converter(seafan, output=c("genepop","bayescan"))
####################################################################### ##################### radiator::genomic_converter ##################### #######################################################################
Function arguments and values: Working directory: C:/Users/tj248/OneDrive - University of Exeter/Exeter University/PhD Project Documents/Pink Sea Fan/NextRAD Project 2017/nextRAD SNP Data Analysis Input file: from global environment Strata: no Population levels: no Population labels: no Output format(s): tidy, genepop, bayescan Filename prefix: no Filters: Blacklist of individuals: no Blacklist of genotypes: no Whitelist of markers: no monomorphic.out: TRUE snp.ld: no common.markers: TRUE max.marker: no pop.select: no maf.thresholds: no
Imputations options:
imputation.method: no
parallel.core: 3
#######################################################################
Importing data
Error in mutate_impl(.data, dots) : Evaluation error: object 'MARKERS' not found.