Closed denisroy1 closed 6 years ago
Hi, Genepopedit tends to like Genepop files with the loci in a vertical column under your heading, as opposed to flat across the file. I ran your input file through PGDSpider from Genepop to Genepop and it worked with Genepopedit. I've attached the fixed file here.
Note that you want keep=T in subset_genepop if you want to keep only the loci in the vector you supplied. When I set keep=T I noticed that 'loci_2201' is not in your Genepop file. If you instead want those ~45 loci removed, set keep=F. Also, it is best not to read a Genepop file in with read.table as you did; pass the file path and let genepopedit's functions read the file in their own way. Here is my script for your file:
setwd("to whatever you want")
library(genepopedit)
subloci<-c("loci_18", "loci_19", "loci_285", "loci_431", "loci_492", "loci_629", "loci_697", "loci_869", "loci_949", "loci_1072", "loci_1124", "loci_1135", "loci_1294", "loci_1295", "loci_1670", "loci_1671", "loci_1860", "loci_1881", "loci_2056" , "loci_2220", "loci_2250", "loci_2269", "loci_2357", "loci_2437" , "loci_2588", "loci_2755", "loci_2756", "loci_2945", "loci_2948", "loci_2995" , "loci_3105", "loci_3267", "loci_3306", "loci_3333", "loci_3356", "loci_3498" , "loci_3510", "loci_3616", "loci_3692", "loci_3693", "loci_3703", "loci_3766" , "loci_3796", "loci_3820", "loci_3848")
loci<-genepop_detective("NewGP_Fixed.txt",variable="Loci")
setdiff(subloci,loci)
subset_genepop(genepop = "NewGP_Fixed.txt",subs=subloci,keep = T,path="C:/Users/JefferyN/Desktop/NewGP2.txt")
Hopefully that helps!
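The setdiff() check in the script above can be tried on toy data first. This is a minimal sketch in base R only: the `available` vector is made up and merely stands in for whatever genepop_detective(..., variable="Loci") returns for a real file, and the variable names are hypothetical.

```r
# Toy data: 'available' stands in for the loci names read from a Genepop file.
# These values are invented for illustration.
available <- c("loci_18", "loci_19", "loci_285")
requested <- c("loci_18", "loci_2201", "loci_285")

# Loci that were requested but are absent from the file.
missing_loci <- setdiff(requested, available)

# Restrict the request to loci that actually exist before subsetting.
valid_loci <- intersect(requested, available)

print(missing_loci)
print(valid_loci)
```

Passing only `valid_loci` on to the subsetting step is one way to avoid asking for a locus such as 'loci_2201' that the file does not contain.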
Hi Nick,
Thanks for looking at this for me so quickly and resolving the issue. If I process the file in PGDSpider using genepop->genepop I can enter and use genepopedit. This is great. The only issue I still have is trying to write the output to the "GENETIX" file format using the PGDspideR cmd.
PGDspideR(input = nfn, input_format="GENEPOP", output = ffn, output_format="GENETIX", spid="/Users/Denis/Documents/genpop-genetix.spid", where.pgdspider="/Genetic Programs/PGDSpider_2.1.0.2/")
I've triple-quadruple checked and these are the correct paths for my files. The error messages are below:
I guess I could do it by hand using the PGDSpider interface, but I have 120 files to process and it would be better to push them through the program. This may be a lost cause, and I'll just have to bite the bullet.
In any case, thanks for getting to work for me, this is invaluable.
Best cheers!
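For the 120-file case mentioned above, a loop over a folder is one possible approach. The following is only a sketch under assumptions: the helper names `build_jobs` and `convert_all`, the ".txt" input pattern, and the ".gtx" output extension are illustrative choices, and the PGDspideR() arguments are simply the ones used in the call quoted in this thread.

```r
# Build one input/output path pair per Genepop file in a folder.
# 'in_dir', 'out_dir' and the ".txt" pattern are assumptions for illustration.
build_jobs <- function(in_dir, out_dir, pattern = "\\.txt$") {
  inputs <- list.files(in_dir, pattern = pattern, full.names = TRUE)
  if (length(inputs) == 0) stop("no input files found in ", in_dir)
  stems <- tools::file_path_sans_ext(basename(inputs))
  data.frame(
    input = inputs,
    output = file.path(out_dir, paste0(stems, ".gtx")),
    stringsAsFactors = FALSE
  )
}

# Convert every job; a failure on one file is reported and the loop continues.
# Requires genepopedit and a working PGDSpider install; not called here.
convert_all <- function(jobs, spid, where.pgdspider) {
  for (i in seq_len(nrow(jobs))) {
    tryCatch(
      genepopedit::PGDspideR(
        input = jobs$input[i], input_format = "GENEPOP",
        output = jobs$output[i], output_format = "GENETIX",
        spid = spid, where.pgdspider = where.pgdspider
      ),
      error = function(e) {
        message("Failed: ", jobs$input[i], " (", conditionMessage(e), ")")
      }
    )
  }
  invisible(jobs)
}
```

Building the table of paths first makes it easy to inspect every input and output name before anything is handed to PGDSpider.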
Hi Denis,
PGDspideR in genepopedit worked for me using the new Genepop file I sent you. It is best to type out the full paths for both your input and output files rather than using a shortcut (what you're calling nfn and ffn), which might be the problem. I'm not sure which operating system you're using, but the following script works for me. I'll re-attach the fixed Genepop file. I assume your .spid file was made in PGDSpider and just has the SNP options selected? Also, your output should have a .gtx extension for the GENETIX format.
setwd("C:/Users/JefferyN/Desktop")
library(genepopedit)
PGDspideR(input = "NewGP_Fixed.txt",input_format = "GENEPOP",output = "C:/Users/JefferyN/Desktop/Genetix_Input.gtx",output_format = "GENETIX",spid = "C:/Users/JefferyN/Documents/GP_GTX.spid",where.pgdspider = "C:/Users/JefferyN/Documents/Programs/PGDSpider_2.0.8.3/")
Hope that helps, good luck with your work.
Adding to Nick's comments, I think the issue might be incomplete file paths. Try:
PGDspideR(input = nfn, input_format="GENEPOP", output = ffn, output_format="GENETIX", spid="C:/Users/Denis/Documents/genpop-genetix.spid", where.pgdspider="C:/Genetic Programs/PGDSpider_2.1.0.2/")
Note that I am not sure where the folder 'Genetic Programs' is. If it is in C:/Users/Denis ... you will want to change that in the code. If errors persist after this change, then we can see whether it is truly a permissions error or something else. Not finding the .spid file might cause the remaining cascade of errors. Also, if you can rename the folder 'Genetic Programs' to 'Genetic_Programs' it will probably help. CMD commands can fail when file paths contain spaces.
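Both suggestions (full paths that really exist, and no spaces in them) can be checked from R before calling PGDspideR. This is a rough sketch using base R only; `check_paths` is a hypothetical helper name, not part of genepopedit, and the example paths in the comment are just the ones quoted in this thread.

```r
# Report, for each path, whether it exists on disk and whether it
# contains a space (which can trip up command-line calls).
check_paths <- function(paths) {
  data.frame(
    path = paths,
    exists = file.exists(paths),
    has_space = grepl(" ", paths, fixed = TRUE),
    stringsAsFactors = FALSE
  )
}

# Example call (paths taken from the thread; adjust to your machine):
# check_paths(c("C:/Users/Denis/Documents/genpop-genetix.spid",
#               "C:/Genetic Programs/PGDSpider_2.1.0.2/"))
```

Any row with exists = FALSE or has_space = TRUE points at a path worth fixing before looking for a permissions problem.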
Hi, I'm trying to use these scripts to filter out specific SNPs from a series of ~ 4000. I have a genepop file formatted as follows:
fcsnp10_99.txt
The scripts seemingly read in the data correctly, but clearly something is wrong with the data format, as they cannot read the PopNames, PopCounts, or LociNames correctly. If you have some time, would you mind taking a look at the file and script? Thanks a lot for any help. The script I used is below:
rm(list=ls())
# re-setting all instances
library(genepopedit)
setwd("path to my file")
infile<-file.choose()
outd<-getwd()
gp210_99 <- read.table(infile,sep="\t",quote="",stringsAsFactors=FALSE)
PopNames <- genepop_detective(gp210_99,variable="Pops")
PopCounts <- genepop_detective(gp210_99, variable="PopNum")
LociNames <- genepop_detective(gp210_99,variable="Loci")
subloci<-c("loci_18", "loci_19", "loci_285", "loci_431", "loci_492", "loci_629", "loci_697", "loci_869", "loci_949", "loci_1072", "loci_1124", "loci_1135", "loci_1294", "loci_1295", "loci_1670", "loci_1671", "loci_1860", "loci_1881", "loci_2056" , "loci_2201", "loci_2220", "loci_2250", "loci_2269", "loci_2357", "loci_2437" , "loci_2588", "loci_2755", "loci_2756", "loci_2945", "loci_2948", "loci_2995" , "loci_3105", "loci_3267", "loci_3306", "loci_3333", "loci_3356", "loci_3498" , "loci_3510", "loci_3616", "loci_3692", "loci_3693", "loci_3703", "loci_3766" , "loci_3796", "loci_3820", "loci_3848")
subset_genepop(genepop=gp210_99, keep = F, subs = subloci, path = paste0(outd,"/","fcsnp10_99_neut.txt"))