Open schnuffipc opened 8 years ago
Hello,
When you run the snipre_prep.R script, argument 7 is where you provide the outgroup.
FYI:
The following are the command-line arguments that must be provided with the snipre_prep.R script...
1) .vcf file
2) starting bee column number (positive integer) of the vcf file (1-based indexing)
3) ending bee column number (positive integer) of the vcf file (1-based indexing)
4) SnpEff file (for Format Specifications, see Part 2)
5) gff file (for Format Specifications, see Part 5)
6) Folder where output files from snipre_prep_bash.sh are located (path name)
7) nout : outgroup x 2
8) npop : population size x 2
Thanks, Sunny.
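For readers following along, the eight arguments listed above can be sketched as a small Python helper that assembles the command line. This is only an illustration under stated assumptions. It assumes the script is launched with `Rscript`, and that "x 2" means two chromosomes per diploid individual. The function names and the file names in the usage note are hypothetical.

```python
def chromosome_count(n_individuals, ploidy=2):
    """Sampled chromosomes for n individuals.

    Assumption: 'outgroup x 2' and 'population size x 2' mean the number of
    individuals times two, i.e. diploid samples.
    """
    if n_individuals < 1:
        raise ValueError("need at least one individual")
    return n_individuals * ploidy


def build_snipre_prep_args(vcf, first_col, last_col, snpeff, gff,
                           bash_out_dir, n_outgroup, n_population):
    """Return the argument vector in the order given in the list above.

    Hypothetical helper; launching through Rscript is an assumption.
    """
    # Arguments 2 and 3 are 1-based VCF column numbers, start <= end.
    if not (1 <= first_col <= last_col):
        raise ValueError("column numbers must satisfy 1 <= first <= last")
    return [
        "Rscript", "snipre_prep.R",
        vcf,                                  # 1) .vcf file
        str(first_col),                       # 2) starting bee column
        str(last_col),                        # 3) ending bee column
        snpeff,                               # 4) SnpEff file
        gff,                                  # 5) gff file
        bash_out_dir,                         # 6) snipre_prep_bash.sh output folder
        str(chromosome_count(n_outgroup)),    # 7) nout
        str(chromosome_count(n_population)),  # 8) npop
    ]
```

For example, `build_snipre_prep_args("pop.vcf", 10, 19, "snpeff.txt", "genes.gff", "prep_out/", 1, 10)` would end with `"2", "20"` for nout and npop. Note that nout and npop are plain numbers here; they carry no genotype data themselves.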
Hi,
I have read the list of inputs thoroughly (it is basically the only existing documentation), but it doesn't help me understand in what form the outgroup is given. "Outgroup x 2" doesn't really tell me anything, and it cannot possibly be enough for the program's analysis: it is just a number. Where does the program get the SNP information for the outgroup? Is it in a .bam file? Is it part of the .vcf file?
Thanks, Pnina
Hi Sunny,
Let me join in to try to explain why we're not understanding you. We have whole-genome sequencing data from a population sample of species A, and we want to use another closely related species B as the outgroup. We have mapped our population sequence data to the reference genome of species A, and we have a VCF file with the SNP genotypes of these individuals. We also have a single genome sequenced from species B. If we give the VCF to snipre_prep.R, it will only contain the polymorphism information within species A. We need to somehow also supply the information about the differences between species A and species B. We don't understand how to give this information to your script.
My guess is that you used this script with population sequence data from species A that was mapped to the reference genome of species B. So the resulting VCF file contains both within species and between species information. Is that correct? This wasn't clear to us from your documentation.
Thanks, Eyal
Hello Eyal and Pnina,
As Eyal correctly figured out, my scripts WERE used with population data from species A mapped to the reference genome of species B, and the VCF file reflected this. We had another VCF file for data from species B mapped to the reference genome of species A. I am sorry that my documentation is not clear on this matter.
In my scripts, the outgroup (nout) and population size (npop) are just numbers, which I feed into the scripts written by Kirsten Eilertson (the actual SnIPRE program, available here: https://bustamantelab.stanford.edu/software).
Thank you, Sunny.
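To make the point above concrete, here is a minimal Python sketch of why a VCF built by mapping species A reads to the species B reference carries both kinds of information. A site where every sample is homozygous for the same non-reference allele is a fixed difference from the reference species, while a site with more than one allele among the samples is polymorphic within the population. This is an illustration of the logic only, not a description of what snipre_prep.R does internally. The function names are hypothetical, and the column arguments mirror the 1-based start and end columns from the argument list.

```python
from collections import Counter


def classify_site(genotypes):
    """Classify one VCF record from its sample fields (e.g. '0/1:35')."""
    alleles = set()
    for sample in genotypes:
        call = sample.split(":")[0]           # GT is the first subfield
        for allele in call.replace("|", "/").split("/"):
            if allele != ".":                 # skip missing calls
                alleles.add(allele)
    if not alleles:
        return "missing"
    if alleles == {"0"}:
        return "invariant"                    # everyone matches the reference
    if len(alleles) == 1:
        return "fixed_difference"             # everyone carries the same ALT
    return "polymorphic"                      # variation within the sample


def count_sites(vcf_lines, first_col, last_col):
    """Tally site classes over the sample columns first_col..last_col
    (1-based, inclusive)."""
    counts = Counter()
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue                          # header or blank line
        fields = line.rstrip("\n").split("\t")
        counts[classify_site(fields[first_col - 1:last_col])] += 1
    return counts
```

With a VCF mapped to the population's own reference (species A), the `fixed_difference` class would say nothing about species B, which is the gap Eyal describes.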
I couldn't find any reference to the outgroup other than "7) nout : outgroup x 2 (positive integer)". Say I am using one outgroup: how is your program going to see it? Should it be a .bam file, or should it be in the .vcf? Thanks in advance.