Open DiedeMaas90 opened 4 years ago
It is not something I have considered. May I ask in what situation this could be useful?
I'm using RADseq data, and I was thinking that using multiple SNPs per RAD-tag would perhaps over-emphasize effects. I saw that in STACKS there is also an option of using only one SNP per tag. You also then avoid analysing SNPs that are probably linked, right?
Hi @DiedeMaas90
I'm also using ANGSD with RADseq and other data types with many small contigs. A fairly simple workaround is to run whatever analysis in a first pass, look at the output, thin down that output to one locus per contig, then use that as an input file to reanalyse.
For example, the -doMaf output looks something like this:
chromo position major minor ref anc knownEM unknownEM nInd
contig00030 1127 A C A A 0.097012 0.034447 6
contig00030 1222 C T C C 0.488717 0.488718 5
You could extract the first two columns, then use whatever data wrangling language to select only one row per chromo (or apply some other filtering strategy), then use ./angsd sites index sites.file to index that file, which should be in the format:
contig0030 1127
contig0035 403
contig0052 473
You can then pass this to the initial ANGSD call by including -sites sites.file, which causes ANGSD to only consider the input sites.
See more detail on how ANGSD uses input filters here: http://www.popgen.dk/angsd/index.php/Sites
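The thinning step could be sketched roughly as below, in Python. This is only an illustration of the workaround, not part of ANGSD: the function name thin_sites, its pick and seed parameters, and the tab separator in the output are my own assumptions, and the input is the two-row example from above rather than a real output file.

```python
import random


def thin_sites(lines, pick="first", seed=0):
    """Return one (chromo, position) pair per contig.

    `lines` is any iterable of whitespace-separated rows in the
    -doMaf style shown above. `pick` is "first" or "random"
    (hypothetical option names chosen for this sketch).
    """
    rng = random.Random(seed)  # fixed seed so a random pick is repeatable
    by_contig = {}  # dicts keep insertion order, so contig order is preserved
    for line in lines:
        fields = line.split()
        # Skip blank lines and the header row.
        if len(fields) < 2 or fields[0] == "chromo":
            continue
        by_contig.setdefault(fields[0], []).append(fields[1])

    chosen = []
    for chromo, positions in by_contig.items():
        pos = positions[0] if pick == "first" else rng.choice(positions)
        chosen.append((chromo, pos))
    return chosen


example = """chromo position major minor ref anc knownEM unknownEM nInd
contig00030 1127 A C A A 0.097012 0.034447 6
contig00030 1222 C T C C 0.488717 0.488718 5
""".splitlines()

sites = thin_sites(example)

# Assumption: contig and position separated by a tab, one site per line.
sites_text = "".join(f"{chromo}\t{pos}\n" for chromo, pos in sites)
print(sites_text, end="")
```

For real data you would pass the lines of your own first-pass output instead of example (if that file is gzip-compressed, Python's gzip.open in text mode can read it), save the printed text as sites.file, and then index it with ./angsd sites index sites.file as described above.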
Hi,
I have a potentially simple question: is it possible to select for a single SNP per contig within ANGSD? For example the first one or a random one? I cannot seem to find the answer online.
Cheers, Diede