Closed npsonis closed 2 years ago
Hmm, that's odd. Are you sure you're using the option correctly? It expects a filename as argument, like -p my_plink_file_prefix. You can check the online help using pileupCaller --help. Have I perhaps made a mistake in the documentation somewhere?
Yes my filename is POP. It works with -e POP.
Could you please post here the output of pileupCaller --help?
Usage: pileupCaller [--version]
(--randomHaploid | --majorityCall [--downSampling] |
--randomDiploid) [--keepIncongruentReads]
[--seed
samtools mpileup -B -q30 -Q30 -l <BED_FILE> -R -f <FASTA_REFERENCE_FILE>
Sample1.bam Sample2.bam Sample3.bam | pileupCaller ...
You can lookup what these options do in the samtools documentation. Note that flag -B in samtools is very important to reduce reference bias in low coverage data.
This tool is part of sequenceTools version 1.5.1
Available options:
--version Print version and exit
-h,--help Show this help text
--randomHaploid This method samples one read at random at each site,
and uses the allele on that read as the one for the
actual genotype. This results in a haploid call
--majorityCall Pick the allele supported by the most reads at a
site. If an equal numbers of alleles fulfil this,
pick one at random. This results in a haploid call.
See --downSampling for best practices for calling
rare variants
--downSampling When this switch is given, the MajorityCalling mode
will downsample from the total number of reads a
number of reads (without replacement) equal to the
--minDepth given. This mitigates reference bias in
the MajorityCalling model, which increases with
higher coverage. The recommendation for rare-allele
calling is --majorityCall --downsampling --minDepth 3
--randomDiploid Sample two reads at random (without replacement) at
each site and represent the individual by a diploid
genotype constructed from those two random picks.
This will always assign missing data to positions
where only one read is present, even if minDepth=1.
The main use case for this option is for estimating
mean heterozygosity across sites.
--keepIncongruentReads By default, pileupCaller now removes reads with
tri-allelic alleles that are neither of the two
alleles specified in the SNP file. To keep those
reads for sampling, set this flag. With this option
given, if the sampled read has a tri-allelic allele
that is neither of the two given alleles in the SNP
file, a missing genotype is generated. IMPORTANT
NOTE: The default behaviour has changed in
pileupCaller version 1.4.0. If you want to emulate
the previous behaviour, use this flag. I recommend
now to NOT set this flag and use the new behaviour.
--seed
OK, the issue is that the -e flag means --eigenstratOut, while the -p flag means --plinkOut. You can't have both of these at the same time. If you run it without the -e POP flag, the command above should work.
The following command should work with the testData provided in the repository:
cd test/testDat
pileupCaller --sampleNames 12880A,12881A,12883A,12885A --randomHaploid --singleStrandMode -f 1240k_eigenstrat_snp_short.snp.txt -p testOut < AncientBritish.short.pileup.txt
Sorry for reopening the thread, but the problem remains. This works:
pileupCaller --randomHaploid --sampleNameFile Kinship_IDs.txt --samplePopName POP -e POP -f 1240K.snp < 1240K.pileup
This doesn't:
pileupCaller --randomHaploid --sampleNameFile Kinship_IDs.txt --samplePopName POP -p POP -f 1240K.snp < 1240K.pileup
I copied the command and just changed the letter e to p, to make sure that no typing error exists.
The error remains the same as I wrote previously: Invalid option `-p' Did you mean one of these? -h -d -f -e
However, this does work properly, as you proposed:
cd test/testDat
pileupCaller --sampleNames 12880A,12881A,12883A,12885A --randomHaploid --singleStrandMode -f 1240k_eigenstrat_snp_short.snp.txt -p testOut < AncientBritish.short.pileup.txt
EDIT!!! I just figured out that if I change the order of the arguments, it works:
pileupCaller --randomHaploid --sampleNameFile Kinship_IDs.txt -p POP -f 1240K.snp --samplePopName POP < 1240K.pileup
So, the problem is with --samplePopName preceding -p.
Yes, I understand now. Indeed, there is an implied order in these arguments, because --samplePopName only makes sense with either -p or -e set. I agree this is a bit weird and opaque. I can easily fix that. I'm reopening this issue to remind myself.
This is fixed in v1.5.2
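For anyone still on v1.5.1, the ordering workaround reported above can be sketched as a small shell helper. This is only an illustration: the function name build_pileupcaller_cmd is hypothetical and not part of sequenceTools, and the file names (Kinship_IDs.txt, 1240K.snp) are simply the ones used earlier in this thread. It prints the command line rather than running it, with the output flag (-e or -p) placed before --samplePopName.

```shell
#!/bin/sh
# Hypothetical helper (not part of sequenceTools): prints a pileupCaller
# command line with the output flag placed before --samplePopName,
# which is the argument order reported to work in v1.5.1.
build_pileupcaller_cmd() {
    out_flag=$1   # -e (--eigenstratOut) or -p (--plinkOut); only one may be given
    prefix=$2     # output file prefix
    pop=$3        # population name passed to --samplePopName

    case "$out_flag" in
        -e|-p) ;;
        *) echo "output flag must be -e or -p" >&2; return 1 ;;
    esac

    # File names below are the ones from this thread; adjust to your data.
    printf '%s\n' "pileupCaller --randomHaploid --sampleNameFile Kinship_IDs.txt $out_flag $prefix -f 1240K.snp --samplePopName $pop"
}

# Example: print the PLINK variant of the command.
build_pileupcaller_cmd -p POP POP
```

The printed command still needs the pileup on standard input, for example by appending < 1240K.pileup when running it by hand.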
Hi again,
I am running pileupCaller with -p POP enabled, but I get the following error:
Invalid option `-p' Did you mean one of these? -h -d -f -e
The same if I use: --plinkOut POP
It works with -e. If neither is used I get: Missing: (-e|--eigenstratOut)
Perhaps -p is not implemented yet?
Version: 1.5.1