gustavo-miranda closed 3 years ago
Hi Gustavo,
As the GATK code is currently written, it requires a single reference assembly to which all of your individual data have been aligned. So you cannot have the code work with data for multiple individuals aligned to multiple references.
Hope that helps clarify, -b
Hi Brant,
Yes, it does clarify. Thank you!
Gustavo
Hi Brant,
I am using the seqcap_pop scripts within some of the phyluce tools and I am having a problem with indel calling.
When calling indels, we should provide the path to the match-contigs-to-probes fasta file (-R) and to the merged bam file (-I). In my match-contigs-to-probes folder I have several fasta files, one for each individual of SpeciesA. In my merge-bams folder, I have a single bam file (and its index) with the bams of all specimens merged into it.
The problem is that when I set -R to path/to/match-contigs-to-probes/SpeciesA_specimen050.fasta, GATK only reads the information for SpeciesA_specimen050 in the bam file and gives me an error for all other specimens. Here is what the output looks like:
Is there a command that makes GATK read a file listing the paths to all of the fastas, so that the program runs all specimens at once?
Here are the commands I am using:

    java -Xmx2g -jar ~/anaconda/GenomeAnalysisTK-3.3-0/GenomeAnalysisTK.jar \
        -T RealignerTargetCreator \
        -R /path/to/4_match-contigs-to-probes/Genus_species.fasta \
        -I /path/to/7_merge-bams/Genus_species.bam \
        --minReadsAtLocus 7 \
        -o /path/to/8_GATK/Genus_species.intervals
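For checking paths before running anything, the command above can be wrapped in a small dry-run helper. This is only a sketch: the function name `realigner_target_cmd` and the `GATK_JAR` variable are hypothetical, the flags are copied unchanged from the command above, and it takes exactly one reference and one bam per call.

```shell
#!/usr/bin/env bash
# Hypothetical dry-run helper: prints (does not run) the
# RealignerTargetCreator call for one reference/bam pair.
# GATK_JAR and the function name are illustrative assumptions.
GATK_JAR="${GATK_JAR:-$HOME/anaconda/GenomeAnalysisTK-3.3-0/GenomeAnalysisTK.jar}"

realigner_target_cmd() {
    local reference="$1"   # single reference fasta (-R)
    local bam="$2"         # merged bam aligned to that reference (-I)
    local out="$3"         # intervals file to write (-o)
    # printf repeats the format for every argument, so this prints the
    # whole command on one line, space-separated.
    printf '%s ' java -Xmx2g -jar "$GATK_JAR" \
        -T RealignerTargetCreator \
        -R "$reference" \
        -I "$bam" \
        --minReadsAtLocus 7 \
        -o "$out"
    printf '\n'
}

realigner_target_cmd \
    /path/to/4_match-contigs-to-probes/Genus_species.fasta \
    /path/to/7_merge-bams/Genus_species.bam \
    /path/to/8_GATK/Genus_species.intervals
```

Printing the command instead of executing it makes it easy to confirm that the -R and -I paths point at the intended files before running it.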
This might be of interest to @mgharvey too.
Thank you. Gustavo