Open yubau1112 opened 5 years ago
I think
[bwt_restore_sa] SA-BWT inconsistency: primary is not the same. Abort!
is your main issue. I think this is because you are missing .fna and/or proper bwa index.
I think
[bwt_restore_sa] SA-BWT inconsistency: primary is not the same. Abort!
is your main issue. I think this is because you are missing .fna and/or proper bwa index.
Hi, @tmhmxg59 thanks your reply yes, I think so
/local_work/bin/bwa-0.7.17/bwa mem -t $NCORE "$BIN/../data/chr6/hg38.chr6.fna" "${OUT}.1.fq" "${OUT}.2.fq" | /local_work/bin/samtools-1.9/samtools view -b - | /local_work/bin/xHLA/bin/sambamba sort -m $MEM -t $NCORE -o "${OUT}.full.bam" /dev/stdin
$BIN/../data/chr6/hg38.chr6.fna
xHLA does not provide hg38.chr6.fna file
so this hg38.chr6.fna file require download by myself?
I used iGeome provided human chromosome 6 reference
https://support.illumina.com/sequencing/sequencing_software/igenome.html
ftp://igenome:G3nom3s4u@ussd-ftp.illumina.com/Homo_sapiens/NCBI/GRCh38/Homo_sapiens_NCBI_GRCh38.tar.gz
And used bwa index human chromosome 6 reference
And then change get-reads-alt-unmap.sh file bwa reference path.
/local_work/bin/bwa-0.7.17/bwa mem -t $NCORE "/home/yubau/human_reference_genome/BWAIndex_CHR6/chr6.fa" "${OUT}.1.fq" "${OUT}.2.fq" | /local_work/bin/samtools-1.9/samtools view -b - | /local_work/bin/xHLA/bin/sambamba sort -m $MEM -t $NCORE -o "${OUT}.full.bam" /dev/stdin
Can work! results like the following message:
[M::bwa_idx_load_from_disk] read 0 ALT contigs [M::process] read 20378 sequences (3058734 bp)... [M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (4, 6867, 0, 1) [M::mem_pestat] skip orientation FF as there are not enough pairs [M::mem_pestat] analyzing insert size distribution for orientation FR... [M::mem_pestat] (25, 50, 75) percentile: (144, 184, 235) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 417) [M::mem_pestat] mean and std.dev: (193.09, 68.00) [M::mem_pestat] low and high boundaries for proper pairs: (1, 508) [M::mem_pestat] skip orientation RF as there are not enough pairs [M::mem_pestat] skip orientation RR as there are not enough pairs [M::mem_process_seqs] Processed 20378 reads in 6.224 CPU sec, 0.528 real sec [main] Version: 0.7.17-r1188 [main] CMD: /local_work/bin/bwa-0.7.17/bwa mem -t 12 /home/yubau/human_reference_genome/BWAIndex_CHR6/chr6.fa /home/yubau/HLA-B16_S14_BWAmem.sort.xHLA.bam.1.fq /home/yubau/HLA-B16_S14_BWAmem.sort.xHLA.bam.2.fq [main] Real time: 1.187 sec; CPU: 6.462 sec
but still get WARNING like
*****WARNING: Query M04622:111:000000000-G3CWM:1:2103:7820:5734 is marked as paired, but its mate does not occur next to it in your BAM file. Skipping.
I get those warnings too. I'm not sure if they are significant or not. I tried resorting and removing unpaired reads (not sure if I was successful), but the warnings still appeared. It didn't seem to affect the outcome.
I got the [bwt_restore_sa] SA-BWT inconsistency: primary is not the same. Abort!
error message too and figured out that my problem was that the docker container on https://hub.docker.com/r/humanlongevity/hla does not contain the full reference set for bwa, the file hg38.chr6.fna.bwt
was truncated. Also the image contains an outdated version of the reads-alt-unmap.sh
script.
Since I needed a singularity image for the cluster I am working at I converted the docker container to singularity and updated the two files in the process. Here is the singularity recipe I used: xHLA.srecipe.txt
Hi,
1.I used
NCBI Human Genome Resources Reference Genome Sequence https://www.ncbi.nlm.nih.gov/projects/genome/guide/human/
GRCh38 GRCh38_latest_genomic.fna.gz fasta file ftp://ftp.ncbi.nlm.nih.gov/refseq/H_sapiens/annotation/GRCh38_latest/refseq_identifiers/GRCh38_latest_genomic.fna.gz
as BWA Reference Genome instruction : bwa index ref.fa
Program: bwa (alignment via Burrows-Wheeler transformation) Version: 0.7.17-r1188
than I used BWA to do sequence alignment (my HLA NGS data fastq is pair-end) instruction : bwa mem ref.fa read1.fq read2.fq > aln-pe.sam
than used samtools to sort and Index
instruction : samtools view -b -S aln-pe.sam > aln-pe.bam samtools sort -m 4096M -o aln-pe.sorted.sam -@ 12 aln-pe.sam samtools index aln-pe.sorted.sam
samtools --version samtools 1.9 Using htslib 1.9 Copyright (C) 2018 Genome Research Ltd.
than used xHLA bin/get-reads-alt-unmap.sh to convert bam file, but I get results like follow messages:
[yubau@CNUH-i7 bin]$ get-reads-alt-unmap.sh aln-pe.sorted.bam aln-pe.sorted_xHLA_getreasds.sam NCORE=12 MEM=187GB
sambamba 0.6.8 by Artem Tarasov and Pjotr Prins (C) 2012-2018 LDC 1.10.0 / DMD v2.080.1 / LLVM6.0.1 / bootstrap LDC - the LLVM D compiler (0.17.4) sambamba 0.6.8 by Artem Tarasov and Pjotr Prins (C) 2012-2018 LDC 1.10.0 / DMD v2.080.1 / LLVM6.0.1 / bootstrap LDC - the LLVM D compiler (0.17.4)
[ ]Writing sorted chunks to temporary directory... [=======================================================================================================================================]=========] WARNING: Query M04622:111:000000000-G3CWM:1:1101:12467:2013 is marked as paired, but its mate does not occur next to it in your BAM file. Skipping. WARNING: Query M04622:111:000000000-G3CWM:1:1101:13761:14426 is marked as paired, but its mate does not occur next to it in your BAM file. Skipping. *****WARNING: Query M04622:111:000000000-G3CWM:1:1101:17424:18089 is marked as paired, but its mate does not occur next to it in your BAM file. Skipping.
[==============================================================================] WARNING: Query M04622:111:000000000-G3CWM:1:2103:7820:5734 is marked as paired, but its mate does not occur next to it in your BAM file. Skipping. WARNING: Query M04622:111:000000000-G3CWM:1:2103:9073:15589 is marked as paired, but its mate does not occur next to it in your BAM file. Skipping. [bwt_restore_sa] SA-BWT inconsistency: primary is not the same. Abort!
sambamba 0.6.8 by Artem Tarasov and Pjotr Prins (C) 2012-2018 LDC 1.10.0 / DMD v2.080.1 / LLVM6.0.1 / bootstrap LDC - the LLVM D compiler (0.17.4)
/local_work/bin/xHLA/bin/get-reads-alt-unmap.sh: line 43: 46328 Aborted (core dumped) /local_work/bin/bwa-0.7.17/bwa mem -t $NCORE "$BIN/../data/chr6/hg38.chr6.fna" "${OUT}.1.fq" "${OUT}.2.fq" 46329 Done | /local_work/bin/samtools-1.9/samtools view -b - 46330 Done | /local_work/bin/xHLA/bin/sambamba sort -m $MEM -t $NCORE -o "${OUT}.full.bam" /dev/stdin
3.what is "*****WARNING: Query M04622:111:000000000-G3CWM:1:1101:12467:2013 is marked as paired, but its mate does not occur next to it in your BAM file. Skipping." mean?
thanks for your time!