Closed wisekh6 closed 7 years ago
Can you share the header for your bam file? Just the @SQ fields should be sufficient. samtools view -H bamfile | grep SQ > sqheader.txt
Thank you for your response.
Shown below is my header:
@HD VN:1.5 SO:coordinate @SQ SN:chr1 LN:249250621 @SQ SN:chr2 LN:243199373 @SQ SN:chr3 LN:198022430 @SQ SN:chr4 LN:191154276 @SQ SN:chr5 LN:180915260 @SQ SN:chr6 LN:171115067 @SQ SN:chr7 LN:159138663 @SQ SN:chr8 LN:146364022 @SQ SN:chr9 LN:141213431 @SQ SN:chr10 LN:135534747 @SQ SN:chr11 LN:135006516 @SQ SN:chr12 LN:133851895 @SQ SN:chr13 LN:115169878 @SQ SN:chr14 LN:107349540 @SQ SN:chr15 LN:102531392 @SQ SN:chr16 LN:90354753 @SQ SN:chr17 LN:81195210 @SQ SN:chr18 LN:78077248 @SQ SN:chr19 LN:59128983 @SQ SN:chr20 LN:63025520 @SQ SN:chr21 LN:48129895 @SQ SN:chr22 LN:51304566 @SQ SN:chrX LN:155270560 @SQ SN:chrY LN:59373566 @SQ SN:chrM LN:16571 @PG ID:bwa PN:bwa VN:0.7.15-r1140 CL:/home/bin/bwa mem -M -t 10 /home/AmpliconArchitect/bundle/data_repo/hg19_from_AA/hg19/hg19full.fa example.bwa.end1.fq example.bwa.end2.fq
The headers look ok to me.
How sparse is your coverage? AA randomly samples 1000 10kbp windows spread across the genome to estimate the coverage and discards windows with 0 coverage. So if the reads are very sparse or highly localized, then AA will not be able to estimate the coverage.
Do you want to try a test bam file? https://www.ncbi.nlm.nih.gov/sra/?term=SRR4009231
$ prefetch SRR4009231 $ sam-dump SRR4009231 | samtools view -bS - > test.bam $ samtools index test.bam
You can use the following line in the input bed file: chr7 54828001 56118000
Thanks for your comments. My test bam that failed to run AA was created with reads aligned on a specific chromosome. Of course, the input bed file only had regions on that chromosome. Do you think AA failed because this type of test bam has reads only on a specific chromosome?
Thank you,
That seems to be the case. You can use these 2 additional options to force it to calculate coverage only from that chromosome.
Here cbam can be the same as your input bam file. cbed should be a bed file with 1000 random intervals of size 10kbp on the chromosome you are working with.
Great. Got it.
Thank you,
Hello,
I tried to run AA with my example sequencing data, but I have the following error. I will appreciate it if you provide what this error message means and how I can work around.
Thank you in advance,
[root:INFO] #TIME 0.360 Loading libraries and reference annotations for: hg19 [root:INFO] #TIME 2.820 Initiating bam_to_breakpoint object for: /home/tmp/example.bwa.aln.bam 2.82 0 1001 401 0.0 Traceback (most recent call last): File "src/AmpliconArchitect.py", line 186, in <module> bamFileb2b = b2b.bam_to_breakpoint(bamFile, coverage_stats=cstats, coverage_windows=coverage_windows, downsample=args.downsample, sensitivems=(args.sensitivems=='True')) File "/home/tmp/src/bam_to_breakpoint.py", line 106, in __init__ self.median_coverage(window_list=coverage_windows) File "/home/tmp/src/bam_to_breakpoint.py", line 280, in median_coverage wc_median.append(wc_ws_filter[len(wc_ws_filter) / 2]) IndexError: list index out of range