Illumina / Cyrius

A tool to genotype CYP2D6 with WGS data

Calling on HG00463 results in s/w crash. #2

Open iamh2o opened 4 years ago

iamh2o commented 4 years ago

I downloaded the R1 and R2 fasta files for this sample, which is referenced in the paper. I aligned the reads with Sentieon bwa mem and produced a valid BAM/BAI file. When I ran Cyrius on it, I got the following crash.

(supersonic) jmajor@kahlo:/locus/data/external_data/research_experiments/investigations/CYP2D6/HG00463$ python ~/wgs_resources/bin/Cyrius/ --reference ~/wgs_resources/data/reference/human/human_g1k_v37_modified.fasta/human_g1k_v37modified.fasta --genome 37 --prefix CYP --outDir ./ --threads 88 --manifest manifest.txt
INFO:root:Processing sample HG00463.aligned.deduped.sort at 2020-07-20 05:36:41.476394
Traceback (most recent call last):
  File "/locus/home/jmajor/wgs_resources/bin/Cyrius/", line 580, in <module>
    main()
  File "/locus/home/jmajor/wgs_resources/bin/Cyrius/", line 548, in main
    bam_name, call_parameters, threads, count_file, reference_fasta
  File "/locus/home/jmajor/wgs_resources/bin/Cyrius/", line 339, in d6_star_caller
    raw_cn_call.spacer_cn,
  File "/locus/data/external_data/research_experiments/wgs_resources/bin/Cyrius/caller/", line 56, in get_cnvtag
    if exon9_intron4_sites_counter[0][1] >= EXON9_TO_INTRON4_SITES_MIN
IndexError: list index out of range

xiao-chen-xc commented 4 years ago

Hi @iamh2o, the error occurs because you don't have any callable sites across a large region of the gene, which probably suggests that something is wrong. Did you align to the entire genome and use the entire BAM, i.e. without extracting just the CYP2D6 region? You are using the GRCh37 reference, right? Additionally, though this might not be the cause of the issue, does your BAM contain duplicate reads? We generally recommend keeping duplicate reads in the BAM, as this tends to give slightly more accurate depth assessment.
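
To illustrate the failure mode, here is a minimal sketch, not the actual Cyrius code. It assumes `exon9_intron4_sites_counter` is the ranked list returned by `collections.Counter.most_common()`, which is empty when no site in the region is callable, so indexing `[0][1]` raises the `IndexError` seen in the traceback. The helper names and the threshold value below are hypothetical and chosen only for the example.

```python
from collections import Counter

# Hypothetical value for illustration; the real constant is defined inside Cyrius.
EXON9_TO_INTRON4_SITES_MIN = 4


def most_common_site_count(site_calls):
    """Return the count of the most frequent call, or None if no site was callable."""
    ranked = Counter(site_calls).most_common()
    if not ranked:
        # An empty list here is what makes ranked[0][1] raise IndexError.
        return None
    return ranked[0][1]


def has_enough_support(site_calls, minimum=EXON9_TO_INTRON4_SITES_MIN):
    """Check the most frequent call against the threshold, failing loudly on empty input."""
    count = most_common_site_count(site_calls)
    if count is None:
        raise ValueError(
            "No callable sites in the region; check that the BAM covers the "
            "whole genome and matches the reference build."
        )
    return count >= minimum
```

A guard like this would not fix the underlying problem, but it would turn the bare `IndexError` into a message that points at the likely cause in the input BAM.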