gymreklab / GangSTR

A tool for profiling long STRs from short reads
GNU General Public License v2.0

[GangSTR-2.5.0] ERROR: Error extracting read length #107

Closed MeiShu00 closed 3 years ago

MeiShu00 commented 3 years ago

Hi, I encountered the following error when running GangSTR on nanopore reads: [GangSTR-2.5.0] ERROR: Error extracting read length

My command line is as follows: GangSTR --bam gu_ccpp_mm2_RG_SM.sorted.bam --ref ../../GCA_000001405.15_GRCh38_no_alt_analysis_set.fna --regions gangSTR_str.bed --out gu_ccpp_mm2_RG_SM_

Looking forward to your reply!

clarakim11 commented 3 years ago

Hello, I have encountered the same error as @MeiShu00 while running GangSTR on cancer WGS samples. I look forward to hearing back from you about this issue!

nmmsv commented 3 years ago

Apologies for the late response. GangSTR only works with paired-end short reads and does not work with nanopore data. Another assumption that GangSTR makes is that all reads in the dataset have the same length. @clarakim11 can you confirm that this assumption holds for your dataset?

clarakim11 commented 3 years ago

Thank you so much for getting back to me! Yes, I am using PCAWG paired-end short reads bam files, which I believe have a read length of 100bp. Please correct me if I'm wrong!

nmmsv commented 3 years ago

I have not worked with this dataset so I'm not sure. You can confirm the read length distribution by checking the BAM file. Can you check to see if all the reads are the same length? Please also make sure that the reference you're using matches what your BAM file is aligned to (version and chromosome naming, i.e., chrX vs X). Hope this helps.
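One way to do the read-length check suggested above is to tally the length of the SEQ column in the SAM text that `samtools view` prints. The sketch below is only an illustration and is not part of GangSTR; the function name `read_length_counts` is made up for this example, and it assumes plain tab-separated SAM records as input.

```python
from collections import Counter


def read_length_counts(sam_lines):
    """Tally read lengths from SAM text lines (hypothetical helper, not a GangSTR API)."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line, no read on it
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # a SAM record has at least 11 mandatory columns
        seq = fields[9]  # column 10 is SEQ
        if seq == "*":
            continue  # sequence not stored for this record
        counts[len(seq)] += 1
    return counts
```

To try it on a sample of reads, the output of something like `samtools view sample.bam | head -n 100000` (the file name is a placeholder) could be passed in as `read_length_counts(sys.stdin)`. More than one key in the result means the reads are not all the same length. Note that hard-clipped alignments store a shortened SEQ, so a few shorter entries do not necessarily mean the original reads differed in length.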

clarakim11 commented 3 years ago

Thank you so much for your help. The chromosome annotation was the issue. I will make sure to check these more carefully next time. Thank you again!
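For anyone hitting the same chromosome naming mismatch (chrX vs X), a quick comparison of the contig names in the regions BED file against the @SQ lines of the BAM header can catch it before running GangSTR. This is a rough sketch rather than anything GangSTR provides; the helper names are invented for illustration, and it assumes the header text comes from `samtools view -H`.

```python
def sam_header_contigs(header_lines):
    """Collect contig names from the SN: tags of @SQ header lines."""
    names = set()
    for line in header_lines:
        if not line.startswith("@SQ"):
            continue
        for field in line.rstrip("\n").split("\t")[1:]:
            if field.startswith("SN:"):
                names.add(field[3:])
    return names


def bed_contigs(bed_lines):
    """Collect contig names from the first column of a BED file."""
    names = set()
    for line in bed_lines:
        line = line.strip()
        # Skip blank lines, comments, and optional track/browser lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        names.add(line.split()[0])
    return names


def missing_contigs(bed_lines, header_lines):
    """Return BED contig names that do not appear in the alignment header."""
    return sorted(bed_contigs(bed_lines) - sam_header_contigs(header_lines))
```

If `missing_contigs` returns names such as `chr1` while the header lists `1`, the regions file and the alignment use different naming schemes. The same idea applies to the reference FASTA, whose contig names are the first word after `>` on each header line.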

nmmsv commented 3 years ago

no problem, please let me know if I could help with anything else!

seunghun23 commented 2 years ago

Hi, I'm experiencing the same "Error extracting read length" issue with my BAM files. The data is Illumina paired-end short-read whole-genome sequencing aligned to hg19. What's strange is that I ran an older version of GangSTR on the same BAM file about 3 years ago and it worked fine at that time. I'm using version 2.5.0, and it is working fine with CRAM files.

AyushSafar commented 1 year ago

Hi, I encountered the following error when running GangSTR on nanopore reads: [GangSTR-2.5.0] ERROR: Error extracting read length. How did you fix that problem?