ISUgenomics / SequelTools

GNU General Public License v3.0

Crash running QC mode #16

Open J-Calvelo opened 3 years ago

J-Calvelo commented 3 years ago

Hello, I'm running into some issues trying to run SequelTools for quality control:

Beginning quality control function


Running in WITH_SCRAPS mode
Traceback (most recent call last):
  File "/home/amanda/programas/SequelTools/Scripts/generateReadLenStats_wScraps.py", line 93, in <module>
    szData = lineLst[2].strip().strip("sz:")
IndexError: list index out of range
ERROR: Calculation of read length statistics failed!
Running in NO_SCRAPS mode
Traceback (most recent call last):
  File "/home/amanda/programas/SequelTools/Scripts/generateReadLenStats_noScraps.py", line 94, in <module>
    start = int(coord.split("_")[0]); stop = int(coord.split("_")[1])
ValueError: invalid literal for int() with base 10: 'ccs'
ERROR: Calculation of read length statistics failed!

I'm not sure what the cause could be. Thanks

aseetharam commented 3 years ago

@J-Calvelo: Thanks for trying SequelTools! From the error message, it seems you are not using the actual scraps file (I think the tags the script parses are missing from the file). If you are certain that you used the scraps BAM file, would it be possible for you to share a few lines of the BAM file? The output of this command, perhaps?

 samtools view your.bam | head -n 1

Thanks,

J-Calvelo commented 3 years ago

Hello, I figured it out a few hours after posting. It was indeed a problem with the BAM file: I had used Picard to convert from FASTQ to BAM. Sorry for the inconvenience.

senzei-21 commented 2 years ago

Hello there, I have encountered the same error when running QC without a scraps file.

Running in NO_SCRAPS mode
Traceback (most recent call last):
  File "/home/threadripper/Downloads/SequelTools/Scripts/generateReadLenStats_noScraps.py", line 94, in <module>
    start = int(coord.split("_")[0]); stop = int(coord.split("_")[1])
ValueError: invalid literal for int() with base 10: 'ccs'
ERROR: Calculation of read length statistics failed!

This is the first line of my BAM file:

m64047_220526_230626/0/ccs  4   *   0   255 *   *   00  ACACTAGATCGCGTGTTGAATTGGTGTACTCAATTTACATTTAAACACAATCAATAGTGAGGACGGATAACAACGCAATGAATTCAAGAACACCACATATATTTACAAAAGCGCTGCAGCTCGCCCCAAGCATTTGTAAATATTTGTGTTTTTTTTTTTTGCTAACCTTGGCGCCATGACAAATGACCGTCAGCTTTATATCGATCGTAAGACCGAGGAAGCAGATTACATCTACCGGGCTGATCGATCTGGTCACTGTTCATATACACTACTGGATTGTCATATTTTAGCATTTTGGCGATGACCATCCCATTAGTTTCCAGAGCCGTGGACGGTCGAGTCTCTCGGTCGAGTCTCTCGGTCGAGTCTCTCGGTCGAGTCTCTCGGTCGAGTCTCTCGGTCGAGTATCCTAACAGGCTAACACAACACTTGCAATACGTCACATTTTTTTATAAATCGACAGCTCGTTCGGAAAGCAGTGTTCGCTCTAGTCTTCTGCGCGAGTAAGAAACTTTTTAAGTTAAATTATACACAATCATAATCAGACACGAACCATTATATTAAGAGACAGCCAGGTAGATGTTGAGCTGGTTATACCTATCCTCTTAATATTCGGAGATCGTATAAAACCCTTTTTTATAAGATATTTAAATTTACTTACGGAAAAATTCTGTATATTTATAGGGACCTTAATGAACGTGCGCGCACAGAAAGCCCCATAAAATAATTTTTGCCTTTATGTAAAGGACTGACGGGAAAACATAGTTTATCTTTATTTGTAATATTGTAATTGTGTCGATAGTGATTCTATCTAAATATTCAATATTTGTTTTGACATACATTAAATTTTTATAACGACCTGTGATGCCAAGGTCAAGACTCCATCTTTCCAAATTTAAGTCTAAGAGATACTCGAGGTCTTAGCTTGTAAATTTCCCGAACGGTCCGCTTCTGTATCTATGAACATCGGTGACATGCCACCAAATCAAGGTGCCGTATGTAATCAGGCTATGGAAATAGCCAAAGTAAACTAATCTAGTAGTTTGAATATCTTGTAACCTCTAGTTGCTTGTTTTATAATATTTGAGGAAACGAAGTGAACAATTTTTATTGTCACCCCGGTAGAACTCGAGACACGATATACAAATATCAGGGCGTTGTGTAGTACTTATGACCTCGTAGACATTAAAGTTTCTATATAAGTTTAATAAGCGCAATACCTCTTAGTATTTTAAATAATTATTAAGTACACAGTATTCCTGTTAACGTTTATACAAGAATTAAAAAGGGCACCACCCGGTTTCAAAATGTGTCGAGAATAAATTGTTTTACTTAGAGCGTGCAAGCAACATGCAAAACATGCAACATCCAGACCCGCCTCTGACCCTCGCATTCGTTTCACCACTGCGAATCTCTATCCAGATTATTGGATACTGATTCGAGAAGATCCACGAATTCGAACTCCTTGAGCAAGCGTTAAGAAGATTATTTAGCCAAAATCCGCCACCAAGAAGGTGGGTTTGCATGCTGCATGTTGCTTGCACGCTCCAAGTAATGGATTACCGCAAATTAAAATATACAAATTATAATTTATATTATTTGTTTTACTTATGTAAAATTAAAAACAATGTACAATGAAAACAAGTAAAAGATTTTTTCAGACGTTTGTTATCCTTCCTGTAGAATTTGGGCGTGTCGACTTCGATTTAAGCACCGACCTTCTTTCCAAATTACTGAAATTCGGACTGTTAGTTTCGGTCTGCCTTCGAAGATTACAAGAAATCTGCATGTCAGAACAAAAGCGTGCCTCCATTTTAATGTTATATATTATAATTTCAAAACAATTCCAAATTGACAGGTTTTAATTCTTGGTGGCGGATTTTGGCTAAATAATCTTAACGCTTGCTCTAGGAGTTCGAATTTGTGGATCTTCTCAAATCAACA
TCCAATAATTTGGATAGGGATTTGCAGTGGTGAAACGAATTCGAGGGTTAGAGGCGGGTTTGCATGCTGCATGTTTTGCATGTTGCTTGCTTGCTCTTCAAAACAATTCGCGAACAGATAAACAAAGCGCTTCGCGAAGTATGATTTTTATTTTAACATACGAAACCGAATTAATTTATGATAATGTCGTAAAGTGTATAGCGTCTAGGGGAGAGCGGGGCGCTGCGGGACGTCACAGATACACAAGACAAAGGGATCGTTTCGGTAAGGTCATAAAACTGTAGTGCGCGCTGGGCGGGCGGGCGACGGGGCGGGGGCGCGGAGGCGGGACGCGCTACGGCGCTTGTATTGCTTTATTGAGTGGAATATCTCCTTTCCGAGCTGACTCGCAGATTTATTATTCATTTATTTATAACTTACGCTGGCGTGTGCGCGGTGGCGCCCGGTGCACGTCGGTTCTTGGATTTTATATTTCGTAATTTTATTTTTCAGCGTTTTGTAGTATGTCTGAGCTGATGAAACCAGATGACAAATGACGTCTGCCTTGTGACGGCGCGTCGTACCTCTAAGAACCGAATGATCAGTTCAGTGTATTTTTGAATAGAGAAAATAGAAGGAAAAGTAGAAAAAATTATAAGCGAAAAATGATTATTTATTTTATAACACGTAATGAATATTATCATTACAGAACACGCGATCTAGTGT   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Q~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Z~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~m~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~   RG:Z:ebc74f4b   ac:B:i,59,1,60,0    ec:f:59.8636    ma:i:0  np:i:60 rq:f:1  sn:B:f,12.4602,18.3831,3.90846,7.87343  we:i:7800687    ws:i:77168  zm:i:0

I used the BAM file output in the Q20 folder produced by the sequencer. May I know whether this tool is suitable for PacBio HiFi datasets?

aseetharam commented 2 years ago

@senzei-21 It looks like you are running SequelTools on CCS/HiFi reads - unfortunately, this tool only works on subreads. Please try this on your subreads and let us know if you still have the issue.
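
For anyone hitting the same traceback, here is a minimal sketch of why the NO_SCRAPS script fails on this input and how to spot the read type up front. It assumes the usual PacBio naming convention, where subread names end in `start_stop` coordinates and CCS/HiFi names end in the literal `ccs`. The function `classify_read_name` and the example subread name are hypothetical illustrations, not part of SequelTools.

```python
def classify_read_name(qname: str) -> str:
    """Classify a PacBio read name by its last '/'-separated field.

    Assumed convention: subreads are 'movie/zmw/start_stop',
    CCS/HiFi reads are 'movie/zmw/ccs'.
    """
    parts = qname.split("/")
    if len(parts) != 3:
        return "unknown"
    coord = parts[2]
    if coord == "ccs":
        return "ccs"
    start, sep, stop = coord.partition("_")
    if sep and start.isdigit() and stop.isdigit():
        return "subread"
    return "unknown"


# Read name taken from the BAM line posted above.
hifi_name = "m64047_220526_230626/0/ccs"
# Hypothetical subread name, for comparison only.
subread_name = "m64047_220526_230626/0/0_1500"

print(classify_read_name(hifi_name))
print(classify_read_name(subread_name))

# The failing line splits the last field on "_" and converts it to int;
# for a CCS read that field is "ccs", so int() raises ValueError.
coord = hifi_name.split("/")[2]
try:
    int(coord.split("_")[0])
except ValueError as err:
    print("Same failure as in the traceback:", err)
```

Running a check like this on the first read name (from `samtools view your.bam | head -n 1`) shows whether the file holds subreads or CCS reads before starting QC.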

Thanks,

senzei-21 commented 2 years ago

I managed to run this tool on my subreads. But my raw data doesn't include any scraps files, so I can't proceed with the filtering option.