dieterich-lab / circtools

circtools: a modular, python-based framework for circRNA-related tools that unifies several functionalities in a single, command line driven software.
http://circ.tools
GNU General Public License v3.0

circtools detect / DCC throw ValueError: could not convert string to float: 'chrM' #95

Closed gnilihzeux closed 3 months ago

gnilihzeux commented 3 months ago

Dear author,

I ran into an error similar to the one reported in DCC 103. I have tried everything I could think of but could not solve it.

I tried both installing circtools from bioconda and pulling DCC from GitHub, but neither worked.

Thanks

Describe the bug

Traceback (most recent call last):
  File "/home/jinwen/.local/lib/python3.6/site-packages/DCC-0.5.0-py3.6.egg/DCC/circFilter.py", line 48, in readcirc
  File "/home/jinwen/.local/lib/python3.6/site-packages/DCC-0.5.0-py3.6.egg/DCC/circFilter.py", line 48, in <listcomp>
ValueError: invalid literal for int() with base 10: 'chrM'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/bin/DCC", line 33, in <module>
    sys.exit(load_entry_point('DCC==0.5.0', 'console_scripts', 'DCC')())
  File "/home/jinwen/.local/lib/python3.6/site-packages/DCC-0.5.0-py3.6.egg/DCC/main.py", line 369, in main
  File "/home/jinwen/.local/lib/python3.6/site-packages/DCC-0.5.0-py3.6.egg/DCC/circFilter.py", line 50, in readcirc
  File "/home/jinwen/.local/lib/python3.6/site-packages/DCC-0.5.0-py3.6.egg/DCC/circFilter.py", line 50, in <listcomp>
ValueError: could not convert string to float: 'chrM'
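
The chained traceback above is consistent with a parser that first tries `int()` on every field and falls back to `float()` when that fails. The sketch below is not DCC's actual code: `parse_counts` is a hypothetical name, and the int-then-float fallback is an assumption inferred only from the two error messages. It shows how a single non-numeric field such as 'chrM' produces the same pair of exceptions.

```python
def parse_counts(fields):
    """Hypothetical stand-in for the failing parse step; not DCC's real code."""
    try:
        # First attempt: treat every field as an integer count.
        return [int(x) for x in fields]
    except ValueError:
        # Fallback: treat every field as a float. A non-numeric field
        # raises a second ValueError while the first is being handled.
        return [float(x) for x in fields]


try:
    parse_counts(["3", "chrM"])
except ValueError as err:
    message = str(err)              # the error that reaches the user
    chained = str(err.__context__)  # the earlier int() failure
    print(message)
    print(chained)
```

If that assumption holds, the fix lies in the input line rather than in the parser: any count column that holds text instead of a number will fail both conversions.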

My command

DCC @$samplesheet \
        -mt1 @$mate1 \
        -mt2 @$mate2 \
        -T 16 \
        -D \
        -an $ANNO_GTF \
        -Pi \
        -F \
        -R $REPEATS_GTF \
        -Nr 2 1 \
        -t $tmp_dir \
        -G \
        -A $REF_GEN \
        -O $DCC_OUT


gnilihzeux commented 3 months ago

Is this data set unsuitable for circtools detect / DCC? My other data sets work fine, and CIRI2 also runs on this one without problems. The data that triggers the error was downloaded from ENA.

By the way, my STAR code:

if [[ -s $trim_fq1 && -s $trim_fq2 && ! -s ${align_bam}Chimeric.out.junction ]]
then
  STAR --runThreadN 16 \
       --genomeDir $STAR_INDEX \
       --outSAMtype BAM SortedByCoordinate \
       --readFilesIn $trim_fq1 $trim_fq2 \
       --readFilesCommand zcat \
       --outFileNamePrefix $align_bam \
       --quantMode TranscriptomeSAM GeneCounts \
       --chimOutType Junctions SeparateSAMold \
       --outReadsUnmapped Fastx \
       --outFilterMismatchNmax 999 \
       --outFilterMismatchNoverLmax 0.05 \
       --outSJfilterOverhangMin 15 15 15 15 \
       --alignSJoverhangMin 15 \
       --alignSJDBoverhangMin 15 \
       --outFilterMultimapNmax 20 \
       --outFilterScoreMin 1 \
       --outFilterMatchNmin 1 \
       --chimSegmentMin 15 \
       --chimScoreMin 15 \
       --chimScoreSeparation 10 \
       --chimJunctionOverhangMin 15 \
       --outSAMmultNmax 1 \
       &> $log
fi
# Read1
if [[ -s $trim_fq1 && -s $trim_fq2 && ! -s ${align_bam_r1}Chimeric.out.junction ]]
then
  STAR --runThreadN 16 \
             --genomeDir $STAR_INDEX \
             --outSAMtype None \
             --readFilesIn $trim_fq1 \
             --readFilesCommand zcat \
             --outFileNamePrefix $align_bam_r1 \
             --outReadsUnmapped None \
             --outFilterMismatchNmax 999 \
             --outFilterMismatchNoverLmax 0.05 \
             --outSJfilterOverhangMin 15 15 15 15 \
             --alignSJoverhangMin 15 \
             --alignSJDBoverhangMin 15 \
             --seedSearchStartLmax 30 \
             --outFilterMultimapNmax 20 \
             --outFilterScoreMin 1 \
             --outFilterMatchNmin 1 \
             --chimSegmentMin 15 \
             --chimScoreMin 15 \
             --chimScoreSeparation 10 \
             --chimJunctionOverhangMin 15 \
             --outSAMmultNmax 1 \
             &> $log_r1
fi
# Read2
if [[ -s $trim_fq1 && -s $trim_fq2 && ! -s ${align_bam_r2}Chimeric.out.junction ]]
then
  STAR --runThreadN 16 \
             --genomeDir $STAR_INDEX \
             --outSAMtype None \
             --readFilesIn $trim_fq2 \
             --readFilesCommand zcat \
             --outFileNamePrefix $align_bam_r2 \
             --outReadsUnmapped None \
             --outFilterMismatchNmax 999 \
             --outFilterMismatchNoverLmax 0.05 \
             --outSJfilterOverhangMin 15 15 15 15 \
             --alignSJoverhangMin 15 \
             --alignSJDBoverhangMin 15 \
             --seedSearchStartLmax 30 \
             --outFilterMultimapNmax 20 \
             --outFilterScoreMin 1 \
             --outFilterMatchNmin 1 \
             --chimSegmentMin 15 \
             --chimScoreMin 15 \
             --chimScoreSeparation 10 \
             --chimJunctionOverhangMin 15 \
             --outSAMmultNmax 1 \
             &> $log_r2
fi

gnilihzeux commented 3 months ago

After reviewing the log, I found that my _tmp_DCC/tmp_circCount contains one weird line:

chrM\t418\t15473\t-\tchrM\tchrM\tchrM\tchrM\tchrM\tchrM\tchrM\tchrM
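
A small script can locate lines like this one before DCC reaches its filtering step. The sketch below assumes the layout suggested by the line above: four leading columns (chromosome, start, end, strand) followed by one numeric count column per sample, tab-separated, with no header line. The names `find_bad_lines` and `n_meta` are hypothetical and not part of DCC.

```python
def is_number(text):
    """Return True if text parses as a float (covers integers too)."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def find_bad_lines(lines, n_meta=4):
    """Report lines whose count columns are not all numeric.

    n_meta is the assumed number of leading non-count columns
    (chromosome, start, end, strand).
    """
    bad = []
    for line_no, line in enumerate(lines, start=1):
        fields = line.rstrip("\n").split("\t")
        counts = fields[n_meta:]
        if not all(is_number(c) for c in counts):
            bad.append((line_no, fields[:n_meta]))
    return bad


# Two example rows: one well-formed, one shaped like the line shown above.
sample = [
    "chr1\t100\t200\t+\t3\t0\t5\n",
    "chrM\t418\t15473\t-\tchrM\tchrM\tchrM\n",
]
bad = find_bad_lines(sample)
print(bad)
```

To check a real run, pass an open file handle instead of `sample`, for example `find_bad_lines(open("_tmp_DCC/tmp_circCount"))`. This only locates the offending rows; it does not explain why the chromosome name ended up in the count columns.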