dieterich-lab / DCC

DCC uses output from the STAR read mapper to systematically detect back-splice junctions in next-generation sequencing data. DCC applies a series of filters and integrates data across replicate sets to arrive at a precise list of circRNA candidates.
https://dieterichlab.org/software/
GNU General Public License v3.0

IndexError CombineCounts.py #65

Closed: gabee-chan closed this issue 2 years ago

gabee-chan commented 5 years ago

Hi, I am trying to run DCC and get this error:

    load_entry_point('DCC==0.4.7', 'console_scripts', 'DCC')()
    File "build/bdist.linux-x86_64/egg/DCC/main.py", line 269, in main
    File "build/bdist.linux-x86_64/egg/DCC/CombineCounts.py", line 42, in comb_coor
    File "build/bdist.linux-x86_64/egg/DCC/CombineCounts.py", line 96, in sortBed
    IndexError: list index out of range

I installed DCC by cloning the Git repository, so I expect it is the latest version.

The command I use to run DCC is

DCC [STAR_output]Chimeric.out.junction -D -R Repeat.gtf -an hg19_ens.gtf -F -M -Nr 5 6 -fg -G -A hg19.fa

My numpy version is 1.16.4.

Thanks in advance.

tjakobi commented 5 years ago

Hi @gabee-chan,

do you provide any BAM files via -B? You are trying to run host gene counting (-G), but I cannot see the BAM files required for that step. DCC might be missing a check for the BAM argument at that stage.

Cheers, Tobias
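
A minimal sketch of the kind of check mentioned above, written with Python's standard argparse module. This is not DCC's actual argument handling: the parser, the `dest` names `gene` and `bam`, and the helper `check_bam_requirement` are all hypothetical and only illustrate rejecting -G when no -B files are given.

```python
import argparse


def build_parser():
    # Hypothetical parser with only the two options discussed here;
    # it is NOT a copy of DCC's real command-line setup.
    parser = argparse.ArgumentParser(prog="dcc-arg-check")
    parser.add_argument("-G", dest="gene", action="store_true",
                        help="count host gene expression")
    parser.add_argument("-B", dest="bam", nargs="+", default=None,
                        help="BAM files needed for host gene counting")
    return parser


def check_bam_requirement(args):
    """Return an error message if -G was requested without -B, else None."""
    if args.gene and not args.bam:
        return "-G (host gene counting) requires BAM files via -B"
    return None


# Example: simulate a call that asks for -G but supplies no BAM files.
example = build_parser().parse_args(["-G"])
message = check_bam_requirement(example)
if message:
    print("ERROR: " + message)
```

Failing early with a clear message like this would be easier to diagnose than an IndexError raised later in the run.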

gabee-chan commented 5 years ago

Hi!

I added the -B parameter but I still get the same error.

    DCC 0.4.7 started
    4 CPU cores available, using 2
    started circRNA detection from file SCZ_2016-793_BA46_60F_CAUCChimeric.out.junction
    => locating circRNAs (stranded mode) [SCZ_2016-793_BA46_60F_CAUCChimeric.out.junction]
    => sorting circRNAs (stranded mode) [SCZ_2016-793_BA46_60F_CAUCChimeric.out.junction]
    finished circRNA detection from file SCZ_2016-793_BA46_60F_CAUCChimeric.out.junction
    Combining individual circRNA read counts
    Traceback (most recent call last):
      File "/share/ClusterShare/biodata/contrib/gabrod/anaconda2/bin/DCC", line 11, in <module>
        load_entry_point('DCC==0.4.7', 'console_scripts', 'DCC')()
      File "build/bdist.linux-x86_64/egg/DCC/main.py", line 269, in main
      File "build/bdist.linux-x86_64/egg/DCC/CombineCounts.py", line 42, in comb_coor
      File "build/bdist.linux-x86_64/egg/DCC/CombineCounts.py", line 96, in sortBed
    IndexError: list index out of range

The command I used is:

DCC [STAR_output]Chimeric.out.junction -D -R Repeat_hg19.gtf -an hg19_ens.gtf -F -M -Nr 5 6 -fg -G -A hg19.fa -B [file].bam

tjakobi commented 5 years ago

Dear @gabee-chan,

just to rule out that no circRNAs are found at some point, could you set the filter parameter to -Nr 2 2 and try again?

Cheers, Tobias
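
To illustrate why an empty result could end in "list index out of range": the sketch below is not the code of sortBed in CombineCounts.py, which is not shown in this thread. It is a hypothetical function, `sort_bed_lines`, that sorts tab-separated BED-like lines and skips blank or truncated lines, where indexing `fields[1]` or `fields[2]` without a length check would raise exactly this kind of IndexError.

```python
def sort_bed_lines(lines):
    """Sort tab-separated BED-like lines by chromosome, start, end.

    Blank or truncated lines are skipped instead of raising IndexError.
    This is an illustrative helper, not DCC's own implementation.
    """
    records = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # blank or truncated line: no usable interval
        try:
            start, end = int(fields[1]), int(fields[2])
        except ValueError:
            continue  # header line or non-numeric coordinates
        records.append((fields[0], start, end, fields))
    # Chromosome names sort as strings, coordinates sort numerically.
    records.sort(key=lambda r: (r[0], r[1], r[2]))
    return ["\t".join(r[3]) for r in records]


# An empty input simply yields an empty list rather than an exception.
print(sort_bed_lines([]))
```

If stricter filtering (for example -Nr 5 6) leaves no candidates, a guard like this would turn the crash into an empty output, which is what lowering the filter to -Nr 2 2 is meant to help distinguish.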

xiaowufeiying commented 4 years ago

I got the same IndexError:

    Output folder ./ already exists, reusing
    Temporary folder emp already exists, reusing
    DCC 0.4.8 started
    6 CPU cores available, using 2
    Please make sure that the read pairs have been mapped both, combined and on a per mate basis
    Collecting chimera information from mates-separate mapping
    WARNING: File ./circRNA/mate2, line 1 does not contain all features.
    WARNING: ./circRNA/mate2 is probably corrupt.
    WARNING: Offending line: /Users/tao/Desktop/circRNA/DCC-test/SRR2185851/mate2_mapping/Chimeric.out.junction
    Traceback (most recent call last):
      File "/Users/tao/.local/bin/DCC", line 11, in <module>
        load_entry_point('DCC==0.4.8', 'console_scripts', 'DCC')()
      File "build/bdist.macosx-10.6-x86_64/egg/DCC/main.py", line 250, in main
      File "build/bdist.macosx-10.6-x86_64/egg/DCC/main.py", line 531, in fixall
      File "build/bdist.macosx-10.6-x86_64/egg/DCC/fix2chimera.py", line 92, in fixchimerics
      File "build/bdist.macosx-10.6-x86_64/egg/DCC/fix2chimera.py", line 63, in fixmate2
    IndexError: list index out of range

The command I use:

    DCC ./circRNA/samplesheet -mt1 ./circRNA/mate1 -mt2 ./circRNA/mate2 -D -an ./data\ analysis/GenCode_PRI/gencode.vM25.primary_assembly.annotation.gtf -M -Nr 2 1 -fg -temp -ss -F -Pi -L 20 -A ./data\ analysis/GenCode_PRI/GRCm38.primary_assembly.genome.fa

tjakobi commented 4 years ago

Dear @gabee-chan,

could you please attach the file /Users/tao/Desktop/circRNA/DCC-test/SRR2185851/mate2_mapping/Chimeric.out.junction ? There seems to be something wrong with that file that breaks the parser.

The first 100 lines would probably also be okay if the file is too large.

Cheers, Tobias
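
A quick way to check a file before attaching it is to look for the first line that has too few tab-separated columns. The sketch below is a hypothetical helper, `first_short_line`, and not part of DCC. The default of 14 columns is an assumption based on the usual STAR Chimeric.out.junction layout and may need adjusting for other STAR versions or options. Note that the warning in the log above reports line 1 of ./circRNA/mate2 as a file path rather than a junction record, which this kind of check would also flag.

```python
def first_short_line(path, min_fields=14):
    """Return (line_number, text) for the first line with fewer than
    min_fields tab-separated fields, or None if no such line exists.

    min_fields=14 is an assumed column count for STAR's
    Chimeric.out.junction format, not a value taken from DCC.
    """
    with open(path) as handle:
        for number, line in enumerate(handle, start=1):
            text = line.rstrip("\n")
            if text.startswith("#"):
                continue  # skip comment lines
            if len(text.split("\t")) < min_fields:
                return number, text
    return None
```

Calling `first_short_line("Chimeric.out.junction")` returns None for a file where every record has at least 14 fields; otherwise it returns the line number and content of the first offending line.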

xiaowufeiying commented 4 years ago

Dear Tobias,

Attached, please find the first 100 lines of the Chimeric.out.junction file.

Best

Attachment: Chimeric.out.junction.100.txt

tjakobi commented 4 years ago

Dear @xiaowufeiying,

I see you used DCC version 0.4.7.

Could you try installing DCC directly from the master branch? A recent fix there might resolve this issue.

Cheers, Tobias