Open braveagle0 opened 3 years ago
Do you have the log information file? That will help me find out why.
Yan
[M::mm_idx_gen::628.7940.62] collected minimizers
[M::mm_idx_gen::703.7370.70] sorted minimizers
[M::main::703.7400.70] loaded/built the index for 194 target sequence(s)
[M::mm_mapopt_update::714.0950.70] mid_occ = 765
[M::mm_idx_stat] kmer size: 15; skip: 5; is_hpc: 0; #seq: 194
[M::mm_idx_stat::716.1330.70] distinct minimizers: 167225302 (35.46% are singletons); average occurrences: 6.030; average spacing: 3.074
[M::worker_pipeline::2221.0214.31] mapped 188232 sequences
[M::main] Version: 2.17-r941
[M::main] CMD: minimap2 -ax splice -ub --MD --eqx -t 8 /mnt/home2/guans/ref_genome/hg38/CircRNA_reference/GRCh38/Homo_sapiens.GRCh38.dna.primary_assembly.fa output2/cons.fa
[M::main] Real time: 2221.478 sec; CPU: 9580.517 sec; Peak RSS: 20.286 GB
[E::idx_find_and_load] Could not retrieve index file for 'output2/high.bam'
[E::idx_find_and_load] Could not retrieve index file for 'output2/high.bam'
[E::idx_find_and_load] Could not retrieve index file for 'output2/high.bam'
[E::idx_find_and_load] Could not retrieve index file for 'output2/high.bam'
== 16:43:39-Jul-01-2021 == [check_dependencies] Checking dependencies ...
== 16:43:41-Jul-01-2021 == [check_dependencies] Checking dependencies done!
== 16:43:41-Jul-01-2021 == [Tandem-Repeats-Finder] Finding tandem repeats with TRF ...
== 16:43:41-Jul-01-2021 == [fxtools] fxtools sx TotalRNAonly.fa 8 /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output2/
== 16:44:30-Jul-01-2021 == [Tandem Repeats Finder] trf409.legacylinux64 /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output2/TotalRNAonly.fa.2 2 7 7 80 10 100 2000 -h -ngs > output2/trf.out.2; rm /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output2/TotalRNAonly.fa.2
== 16:44:30-Jul-01-2021 == [Tandem Repeats Finder] trf409.legacylinux64 /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output2/TotalRNAonly.fa.1 2 7 7 80 10 100 2000 -h -ngs > output2/trf.out.1; rm /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output2/TotalRNAonly.fa.1
== 16:44:30-Jul-01-2021 == [Tandem Repeats Finder] trf409.legacylinux64 /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output2/TotalRNAonly.fa.3 2 7 7 80 10 100 2000 -h -ngs > output2/trf.out.3; rm /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output2/TotalRNAonly.fa.3
== 16:44:30-Jul-01-2021 == [Tandem Repeats Finder] trf409.legacylinux64 /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output2/TotalRNAonly.fa.4 2 7 7 80 10 100 2000 -h -ngs > output2/trf.out.4; rm /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output2/TotalRNAonly.fa.4
== 16:44:30-Jul-01-2021 == [Tandem Repeats Finder] trf409.legacylinux64 /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output2/TotalRNAonly.fa.5 2 7 7 80 10 100 2000 -h -ngs > output2/trf.out.5; rm /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output2/TotalRNAonly.fa.5
== 16:44:30-Jul-01-2021 == [Tandem Repeats Finder] trf409.legacylinux64 /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output2/TotalRNAonly.fa.6 2 7 7 80 10 100 2000 -h -ngs > output2/trf.out.6; rm /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output2/TotalRNAonly.fa.6
== 16:44:30-Jul-01-2021 == [Tandem Repeats Finder] trf409.legacylinux64 /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output2/TotalRNAonly.fa.7 2 7 7 80 10 100 2000 -h -ngs > output2/trf.out.7; rm /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output2/TotalRNAonly.fa.7
== 16:44:30-Jul-01-2021 == [Tandem Repeats Finder] trf409.legacylinux64 /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output2/TotalRNAonly.fa.8 2 7 7 80 10 100 2000 -h -ngs > output2/trf.out.8; rm /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output2/TotalRNAonly.fa.8
== 17:10:51-Jul-01-2021 == [Tandem Repeats Finder] cat output2/trf.out.1 >> output2/trf.out; rm output2/trf.out.1
== 17:10:59-Jul-01-2021 == [Tandem Repeats Finder] cat output2/trf.out.2 >> output2/trf.out; rm output2/trf.out.2
== 17:11:06-Jul-01-2021 == [Tandem Repeats Finder] cat output2/trf.out.3 >> output2/trf.out; rm output2/trf.out.3
== 17:11:10-Jul-01-2021 == [Tandem Repeats Finder] cat output2/trf.out.4 >> output2/trf.out; rm output2/trf.out.4
== 17:11:15-Jul-01-2021 == [Tandem Repeats Finder] cat output2/trf.out.5 >> output2/trf.out; rm output2/trf.out.5
== 17:11:18-Jul-01-2021 == [Tandem Repeats Finder] cat output2/trf.out.6 >> output2/trf.out; rm output2/trf.out.6
== 17:11:21-Jul-01-2021 == [Tandem Repeats Finder] cat output2/trf.out.7 >> output2/trf.out; rm output2/trf.out.7
== 17:11:28-Jul-01-2021 == [Tandem Repeats Finder] cat output2/trf.out.8 >> output2/trf.out; rm output2/trf.out.8
== 17:11:30-Jul-01-2021 == [fxtools] fxtools lp TotalRNAonly.fa > /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output2/TotalRNAonly.fa.len 2> /dev/null
== 17:15:38-Jul-01-2021 == [Tandem-Repeats-Finder] Finding tandem repeats with TRF done!
== 17:15:38-Jul-01-2021 == [Mapping] Mapping consensus sequence to genome ...
== 17:15:38-Jul-01-2021 == [Mapping] minimap2 -ax splice -ub --MD --eqx /mnt/home2/guans/ref_genome/hg38/CircRNA_reference/GRCh38/Homo_sapiens.GRCh38.dna.primary_assembly.fa output2/cons.fa -t 8 > output2/cons.fa.sam
== 17:52:42-Jul-01-2021 == [Mapping] Mapping consensus sequence to genome done!
== 17:52:42-Jul-01-2021 == [Classifying] Classifying consensus alignment ...
== 17:52:42-Jul-01-2021 == [classify_bam_core] Processing output2/cons.fa.sam ...
== 17:52:53-Jul-01-2021 == [classify_bam_core] 100000 BAM records done ...
== 17:53:05-Jul-01-2021 == [classify_bam_core] 200000 BAM records done ...
== 17:53:16-Jul-01-2021 == [classify_bam_core] 300000 BAM records done ...
== 17:53:30-Jul-01-2021 == [classify_bam_core] 400000 BAM records done ...
== 17:53:46-Jul-01-2021 == [classify_bam_core] 500000 BAM records done ...
== 17:53:53-Jul-01-2021 == [classify_bam_core] Processing output2/cons.fa.sam done.
== 17:53:53-Jul-01-2021 == [Classifying] Classifying consensus alignment done!
== 17:54:06-Jul-01-2021 == [gtfToGenePred] gtfToGenePred -genePredExt -ignoreGroupsWithoutExons /mnt/home2/guans/ref_genome/hg38/CircRNA_reference/GRCh38/Homo_sapiens.GRCh38.104.gtf /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output2/Homo_sapiens.GRCh38.104.gtf.gene_pred
== 17:55:06-Jul-01-2021 == [genePredToBed] genePredToBed /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output2/Homo_sapiens.GRCh38.104.gtf.gene_pred /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output2/Homo_sapiens.GRCh38.104.gtf.bed
== 17:55:08-Jul-01-2021 == [get_transcript_from_bed12] Loading transcript from /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output2/Homo_sapiens.GRCh38.104.gtf.gene_pred ...
== 17:55:24-Jul-01-2021 == [get_transcript_from_gene_pred] Loading transcript from /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output2/Homo_sapiens.GRCh38.104.gtf.gene_pred done!
== 17:55:24-Jul-01-2021 == [get_splice_site_from_bed12] Loading splice site from /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output2/Homo_sapiens.GRCh38.104.gtf.bed ...
== 17:55:37-Jul-01-2021 == [get_splice_site_from_bed12] Loading splice site from /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output2/Homo_sapiens.GRCh38.104.gtf.bed done!
== 17:55:37-Jul-01-2021 == [get_splice_junction_from_bed12] Loading splice junction from /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output2/Homo_sapiens.GRCh38.104.gtf.bed ...
== 17:55:46-Jul-01-2021 == [get_splice_junction_from_bed12] Loading splice junction from /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output2/Homo_sapiens.GRCh38.104.gtf.bed done!
== 17:55:46-Jul-01-2021 == [get_exon_from_bed12] Loading exon from /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output2/Homo_sapiens.GRCh38.104.gtf.bed ...
== 17:55:55-Jul-01-2021 == [get_exon_from_bed12] Loading exon from /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output2/Homo_sapiens.GRCh38.104.gtf.bed done!
== 17:55:55-Jul-01-2021 == [get_back_splice_junction_from_bed] Loading splice junction from /mnt/home2/guans/ref_genome/hg38/CircRNA_reference/human_circRNA_v2.0.bed ...
Traceback (most recent call last):
File "/home/guans/bin/anaconda3/bin/isocirc", line 219, in
Please see the log above and help! Thanks a lot!
Based on the error message "ValueError: invalid literal for int() with base 10: 'Start'", your BED file human_circRNA_v2.0.bed may have a header line; it should be removed.
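A minimal sketch of one way to drop such a header before passing the file to isoCirc. It assumes a standard tab-separated BED layout with integer start and end coordinates in columns 2 and 3; the function name and the sample column names other than 'Start' are hypothetical, not part of isoCirc or circAtlas.

```python
def strip_bed_header(lines):
    """Keep only lines whose 2nd and 3rd tab-separated fields are integers.

    Assumption: standard BED layout (chrom, start, end, ...). A header such as
    one containing the text 'Start' in the start column fails int() and is dropped.
    """
    kept = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # blank or malformed line
        try:
            int(fields[1])
            int(fields[2])
        except ValueError:
            continue  # non-numeric coordinates: header or comment line
        kept.append(line)
    return kept


# Illustrative input; only 'Start' is taken from the error message above.
sample = ["Chro\tStart\tEnd\tStrand\n", "chr1\t100\t200\t+\n"]
print(strip_bed_header(sample))
```

To apply it to a real file, read the lines with open(path), pass them through strip_bed_header, and write the result to a new file rather than overwriting the original.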
I removed the header line and ran isoCirc again. Here is the log:
"[M::mm_idx_gen::130.0861.12] collected minimizers
[M::mm_idx_gen::140.7591.61] sorted minimizers
[M::main::140.7631.61] loaded/built the index for 194 target sequence(s)
[M::mm_mapopt_update::143.2841.60] mid_occ = 765
[M::mm_idx_stat] kmer size: 15; skip: 5; is_hpc: 0; #seq: 194
[M::mm_idx_stat::145.2681.59] distinct minimizers: 167225302 (35.46% are singletons); average occurrences: 6.030; average spacing: 3.074
[M::worker_pipeline::1236.4867.06] mapped 188232 sequences
[M::main] Version: 2.17-r941
[M::main] CMD: minimap2 -ax splice -ub --MD --eqx -t 8 /mnt/home2/guans/ref_genome/hg38/CircRNA_reference/GRCh38/Homo_sapiens.GRCh38.dna.primary_assembly.fa output_no_head/cons.fa
[M::main] Real time: 1236.718 sec; CPU: 8725.111 sec; Peak RSS: 22.318 GB
[E::idx_find_and_load] Could not retrieve index file for 'output_no_head/high.bam'
[E::idx_find_and_load] Could not retrieve index file for 'output_no_head/high.bam'
[E::idx_find_and_load] Could not retrieve index file for 'output_no_head/high.bam'
[E::idx_find_and_load] Could not retrieve index file for 'output_no_head/high.bam'
[E::idx_find_and_load] Could not retrieve index file for 'output_no_head/high.bam'
[E::idx_find_and_load] Could not retrieve index file for 'output_no_head/low.bam'
== 10:51:32-Jul-12-2021 == [check_dependencies] Checking dependencies ...
== 10:51:33-Jul-12-2021 == [check_dependencies] Checking dependencies done!
== 10:51:33-Jul-12-2021 == [Tandem-Repeats-Finder] Finding tandem repeats with TRF ...
== 10:51:33-Jul-12-2021 == [fxtools] fxtools sx TotalRNAonly.fa 8 /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output_no_head/
== 10:52:03-Jul-12-2021 == [Tandem Repeats Finder] trf409.legacylinux64 /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output_no_head/TotalRNAonly.fa.1 2 7 7 80 10 100 2000 -h -ngs > output_no_head/trf.out.1; rm /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output_no_head/TotalRNAonly.fa.1
== 10:52:03-Jul-12-2021 == [Tandem Repeats Finder] trf409.legacylinux64 /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output_no_head/TotalRNAonly.fa.2 2 7 7 80 10 100 2000 -h -ngs > output_no_head/trf.out.2; rm /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output_no_head/TotalRNAonly.fa.2
== 10:52:03-Jul-12-2021 == [Tandem Repeats Finder] trf409.legacylinux64 /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output_no_head/TotalRNAonly.fa.3 2 7 7 80 10 100 2000 -h -ngs > output_no_head/trf.out.3; rm /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output_no_head/TotalRNAonly.fa.3
== 10:52:03-Jul-12-2021 == [Tandem Repeats Finder] trf409.legacylinux64 /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output_no_head/TotalRNAonly.fa.4 2 7 7 80 10 100 2000 -h -ngs > output_no_head/trf.out.4; rm /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output_no_head/TotalRNAonly.fa.4
== 10:52:03-Jul-12-2021 == [Tandem Repeats Finder] trf409.legacylinux64 /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output_no_head/TotalRNAonly.fa.5 2 7 7 80 10 100 2000 -h -ngs > output_no_head/trf.out.5; rm /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output_no_head/TotalRNAonly.fa.5
== 10:52:03-Jul-12-2021 == [Tandem Repeats Finder] trf409.legacylinux64 /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output_no_head/TotalRNAonly.fa.6 2 7 7 80 10 100 2000 -h -ngs > output_no_head/trf.out.6; rm /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output_no_head/TotalRNAonly.fa.6
== 10:52:03-Jul-12-2021 == [Tandem Repeats Finder] trf409.legacylinux64 /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output_no_head/TotalRNAonly.fa.7 2 7 7 80 10 100 2000 -h -ngs > output_no_head/trf.out.7; rm /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output_no_head/TotalRNAonly.fa.7
== 10:52:03-Jul-12-2021 == [Tandem Repeats Finder] trf409.legacylinux64 /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output_no_head/TotalRNAonly.fa.8 2 7 7 80 10 100 2000 -h -ngs > output_no_head/trf.out.8; rm /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output_no_head/TotalRNAonly.fa.8
== 11:17:41-Jul-12-2021 == [Tandem Repeats Finder] cat output_no_head/trf.out.1 >> output_no_head/trf.out; rm output_no_head/trf.out.1
== 11:17:46-Jul-12-2021 == [Tandem Repeats Finder] cat output_no_head/trf.out.2 >> output_no_head/trf.out; rm output_no_head/trf.out.2
== 11:17:50-Jul-12-2021 == [Tandem Repeats Finder] cat output_no_head/trf.out.3 >> output_no_head/trf.out; rm output_no_head/trf.out.3
== 11:17:51-Jul-12-2021 == [Tandem Repeats Finder] cat output_no_head/trf.out.4 >> output_no_head/trf.out; rm output_no_head/trf.out.4
== 11:17:53-Jul-12-2021 == [Tandem Repeats Finder] cat output_no_head/trf.out.5 >> output_no_head/trf.out; rm output_no_head/trf.out.5
== 11:17:56-Jul-12-2021 == [Tandem Repeats Finder] cat output_no_head/trf.out.6 >> output_no_head/trf.out; rm output_no_head/trf.out.6
== 11:18:00-Jul-12-2021 == [Tandem Repeats Finder] cat output_no_head/trf.out.7 >> output_no_head/trf.out; rm output_no_head/trf.out.7
== 11:18:02-Jul-12-2021 == [Tandem Repeats Finder] cat output_no_head/trf.out.8 >> output_no_head/trf.out; rm output_no_head/trf.out.8
== 11:18:04-Jul-12-2021 == [fxtools] fxtools lp TotalRNAonly.fa > /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output_no_head/TotalRNAonly.fa.len 2> /dev/null
== 11:22:30-Jul-12-2021 == [Tandem-Repeats-Finder] Finding tandem repeats with TRF done!
== 11:22:30-Jul-12-2021 == [Mapping] Mapping consensus sequence to genome ...
== 11:22:30-Jul-12-2021 == [Mapping] minimap2 -ax splice -ub --MD --eqx /mnt/home2/guans/ref_genome/hg38/CircRNA_reference/GRCh38/Homo_sapiens.GRCh38.dna.primary_assembly.fa output_no_head/cons.fa -t 8 > output_no_head/cons.fa.sam
== 11:43:09-Jul-12-2021 == [Mapping] Mapping consensus sequence to genome done!
== 11:43:09-Jul-12-2021 == [Classifying] Classifying consensus alignment ...
== 11:43:09-Jul-12-2021 == [classify_bam_core] Processing output_no_head/cons.fa.sam ...
== 11:43:30-Jul-12-2021 == [classify_bam_core] 100000 BAM records done ...
== 11:43:51-Jul-12-2021 == [classify_bam_core] 200000 BAM records done ...
== 11:44:12-Jul-12-2021 == [classify_bam_core] 300000 BAM records done ...
== 11:44:27-Jul-12-2021 == [classify_bam_core] 400000 BAM records done ...
== 11:44:44-Jul-12-2021 == [classify_bam_core] 500000 BAM records done ...
== 11:44:51-Jul-12-2021 == [classify_bam_core] Processing output_no_head/cons.fa.sam done.
== 11:44:51-Jul-12-2021 == [Classifying] Classifying consensus alignment done!
== 11:45:03-Jul-12-2021 == [gtfToGenePred] gtfToGenePred -genePredExt -ignoreGroupsWithoutExons /mnt/home2/guans/ref_genome/hg38/CircRNA_reference/GRCh38/Homo_sapiens.GRCh38.104.gtf /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output_no_head/Homo_sapiens.GRCh38.104.gtf.gene_pred
== 11:46:12-Jul-12-2021 == [genePredToBed] genePredToBed /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output_no_head/Homo_sapiens.GRCh38.104.gtf.gene_pred /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output_no_head/Homo_sapiens.GRCh38.104.gtf.bed
== 11:46:15-Jul-12-2021 == [get_transcript_from_bed12] Loading transcript from /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output_no_head/Homo_sapiens.GRCh38.104.gtf.gene_pred ...
== 11:46:28-Jul-12-2021 == [get_transcript_from_gene_pred] Loading transcript from /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output_no_head/Homo_sapiens.GRCh38.104.gtf.gene_pred done!
== 11:46:28-Jul-12-2021 == [get_splice_site_from_bed12] Loading splice site from /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output_no_head/Homo_sapiens.GRCh38.104.gtf.bed ...
== 11:46:42-Jul-12-2021 == [get_splice_site_from_bed12] Loading splice site from /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output_no_head/Homo_sapiens.GRCh38.104.gtf.bed done!
== 11:46:42-Jul-12-2021 == [get_splice_junction_from_bed12] Loading splice junction from /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output_no_head/Homo_sapiens.GRCh38.104.gtf.bed ...
== 11:46:53-Jul-12-2021 == [get_splice_junction_from_bed12] Loading splice junction from /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output_no_head/Homo_sapiens.GRCh38.104.gtf.bed done!
== 11:46:53-Jul-12-2021 == [get_exon_from_bed12] Loading exon from /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output_no_head/Homo_sapiens.GRCh38.104.gtf.bed ...
== 11:47:03-Jul-12-2021 == [get_exon_from_bed12] Loading exon from /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output_no_head/Homo_sapiens.GRCh38.104.gtf.bed done!
== 11:47:03-Jul-12-2021 == [get_back_splice_junction_from_bed] Loading splice junction from /mnt/home2/guans/ref_genome/hg38/CircRNA_reference/human_circRNA_v2.0.bed ...
== 11:47:11-Jul-12-2021 == [get_back_splice_junction_from_bed] Loading splice junction from /mnt/home2/guans/ref_genome/hg38/CircRNA_reference/human_circRNA_v2.0.bed done!
== 11:47:13-Jul-12-2021 == [read_wise_eval] Generating read-wise evaluation result ...
== 11:58:03-Jul-12-2021 == [high_quality] 100000 high mapping quality BAM records have been processed ...
== 12:03:09-Jul-12-2021 == [read_wise_eval] Generating read-wise evaluation result done!
== 12:03:09-Jul-12-2021 == [filter_circRNA_read] Filtering back-splice-junctions ...
== 12:03:13-Jul-12-2021 == [filter_circRNA_read] Filtering back-splice-junctions done!
== 12:03:13-Jul-12-2021 == [rescue_reads] Rescuing reads using reliable back-splice-junctions ...
== 12:03:19-Jul-12-2021 == [rescue_reads] Rescuing reads using reliable back-splice-junctions done!
== 12:03:19-Jul-12-2021 == [uniq_isoform_with_unsorted_coors] Generating isoform-wise evaluation result ...
== 12:03:19-Jul-12-2021 == [uniq_isoform_with_unsorted_coors] Generating isoform-wise evaluation result done!
== 12:03:19-Jul-12-2021 == [bed2exonGtf] bed2exonGtf output_no_head/isocirc.bed output_no_head/isocirc.bed.exon.gtf
== 12:03:20-Jul-12-2021 == [exonGtf] awk -v OFS="\t" '($3=="exon"){print}' /mnt/home2/guans/ref_genome/hg38/CircRNA_reference/GRCh38/Homo_sapiens.GRCh38.104.gtf > /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output_no_head/Homo_sapiens.GRCh38.104.gtf.exon.gtf
== 12:03:46-Jul-12-2021 == [gtf2bed] awk -v OFS="\t" '($3=="gene"){print $1,$4-1,$5}' /mnt/home2/guans/ref_genome/hg38/CircRNA_reference/GRCh38/Homo_sapiens.GRCh38.104.gtf > /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output_no_head/Homo_sapiens.GRCh38.104.gtf.gene.bed
== 12:04:04-Jul-12-2021 == [gtf2bed] awk -v OFS="\t" '($3=="CDS"){print}' /mnt/home2/guans/ref_genome/hg38/CircRNA_reference/GRCh38/Homo_sapiens.GRCh38.104.gtf > /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output_no_head/Homo_sapiens.GRCh38.104.gtf.cds.gtf
== 12:04:23-Jul-12-2021 == [gtf2bed] awk -v OFS="\t" '($3=="UTR" || $3=="five_prime_utr" || $3=="three_prime_utr"){print}' /mnt/home2/guans/ref_genome/hg38/CircRNA_reference/GRCh38/Homo_sapiens.GRCh38.104.gtf > /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output_no_head/Homo_sapiens.GRCh38.104.gtf.utr.gtf
== 12:04:40-Jul-12-2021 == [gtf2bed] awk -v OFS="\t" '($3=="exon" && ($0 ~ /gene_biotype "lincRNA"/ || $0 ~ /gene_type "lincRNA"/)){print}' /mnt/home2/guans/ref_genome/hg38/CircRNA_reference/GRCh38/Homo_sapiens.GRCh38.104.gtf > /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output_no_head/Homo_sapiens.GRCh38.104.gtf.lincRNA.gtf
== 12:05:10-Jul-12-2021 == [gtf2bed] awk -v OFS="\t" '($3=="exon" && ($0 ~ /gene_biotype "antisense"/ || $0 ~ /gene_type "antisense"/)){print}' /mnt/home2/guans/ref_genome/hg38/CircRNA_reference/GRCh38/Homo_sapiens.GRCh38.104.gtf > /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output_no_head/Homo_sapiens.GRCh38.104.gtf.antisense.gtf
== 12:05:34-Jul-12-2021 == [gtf2bed] awk -v OFS="\t" '($3=="exon" && ($0 ~ /gene_biotype "rRNA"/ || $0 ~ /gene_type "rRNA"/)){print}' /mnt/home2/guans/ref_genome/hg38/CircRNA_reference/GRCh38/Homo_sapiens.GRCh38.104.gtf > /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output_no_head/Homo_sapiens.GRCh38.104.gtf.rRNA.gtf
== 12:07:43-Jul-12-2021 == [bed2exonGtf] bed2exonGtf output_no_head/isocirc.bed.five.site.bed output_no_head/isocirc.bed.five.site.exon.gtf
== 12:07:45-Jul-12-2021 == [bed2exonGtf] bed2exonGtf output_no_head/isocirc.bed.three.site.bed output_no_head/isocirc.bed.three.site.exon.gtf
== 12:07:46-Jul-12-2021 == [bed2exonGtf] bed2exonGtf /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output_no_head/Homo_sapiens.GRCh38.104.gtf.five.site.bed /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output_no_head/Homo_sapiens.GRCh38.104.gtf.five.site.exon.gtf
== 12:07:52-Jul-12-2021 == [bed2exonGtf] bed2exonGtf /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output_no_head/Homo_sapiens.GRCh38.104.gtf.three.site.bed /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output_no_head/Homo_sapiens.GRCh38.104.gtf.three.site.exon.gtf
== 12:07:56-Jul-12-2021 == [itst_gtf_gtf] itst_gtf_gtf output_no_head/isocirc.bed.five.site.exon.gtf /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output_no_head/Homo_sapiens.GRCh38.104.gtf.five.site.exon.gtf output_no_head/isocirc.bed.five.site.gene.out
== 12:08:03-Jul-12-2021 == [itst_gtf_gtf] itst_gtf_gtf output_no_head/isocirc.bed.three.site.exon.gtf /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output_no_head/Homo_sapiens.GRCh38.104.gtf.three.site.exon.gtf output_no_head/isocirc.bed.three.site.gene.out
== 12:08:09-Jul-12-2021 == [gtf2gene] gtf2gene output_no_head/isocirc.bed.exon.gtf /mnt/home2/guans/ref_genome/hg38/CircRNA_reference/GRCh38/Homo_sapiens.GRCh38.104.gtf output_no_head/isocirc.bed.ovlp.gene.out
Traceback (most recent call last):
File "/home/guans/bin/anaconda3/bin/isocirc", line 219, in
Thanks!
Can you show me a few lines of output_no_head/isocirc.bed.ovlp.gene.out?
Here are a few lines of the file:
isocirc0 ENSG00000230021 -
isocirc10000 ENSG00000204390 HSPA1L -
isocirc10001 ENSG00000204371 EHMT2 -
isocirc10002 ENSG00000213676 ATF6B -
isocirc10003 ENSG00000213676 ATF6B -
isocirc10004 ENSG00000223501 VPS52 -
isocirc10005 ENSG00000124493 GRM4 -
isocirc10006 ENSG00000124493 GRM4 -
isocirc10007 ENSG00000124493 GRM4 -
isocirc10008 ENSG00000124493 GRM4 -
isocirc10009 ENSG00000270800 RPS10-NUDT3 -
isocirc10009 ENSG00000272325 NUDT3 -
isocirc1000 ENSG00000143473 KCNH1 -
isocirc1000 ENSG00000283952 -
isocirc1000 ENSG00000284299 -
isocirc10010 ENSG00000124507 PACSIN1 +
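In the excerpt above, some records carry a gene name between the Ensembl gene ID and the strand (for example HSPA1L), and some do not (for example ENSG00000230021). The sketch below flags the records without a name. The column layout is inferred from the excerpt only, not from isoCirc's documentation, and the function name is hypothetical. It also assumes one record per line, as in the file on disk.

```python
def records_missing_gene_name(lines):
    """Return (isoform_id, gene_id) pairs for records that have no gene name.

    Assumed layout (inferred from the excerpt): isoform_id, gene_id,
    optional gene_name, strand, separated by whitespace.
    """
    missing = []
    for line in lines:
        fields = line.split()
        # Three fields ending in a strand symbol: the name column is empty.
        if len(fields) == 3 and fields[2] in ("+", "-"):
            missing.append((fields[0], fields[1]))
    return missing


excerpt = [
    "isocirc0 ENSG00000230021 -",
    "isocirc10000 ENSG00000204390 HSPA1L -",
    "isocirc1000 ENSG00000283952 -",
]
print(records_missing_gene_name(excerpt))
# [('isocirc0', 'ENSG00000230021'), ('isocirc1000', 'ENSG00000283952')]
```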
Seems like some of the genes in your GTF file do not have a gene name. Can you type in "grep ENSG00000230021 Homo_sapiens.GRCh38.104.gtf" and paste the output here?
1 havana transcript 720053 724564 . - . gene_id "ENSG00000230021"; gene_version "10"; transcript_id "ENST00000447954"; transcript_version "2"; gene_source "havana"; gene_biotype "transcribed_processed_pseudogene"; transcript_source "havana"; transcript_biotype "processed_transcript"; transcript_support_level "2 (assigned to previous version 1)";
1 havana exon 724358 724564 . - . gene_id "ENSG00000230021"; gene_version "10"; transcript_id "ENST00000447954"; transcript_version "2"; exon_number "1"; gene_source "havana"; gene_biotype "transcribed_processed_pseudogene"; transcript_source "havana"; transcript_biotype "processed_transcript"; exon_id "ENSE00001688006"; exon_version "2"; transcript_support_level "2 (assigned to previous version 1)";
1 havana exon 720053 720200 . - . gene_id "ENSG00000230021"; gene_version "10"; transcript_id "ENST00000447954"; transcript_version "2"; exon_number "2"; gene_source "havana"; gene_biotype "transcribed_processed_pseudogene"; transcript_source "havana"; transcript_biotype "processed_transcript"; exon_id "ENSE00001675630"; exon_version "2"; transcript_support_level "2 (assigned to previous version 1)";
[guans@login-0-1 GRCh38]$ grep ENSG00000230021 Homo_sapiens.GRCh38.104.gtf
1 havana gene 586071 827796 . - . gene_id "ENSG00000230021"; gene_version "10"; gene_source "havana"; gene_biotype "transcribed_processed_pseudogene";
1 havana transcript 586071 612813 . - . gene_id "ENSG00000230021"; gene_version "10"; transcript_id "ENST00000634833"; transcript_version "2"; gene_source "havana"; gene_biotype "transcribed_processed_pseudogene"; transcript_source "havana"; transcript_biotype "processed_transcript"; tag "basic"; transcript_support_level "5 (assigned to previous version 1)";
1 havana exon 612741 612813 . - . gene_id "ENSG00000230021"; gene_version "10"; transcript_id "ENST00000634833"; transcript_version "2"; exon_number "1"; gene_source "havana"; gene_biotype "transcribed_processed_pseudogene"; transcript_source "havana"; transcript_biotype "processed_transcript"; exon_id "ENSE00003812707"; exon_version "1"; tag "basic"; transcript_support_level "5 (assigned to previous version 1)";
1 havana exon 607955 608056 . - . gene_id "ENSG00000230021"; gene_version "10"; transcript_id "ENST00000634833"; transcript_version "2"; exon_number "2"; gene_source "havana"; gene_biotype "transcribed_processed_pseudogene"; transcript_source "havana"; transcript_biotype "processed_transcript"; exon_id "ENSE00001718533"; exon_version "1"; tag "basic"; transcript_support_level "5 (assigned to previous version 1)";
I see. Some genes in your GTF file have no "gene_name" attribute, which is why isoCirc hit an error.
I just updated the related script. You can try the latest version of isoCirc (v1.0.4), it should work now.
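To see how many genes are affected, the GTF can be scanned for "gene" records that lack a gene_name attribute. The following is a rough sketch, assuming the usual 9-column, tab-separated GTF layout with key "value"; pairs in the last column. The function name and regular expression are illustrative and are not taken from isoCirc's code.

```python
import re

# Matches GTF attribute pairs of the form: key "value";
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def genes_without_name(gtf_lines):
    """Return gene_id values of 'gene' records that have no gene_name attribute."""
    missing = []
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != "gene":
            continue
        attrs = dict(ATTR_RE.findall(cols[8]))
        if "gene_name" not in attrs:
            missing.append(attrs.get("gene_id", ""))
    return missing


# First record: the gene line from the grep output above, with tabs restored.
# Second record: placeholder coordinates, for illustration only.
sample = [
    '1\thavana\tgene\t586071\t827796\t.\t-\t.\tgene_id "ENSG00000230021"; '
    'gene_version "10"; gene_source "havana"; '
    'gene_biotype "transcribed_processed_pseudogene";\n',
    '6\thavana\tgene\t1\t2\t.\t-\t.\tgene_id "ENSG00000204390"; '
    'gene_name "HSPA1L";\n',
]
print(genes_without_name(sample))  # ['ENSG00000230021']
```

On the real annotation, pass the open file object (for example open("Homo_sapiens.GRCh38.104.gtf")) instead of the sample list.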
I tried v1.0.4 and still encountered some errors.
"== 12:24:05-Jul-15-2021 == [read_wise_eval] Generating read-wise evaluation result ...
== 12:37:01-Jul-15-2021 == [high_quality] 100000 high mapping quality BAM records have been processed ...
== 12:43:43-Jul-15-2021 == [read_wise_eval] Generating read-wise evaluation result done!
== 12:43:43-Jul-15-2021 == [filter_circRNA_read] Filtering back-splice-junctions ...
== 12:43:47-Jul-15-2021 == [filter_circRNA_read] Filtering back-splice-junctions done!
== 12:43:47-Jul-15-2021 == [rescue_reads] Rescuing reads using reliable back-splice-junctions ...
== 12:43:52-Jul-15-2021 == [rescue_reads] Rescuing reads using reliable back-splice-junctions done!
== 12:43:52-Jul-15-2021 == [uniq_isoform_with_unsorted_coors] Generating isoform-wise evaluation result ...
== 12:43:53-Jul-15-2021 == [uniq_isoform_with_unsorted_coors] Generating isoform-wise evaluation result done!
== 12:43:53-Jul-15-2021 == [bed2exonGtf] bed2exonGtf output_104/isocirc.bed output_104/isocirc.bed.exon.gtf
== 12:43:56-Jul-15-2021 == [exonGtf] awk -v OFS="\t" '($3=="exon"){print}' /mnt/home2/guans/ref_genome/hg38/CircRNA_reference/GRCh38/Homo_sapiens.GRCh38.104.gtf > /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output_104/Homo_sapiens.GRCh38.104.gtf.exon.gtf
== 12:44:39-Jul-15-2021 == [gtf2bed] awk -v OFS="\t" '($3=="gene"){print $1,$4-1,$5}' /mnt/home2/guans/ref_genome/hg38/CircRNA_reference/GRCh38/Homo_sapiens.GRCh38.104.gtf > /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output_104/Homo_sapiens.GRCh38.104.gtf.gene.bed
== 12:45:26-Jul-15-2021 == [gtf2bed] awk -v OFS="\t" '($3=="CDS"){print}' /mnt/home2/guans/ref_genome/hg38/CircRNA_reference/GRCh38/Homo_sapiens.GRCh38.104.gtf > /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output_104/Homo_sapiens.GRCh38.104.gtf.cds.gtf
== 12:46:09-Jul-15-2021 == [gtf2bed] awk -v OFS="\t" '($3=="UTR" || $3=="five_prime_utr" || $3=="three_prime_utr"){print}' /mnt/home2/guans/ref_genome/hg38/CircRNA_reference/GRCh38/Homo_sapiens.GRCh38.104.gtf > /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output_104/Homo_sapiens.GRCh38.104.gtf.utr.gtf
== 12:46:53-Jul-15-2021 == [gtf2bed] awk -v OFS="\t" '($3=="exon" && ($0 ~ /gene_biotype "lincRNA"/ || $0 ~ /gene_type "lincRNA"/)){print}' /mnt/home2/guans/ref_genome/hg38/CircRNA_reference/GRCh38/Homo_sapiens.GRCh38.104.gtf > /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output_104/Homo_sapiens.GRCh38.104.gtf.lincRNA.gtf
== 12:47:42-Jul-15-2021 == [gtf2bed] awk -v OFS="\t" '($3=="exon" && ($0 ~ /gene_biotype "antisense"/ || $0 ~ /gene_type "antisense"/)){print}' /mnt/home2/guans/ref_genome/hg38/CircRNA_reference/GRCh38/Homo_sapiens.GRCh38.104.gtf > /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output_104/Homo_sapiens.GRCh38.104.gtf.antisense.gtf
== 12:48:30-Jul-15-2021 == [gtf2bed] awk -v OFS="\t" '($3=="exon" && ($0 ~ /gene_biotype "rRNA"/ || $0 ~ /gene_type "rRNA"/)){print}' /mnt/home2/guans/ref_genome/hg38/CircRNA_reference/GRCh38/Homo_sapiens.GRCh38.104.gtf > /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output_104/Homo_sapiens.GRCh38.104.gtf.rRNA.gtf
== 12:55:15-Jul-15-2021 == [bed2exonGtf] bed2exonGtf output_104/isocirc.bed.five.site.bed output_104/isocirc.bed.five.site.exon.gtf
== 12:55:19-Jul-15-2021 == [bed2exonGtf] bed2exonGtf output_104/isocirc.bed.three.site.bed output_104/isocirc.bed.three.site.exon.gtf
== 12:55:22-Jul-15-2021 == [bed2exonGtf] bed2exonGtf /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output_104/Homo_sapiens.GRCh38.104.gtf.five.site.bed /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output_104/Homo_sapiens.GRCh38.104.gtf.five.site.exon.gtf
== 12:55:35-Jul-15-2021 == [bed2exonGtf] bed2exonGtf /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output_104/Homo_sapiens.GRCh38.104.gtf.three.site.bed /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output_104/Homo_sapiens.GRCh38.104.gtf.three.site.exon.gtf
== 12:55:46-Jul-15-2021 == [itst_gtf_gtf] itst_gtf_gtf output_104/isocirc.bed.five.site.exon.gtf /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output_104/Homo_sapiens.GRCh38.104.gtf.five.site.exon.gtf output_104/isocirc.bed.five.site.gene.out
== 12:56:12-Jul-15-2021 == [itst_gtf_gtf] itst_gtf_gtf output_104/isocirc.bed.three.site.exon.gtf /mnt/home2/guans/circRNA/Nanopore/RCA_totalRNAonly/fastq_pass/output_104/Homo_sapiens.GRCh38.104.gtf.three.site.exon.gtf output_104/isocirc.bed.three.site.gene.out
== 12:56:38-Jul-15-2021 == [gtf2gene] gtf2gene output_104/isocirc.bed.exon.gtf /mnt/home2/guans/ref_genome/hg38/CircRNA_reference/GRCh38/Homo_sapiens.GRCh38.104.gtf output_104/isocirc.bed.ovlp.gene.out
Traceback (most recent call last):
File "/home/guans/bin/anaconda3/bin/isocirc", line 219, in
Do you mind sharing where you downloaded your .fa, .gtf and .bed files? Thanks!
I tried to run isoCirc with the test data, and it worked great. However, when I ran isoCirc on my own data, it did not generate isocirc.out, isocirc_stats.out or isocirc.bed. I downloaded the FASTA file from Ensembl (http://ftp.ensembl.org/pub/release-104/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz ) and the GTF file also from Ensembl (http://ftp.ensembl.org/pub/release-104/gtf/homo_sapiens/Homo_sapiens.GRCh38.104.gtf.gz). The circRNA BED file was downloaded from http://circatlas.biols.ac.cn/.
The output directory contains the following files: cons.fa cons.fa.sam high.bam Homo_sapiens.GRCh38.104.gtf.gene_pred TotalRNAonly.fa.len cons.fa.fai cons.info Homo_sapiens.GRCh38.104.gtf.bed low.bam trf.out
Thanks for help!
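A small sketch for checking which of the three result files named above are absent after a run. The helper name is hypothetical, and the list of expected files is taken only from this post, not from isoCirc's documentation.

```python
from pathlib import Path

# Result files named in the post above; not an exhaustive list of isoCirc outputs.
EXPECTED = ("isocirc.out", "isocirc_stats.out", "isocirc.bed")


def missing_outputs(out_dir, expected=EXPECTED):
    """Return the expected file names that do not exist in out_dir."""
    out_dir = Path(out_dir)
    return [name for name in expected if not (out_dir / name).is_file()]


if __name__ == "__main__":
    # 'output2' is the output directory used in the first log above.
    print(missing_outputs("output2"))
```

An empty list means all three files were written; any names returned indicate that the run stopped before producing them.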