Open sidizhao opened 1 year ago
Based on this log, isoCirc is trying to get "exon_number" from "/storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/isocirc.bed.exon.out", which is expected to be like this:
chr16 isocirc exon 66625 66738 . + . gene_id "isocirc0"; transcript_id "isocirc0"; exon_number "1"; exon_id "isocirc0.1"; chr16 havana exon 66537 66738 . - . gene_id "ENSG00000234769"; gene_version "4"; transcript_id "ENST00000326592"; transcript_version "9"; exon_number "6"; gene_name "WASH4P"; gene_source "havana"; gene_biotype "protein_coding"; transcript_name "WASH4P-001"; transcript_source "havana"; transcript_biotype "protein_coding"; havana_transcript "OTTHUMT00000133175"; havana_transcript_version "2"; exon_id "ENSE00001686309"; exon_version "1"; tag "basic";
gene_type is not required.
Here's that file:
output_0$ head isocirc.bed.exon.out
chr1 isocirc exon 233198939 233199030 . - . gene_id "isocirc0"; transcript_id "isocirc0"; exon_number "1"; exon_id "isocirc0.1"; chr1 ensGene exon 233198939 233199030 . - . gene_id "ENSG00000135749"; transcript_id "ENST00000258229"; exon_number "20"; exon_id "ENST00000258229.20"; gene_name "ENSG00000135749";
chr1 isocirc exon 233198939 233199030 . - . gene_id "isocirc0"; transcript_id "isocirc0"; exon_number "1"; exon_id "isocirc0.1"; chr1 ensGene exon 233198939 233199030 . - . gene_id "ENSG00000135749"; transcript_id "ENST00000462233"; exon_number "19"; exon_id "ENST00000462233.19"; gene_name "ENSG00000135749";
chr1 isocirc exon 233198939 233199030 . - . gene_id "isocirc0"; transcript_id "isocirc0"; exon_number "1"; exon_id "isocirc0.1"; chr1 ensGene exon 233198939 233199030 . - . gene_id "ENSG00000135749"; transcript_id "ENST00000475463"; exon_number "8"; exon_id "ENST00000475463.8"; gene_name "ENSG00000135749";
chr1 isocirc exon 233198939 233199030 . - . gene_id "isocirc0"; transcript_id "isocirc0"; exon_number "1"; exon_id "isocirc0.1"; chr1 ensGene exon 233198939 233199030 . - . gene_id "ENSG00000135749"; transcript_id "ENST00000488780"; exon_number "7"; exon_id "ENST00000488780.7"; gene_name "ENSG00000135749";
chr1 isocirc exon 233198939 233199030 . - . gene_id "isocirc0"; transcript_id "isocirc0"; exon_number "1"; exon_id "isocirc0.1"; chr1 ensGene exon 233198939 233199030 . - . gene_id "ENSG00000135749"; transcript_id "ENST00000430153"; exon_number "7"; exon_id "ENST00000430153.7"; gene_name "ENSG00000135749";
chr1 isocirc exon 233198939 233199030 . - . gene_id "isocirc0"; transcript_id "isocirc0"; exon_number "1"; exon_id "isocirc0.1"; chr1 ensGene exon 233198967 233199030 . - . gene_id "ENSG00000135749"; transcript_id "ENST00000518351"; exon_number "1"; exon_id "ENST00000518351.1"; gene_name "ENSG00000135749";
chr1 isocirc exon 233198939 233199030 . - . gene_id "isocirc0"; transcript_id "isocirc0"; exon_number "1"; exon_id "isocirc0.1"; chr1 ensGene exon 233199020 233199030 . - . gene_id "ENSG00000135749"; transcript_id "ENST00000517808"; exon_number "1"; exon_id "ENST00000517808.1"; gene_name "ENSG00000135749";
chr1 isocirc exon 233198939 233199030 . - . gene_id "isocirc0"; transcript_id "isocirc0"; exon_number "1"; exon_id "isocirc0.1"; chr1 knownGene exon 233198939 233199030 . - . gene_id "A6NKB5"; transcript_id "ENST00000258229.14"; exon_number "20"; exon_id "ENST00000258229.14.20"; gene_name "A6NKB5";
chr1 isocirc exon 233198939 233199030 . - . gene_id "isocirc0"; transcript_id "isocirc0"; exon_number "1"; exon_id "isocirc0.1"; chr1 knownGene exon 233198939 233199030 . - . gene_id "H0YB15"; transcript_id "ENST00000462233.5"; exon_number "19"; exon_id "ENST00000462233.5.19"; gene_name "H0YB15";
chr1 isocirc exon 233198939 233199030 . - . gene_id "isocirc0"; transcript_id "isocirc0"; exon_number "1"; exon_id "isocirc0.1"; chr1 knownGene exon 233198939 233199030 . - . gene_id "H0YBF4"; transcript_id "ENST00000475463.6"; exon_number "8"; exon_id "ENST00000475463.6.8"; gene_name "H0YBF4";
Also, you mentioned this ran successfully until the recent update to v1.0.6. That is odd, because nothing related to this part has changed.
So I concatenated more custom entries to the GTF I used for this run, which I checked to have exon_number and exon_id in those entries as well. I am quite confused as well.
I see, but for this error to occur there must be some lines that have no "exon_number".
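One way to find such lines is to scan the annotation for exon records whose attribute column lacks the keys in question. The sketch below is only illustrative: the function name is hypothetical, it assumes a standard tab-delimited GTF with nine columns, and it assumes that "exon_number" and "exon_id" are the two attributes that matter, which is what this thread suggests rather than something confirmed from the isoCirc source.

```python
REQUIRED = ("exon_number", "exon_id")  # assumed from this thread, not from isoCirc's code


def exon_lines_missing_attrs(lines, required=REQUIRED):
    """Yield (line_number, missing_keys) for exon records lacking a required attribute."""
    for line_no, line in enumerate(lines, 1):
        if line.startswith("#"):
            continue  # skip GTF header/comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue  # only exon records are checked
        # Attribute column looks like: key "value"; key "value"; ...
        keys = {part.split()[0] for part in fields[8].split(";") if part.strip()}
        missing = [key for key in required if key not in keys]
        if missing:
            yield line_no, missing
```

Passing an open file handle for the GTF as `lines` and printing the results lists every exon record that would need fixing.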
I think I know where the problem is. So I looked at the exact same circRNA, isocirc1 which was detected in both the old run and the new run. From the old isocirc.out:
isocirc1 chr10 34422057 34422259 NA NA NA 1 202 0 202 N NA False,False NA NA NA False False True +GT/AG True NNC FSM NA NA NA NA NA NA NA NA NA 1 m64043_220730_094118/30345339/ccs
In this case, it seems like it didn't really get a successful annotation but outputted the file anyway. Is there a particular reason why this new run isn't doing the same? I'm looking at the intermediate files of the new run:
$ cat isocirc.bed.ovlp.gene.out
isocirc1 G009115 G009115 +
I think G009115 is one of the newer transcripts I added on, which means in the old run it wasn't getting recognized. By searching through the new annotation:
$ grep G009115 hg38_with_maher_lab_lncrna.gtf
chr10 mitranscriptome gene 34417023 34459184 . + . gene_id "G009115"; gene_name "Unknown"
chr10 mitranscriptome transcript 34417023 34436597 . + . transcript_id "T039819"; gene_id "G009115"; transcript_name "Unknown"; gene_name "Unknown"
chr10 mitranscriptome exon 34417023 34417308 . + . transcript_id "T039819"; gene_id "G009115"; transcript_name "Unknown"; gene_name "Unknown"
chr10 mitranscriptome transcript 34417023 34459184 . + . transcript_id "T039820"; gene_id "G009115"; transcript_name "Unknown"; gene_name "Unknown"
chr10 mitranscriptome exon 34417023 34417308 . + . transcript_id "T039820"; gene_id "G009115"; transcript_name "Unknown"; gene_name "Unknown"
chr10 mitranscriptome exon 34435248 34436597 . + . transcript_id "T039819"; gene_id "G009115"; transcript_name "Unknown"; gene_name "Unknown"
chr10 mitranscriptome exon 34458756 34459184 . + . transcript_id "T039820"; gene_id "G009115"; transcript_name "Unknown"; gene_name "Unknown"
Do you think if I added exon number and exon id to these transcripts, it'll rectify the problem?
Yes, you should try that.
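For anyone hitting the same problem, here is a rough sketch of how the missing attributes could be added. It is not part of isoCirc, and the function name is made up for illustration. It assumes a tab-delimited GTF, numbers exons per transcript in the order they appear in the file (for minus-strand transcripts you may want transcription order instead), and builds exon_id as "<transcript_id>.<exon_number>", mirroring the ensGene entries shown earlier in this thread.

```python
import re
from collections import defaultdict


def add_exon_attrs(lines):
    """Return GTF lines with exon_number and exon_id appended where they are missing."""
    seen = defaultdict(int)  # exons seen so far, per transcript_id
    out = []
    for line in lines:
        line = line.rstrip("\n")
        fields = line.split("\t")
        if line.startswith("#") or len(fields) < 9 or fields[2] != "exon":
            out.append(line)  # non-exon records pass through unchanged
            continue
        match = re.search(r'transcript_id "([^"]+)"', fields[8])
        if match is None:
            out.append(line)  # cannot number an exon without a transcript
            continue
        tid = match.group(1)
        seen[tid] += 1
        # Reuse an existing exon_number if present, otherwise use file order.
        existing = re.search(r'exon_number "?(\d+)"?', fields[8])
        number = int(existing.group(1)) if existing else seen[tid]
        extra = []
        if existing is None:
            extra.append(f'exon_number "{number}"')
        if "exon_id " not in fields[8]:
            extra.append(f'exon_id "{tid}.{number}"')
        if extra:
            attrs = fields[8].rstrip().rstrip(";")
            fields[8] = attrs + "; " + "; ".join(extra) + ";"
        out.append("\t".join(fields))
    return out
```

Writing the returned lines to a new file (rather than overwriting the original GTF) makes it easy to diff the two before rerunning.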
Resolved. Thank you! Is there a way to make the short read correct step a separate command? My computer cluster has a hard time running the entire process at once, so I typically end up having to break long_corrected.fa into smaller files and redo the isocirc command without the correction.
You can run lordec (or any long-read correction tool) separately if you have matched short-read data to correct the long-read data, and then use the corrected long reads as input.
Okay I'll keep that in mind.
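The splitting step mentioned above can be scripted. A minimal sketch, assuming a plain (uncompressed) FASTA and the "<prefix>.<index>.fa" naming seen in the file names in this thread; the function name and the records-per-file parameter are illustrative and not part of isoCirc:

```python
from pathlib import Path


def split_fasta(fasta_path, out_prefix, records_per_file):
    """Split a FASTA file into chunks of at most records_per_file records each."""
    if records_per_file < 1:
        raise ValueError("records_per_file must be at least 1")
    paths = []
    out = None
    count = 0
    with open(fasta_path) as handle:
        for line in handle:
            if line.startswith(">"):
                # Start a new chunk file every records_per_file headers.
                if count % records_per_file == 0:
                    if out is not None:
                        out.close()
                    path = Path(f"{out_prefix}.{len(paths)}.fa")
                    out = open(path, "w")
                    paths.append(path)
                count += 1
            if out is not None:
                out.write(line)  # header or sequence line of the current record
    if out is not None:
        out.close()
    return paths
```

Each returned path can then be passed to a separate isocirc run with its own output directory.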
Actually I just ran into some small problems. Since I've broken the fasta file up, some of the smaller files aren't finishing the job, whereas some of them did finish and produced results. Here's one example, and it seems that it just gets cut off after [read_wise_eval] started. Is this normal?
Matplotlib created a temporary config/cache directory at /tmp/995513.tmpdir/matplotlib-gp59czes because the default path (/home/s.zhao/.config/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.
== 11:25:49-Feb-10-2023 == [check_dependencies] Checking dependencies ...
== 11:25:49-Feb-10-2023 == [check_dependencies] Checking dependencies done!
== 11:25:49-Feb-10-2023 == [Tandem-Repeats-Finder] Finding tandem repeats with TRF ...
== 11:25:50-Feb-10-2023 == [Tandem Repeats Finder] trf409.legacylinux64 /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_cell_lines/hct116_FAR20705_nanopore/hct116_FAR20705_nanopore_long_corrected.16.fa 2 7 7 80 10 100 2000 -h -ngs > /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_cell_lines/hct116_FAR20705_nanopore/isocirc_output_short_read/output_16/trf.out
== 13:09:15-Feb-10-2023 == [Tandem-Repeats-Finder] Finding tandem repeats with TRF done!
== 13:09:15-Feb-10-2023 == [Mapping] Mapping consensus sequence to genome ...
== 13:09:15-Feb-10-2023 == [Mapping] minimap2 -ax splice -ub --MD --eqx /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/hct116_pacbio/annotation/all-chrs.fa /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_cell_lines/hct116_FAR20705_nanopore/isocirc_output_short_read/output_16/cons.fa -t 1 > /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_cell_lines/hct116_FAR20705_nanopore/isocirc_output_short_read/output_16/cons.fa.sam
[M::mm_idx_gen::107.0280.99] collected minimizers
[M::mm_idx_gen::188.9410.99] sorted minimizers
[M::main::188.9490.99] loaded/built the index for 455 target sequence(s)
[M::mm_mapopt_update::192.5350.99] mid_occ = 792
[M::mm_idx_stat] kmer size: 15; skip: 5; is_hpc: 0; #seq: 455
[M::mm_idx_stat::194.8580.99] distinct minimizers: 167291034 (34.68% are singletons); average occurrences: 6.239; average spacing: 3.075
[M::worker_pipeline::640.6560.99] mapped 101445 sequences
[M::main] Version: 2.17-r941
[M::main] CMD: minimap2 -ax splice -ub --MD --eqx -t 1 /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/hct116_pacbio/annotation/all-chrs.fa /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_cell_lines/hct116_FAR20705_nanopore/isocirc_output_short_read/output_16/cons.fa
[M::main] Real time: 642.341 sec; CPU: 636.824 sec; Peak RSS: 18.981 GB
== 13:19:57-Feb-10-2023 == [Mapping] Mapping consensus sequence to genome done!
== 13:19:57-Feb-10-2023 == [Classifying] Classifying consensus alignment ...
== 13:19:57-Feb-10-2023 == [classify_bam_core] Processing /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_cell_lines/hct116_FAR20705_nanopore/isocirc_output_short_read/output_16/cons.fa.sam ...
== 13:19:59-Feb-10-2023 == [classify_bam_core] 100000 BAM records done ...
== 13:20:01-Feb-10-2023 == [classify_bam_core] Processing /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_cell_lines/hct116_FAR20705_nanopore/isocirc_output_short_read/output_16/cons.fa.sam done.
== 13:20:01-Feb-10-2023 == [Classifying] Classifying consensus alignment done!
== 13:20:03-Feb-10-2023 == [gtfToGenePred] gtfToGenePred -ignoreGroupsWithoutExons /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/annotation/hg38_with_maher_lab_lncrna.with.exon.id.gtf /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_cell_lines/hct116_FAR20705_nanopore/isocirc_output_short_read/output_16/hg38_with_maher_lab_lncrna.with.exon.id.gtf.gene_pred
== 13:20:27-Feb-10-2023 == [genePredToBed] genePredToBed /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_cell_lines/hct116_FAR20705_nanopore/isocirc_output_short_read/output_16/hg38_with_maher_lab_lncrna.with.exon.id.gtf.gene_pred /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_cell_lines/hct116_FAR20705_nanopore/isocirc_output_short_read/output_16/hg38_with_maher_lab_lncrna.with.exon.id.gtf.bed
== 13:20:30-Feb-10-2023 == [get_transcript_from_gtf] Loading transcript from /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/annotation/hg38_with_maher_lab_lncrna.with.exon.id.gtf ...
== 13:20:44-Feb-10-2023 == [get_transcript_from_gtf] Loading transcript from /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/annotation/hg38_with_maher_lab_lncrna.with.exon.id.gtf done!
== 13:20:44-Feb-10-2023 == [get_splice_site_from_bed12] Loading splice site from /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_cell_lines/hct116_FAR20705_nanopore/isocirc_output_short_read/output_16/hg38_with_maher_lab_lncrna.with.exon.id.gtf.bed ...
[E::idx_find_and_load] Could not retrieve index file for '/storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_cell_lines/hct116_FAR20705_nanopore/isocirc_output_short_read/output_16/high.bam'
== 13:20:52-Feb-10-2023 == [get_splice_site_from_bed12] Loading splice site from /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_cell_lines/hct116_FAR20705_nanopore/isocirc_output_short_read/output_16/hg38_with_maher_lab_lncrna.with.exon.id.gtf.bed done!
== 13:20:52-Feb-10-2023 == [get_splice_junction_from_bed12] Loading splice junction from /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_cell_lines/hct116_FAR20705_nanopore/isocirc_output_short_read/output_16/hg38_with_maher_lab_lncrna.with.exon.id.gtf.bed ...
[E::idx_find_and_load] Could not retrieve index file for '/storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_cell_lines/hct116_FAR20705_nanopore/isocirc_output_short_read/output_16/high.bam'
== 13:20:58-Feb-10-2023 == [get_splice_junction_from_bed12] Loading splice junction from /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_cell_lines/hct116_FAR20705_nanopore/isocirc_output_short_read/output_16/hg38_with_maher_lab_lncrna.with.exon.id.gtf.bed done!
== 13:20:58-Feb-10-2023 == [get_exon_from_bed12] Loading exon from /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_cell_lines/hct116_FAR20705_nanopore/isocirc_output_short_read/output_16/hg38_with_maher_lab_lncrna.with.exon.id.gtf.bed ...
[E::idx_find_and_load] Could not retrieve index file for '/storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_cell_lines/hct116_FAR20705_nanopore/isocirc_output_short_read/output_16/high.bam'
== 13:21:06-Feb-10-2023 == [get_exon_from_bed12] Loading exon from /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_cell_lines/hct116_FAR20705_nanopore/isocirc_output_short_read/output_16/hg38_with_maher_lab_lncrna.with.exon.id.gtf.bed done!
== 13:21:06-Feb-10-2023 == [get_back_splice_junction_from_bed] Loading splice junction from /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/annotation/HCT116_short_read_annotation.bed ...
[E::idx_find_and_load] Could not retrieve index file for '/storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_cell_lines/hct116_FAR20705_nanopore/isocirc_output_short_read/output_16/high.bam'
== 13:21:06-Feb-10-2023 == [get_back_splice_junction_from_bed] Loading splice junction from /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/annotation/HCT116_short_read_annotation.bed done!
[E::idx_find_and_load] Could not retrieve index file for '/storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_cell_lines/hct116_FAR20705_nanopore/isocirc_output_short_read/output_16/high.bam'
[E::idx_find_and_load] Could not retrieve index file for '/storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_cell_lines/hct116_FAR20705_nanopore/isocirc_output_short_read/output_16/low.bam'
== 13:21:08-Feb-10-2023 == [read_wise_eval] Generating read-wise evaluation result ...
Traceback (most recent call last):
File "/usr/local/bin/miniconda3/bin/isocirc", line 219, in
This actually looks very weird. Can you upload your data here (both the long reads and the annotation file) so that I can try to track down this error?
Please merge the gtf files, as GitHub has a size limit. Thank you for being so patient with me!
github_debug00.gtf.gz github_debug01.gtf.gz github_debug06.gtf.gz github_debug05.gtf.gz github_debug04.gtf.gz github_debug03.gtf.gz github_debug02.gtf.gz
I tried again with the newest push (pip install isocirc==1.0.6a0) and the same error persists for this file.
Hi, just to follow up on this issue. Were you able to take a look at what could've potentially triggered this error?
Which circRNA bed file did you use as input?
I don't see the error msg with this command:
isocirc /home/gaoy1/sdata/isocirc_debug/hct116_FAR20705_nanopore_long_corrected.14.fa /home/gaoy1/data/genome/hg38/hg38.fa /home/gaoy1/sdata/isocirc_debug/debug.gtf /home/gaoy1/sdata/isocirc_debug/HCT116_short_read_annotation.bed /home/gaoy1/sdata/isocirc_debug/output -t32
Seems very weird.
Yeah I don't quite understand why it would generate an error because other parts of the fasta file have successfully completed running. Do you have an inkling of why that specific "UnboundLocalError: local variable 'op' referenced before assignment" would happen? In the meantime I will also try to ask the IT people maintaining the cluster and see if it's on our end.
Can you try to re-install isocirc from the latest source (not the pip install) and re-run it on this dataset? I added an error message related to this error.
Alright I'll get back to you. I've been running it on a docker image I built. Will change pip to git and try again.
== 22:04:10-Feb-15-2023 == [read_wise_eval] Generating read-wise evaluation result ...
== 22:04:10-Feb-15-2023 == [get_cigar_from_pairwise_res] Unexpected alignment string: target TCATAAAACGTTACTTAAAA 0.
It now shows this.
I tried to look for "TCATAAAACGTTACTTAAAA" in any of the intermediate files and it's not showing up.
Can you try pip show biopython?
Seems like you are using the old version of biopython.
Name: biopython
Version: 1.78
The new version requires biopython >= 1.79. This is why the error comes up.
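A quick way to check this from inside the environment that actually runs isocirc (for example, inside the container) is sketched below. The helper names are hypothetical; only `importlib.metadata` from the standard library is used, and the 1.79 minimum is taken from the comment above.

```python
import re
from importlib import metadata


def version_tuple(version):
    """Turn '1.79' into (1, 79); stops at the first non-numeric component."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """True if the installed version string is at least the minimum version."""
    return version_tuple(installed) >= version_tuple(minimum)


def installed_version(package):
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_biopython(minimum="1.79"):
    """Report whether the biopython seen by this interpreter is recent enough."""
    found = installed_version("biopython")
    if found is None:
        return "biopython is not installed"
    status = "OK" if meets_minimum(found, minimum) else "too old"
    return f"biopython {found}: {status} (need >= {minimum})"
```

Calling `print(check_biopython())` with the same Python interpreter that isocirc uses shows which version is really being picked up. If it is too old, pinning the requirement explicitly, e.g. `pip install "biopython>=1.79"`, is one option.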
Should I specify that when I build the docker? The only ones I had installed other than isocirc were bedtools and minimap2.
I am not familiar with docker. Usually there should be no problem, since it is listed in the requirement.txt. You can try to re-install everything.
Yeah I think the docker image is still pulling the local 1.78 version for some reason. I'm working on fixing that. Hopefully this will fix everything.
Hi there,
I've been using isoCirc successfully for a while now but this week after installing v1.0.6, it seems to be generating some errors towards the end of the process. I am able to get isocirc.bed output but not isocirc.out. Here's the full error:
Matplotlib created a temporary config/cache directory at /tmp/977627.tmpdir/matplotlib-z2gis1si because the default path (/home/s.zhao/.config/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.
== 06:19:32-Feb-09-2023 == [check_dependencies] Checking dependencies ...
== 06:19:32-Feb-09-2023 == [check_dependencies] Checking dependencies done!
== 06:19:33-Feb-09-2023 == [Tandem-Repeats-Finder] Finding tandem repeats with TRF ...
== 06:19:33-Feb-09-2023 == [Tandem Repeats Finder] trf409.legacylinux64 /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/722_primary_pacbio_long_corrected.0.fa 2 7 7 80 10 100 2000 -h -ngs > /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/trf.out
== 14:35:27-Feb-09-2023 == [Tandem-Repeats-Finder] Finding tandem repeats with TRF done!
== 14:35:27-Feb-09-2023 == [Mapping] Mapping consensus sequence to genome ...
== 14:35:27-Feb-09-2023 == [Mapping] minimap2 -ax splice -ub --MD --eqx /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/hct116_pacbio/annotation/all-chrs.fa /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/cons.fa -t 1 > /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/cons.fa.sam
[M::mm_idx_gen::135.0830.90] collected minimizers
[M::mm_idx_gen::238.3200.89] sorted minimizers
[M::main::238.4270.89] loaded/built the index for 455 target sequence(s)
[M::mm_mapopt_update::243.5150.89] mid_occ = 792
[M::mm_idx_stat] kmer size: 15; skip: 5; is_hpc: 0; #seq: 455
[M::mm_idx_stat::245.9300.89] distinct minimizers: 167291034 (34.68% are singletons); average occurrences: 6.239; average spacing: 3.075
[M::worker_pipeline::1077.0800.85] mapped 94741 sequences
[M::main] Version: 2.17-r941
[M::main] CMD: minimap2 -ax splice -ub --MD --eqx -t 1 /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/hct116_pacbio/annotation/all-chrs.fa /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/cons.fa
[M::main] Real time: 1079.441 sec; CPU: 913.986 sec; Peak RSS: 18.981 GB
== 14:53:27-Feb-09-2023 == [Mapping] Mapping consensus sequence to genome done!
== 14:53:27-Feb-09-2023 == [Classifying] Classifying consensus alignment ...
== 14:53:27-Feb-09-2023 == [classify_bam_core] Processing /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/cons.fa.sam ...
== 14:53:31-Feb-09-2023 == [classify_bam_core] 100000 BAM records done ...
== 14:53:33-Feb-09-2023 == [classify_bam_core] Processing /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/cons.fa.sam done.
== 14:53:33-Feb-09-2023 == [Classifying] Classifying consensus alignment done!
== 14:53:35-Feb-09-2023 == [gtfToGenePred] gtfToGenePred -ignoreGroupsWithoutExons /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/annotation/hg38_with_maher_lab_lncrna.gtf /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/hg38_with_maher_lab_lncrna.gtf.gene_pred
== 14:55:19-Feb-09-2023 == [genePredToBed] genePredToBed /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/hg38_with_maher_lab_lncrna.gtf.gene_pred /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/hg38_with_maher_lab_lncrna.gtf.bed
== 14:56:10-Feb-09-2023 == [get_transcript_from_gtf] Loading transcript from /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/annotation/hg38_with_maher_lab_lncrna.gtf ...
== 14:56:31-Feb-09-2023 == [get_transcript_from_gtf] Loading transcript from /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/annotation/hg38_with_maher_lab_lncrna.gtf done!
== 14:56:31-Feb-09-2023 == [get_splice_site_from_bed12] Loading splice site from /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/hg38_with_maher_lab_lncrna.gtf.bed ...
[E::idx_find_and_load] Could not retrieve index file for '/storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/high.bam'
== 14:56:41-Feb-09-2023 == [get_splice_site_from_bed12] Loading splice site from /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/hg38_with_maher_lab_lncrna.gtf.bed done!
== 14:56:41-Feb-09-2023 == [get_splice_junction_from_bed12] Loading splice junction from /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/hg38_with_maher_lab_lncrna.gtf.bed ...
[E::idx_find_and_load] Could not retrieve index file for '/storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/high.bam'
== 14:56:49-Feb-09-2023 == [get_splice_junction_from_bed12] Loading splice junction from /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/hg38_with_maher_lab_lncrna.gtf.bed done!
== 14:56:49-Feb-09-2023 == [get_exon_from_bed12] Loading exon from /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/hg38_with_maher_lab_lncrna.gtf.bed ...
[E::idx_find_and_load] Could not retrieve index file for '/storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/high.bam'
== 14:56:57-Feb-09-2023 == [get_exon_from_bed12] Loading exon from /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/hg38_with_maher_lab_lncrna.gtf.bed done!
== 14:56:57-Feb-09-2023 == [get_back_splice_junction_from_bed] Loading splice junction from /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/annotation/patient_722_short_read_annotation.bed ...
[E::idx_find_and_load] Could not retrieve index file for '/storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/high.bam'
== 14:56:57-Feb-09-2023 == [get_back_splice_junction_from_bed] Loading splice junction from /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/annotation/patient_722_short_read_annotation.bed done!
[E::idx_find_and_load] Could not retrieve index file for '/storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/high.bam'
[E::idx_find_and_load] Could not retrieve index file for '/storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/low.bam'
== 14:56:58-Feb-09-2023 == [read_wise_eval] Generating read-wise evaluation result ...
== 14:58:02-Feb-09-2023 == [read_wise_eval] Generating read-wise evaluation result done!
== 14:58:02-Feb-09-2023 == [filter_circRNA_read] Filtering back-splice-junctions ...
== 14:58:03-Feb-09-2023 == [filter_circRNA_read] Filtering back-splice-junctions done!
== 14:58:03-Feb-09-2023 == [rescue_reads] Rescuing reads using reliable back-splice-junctions ...
== 14:58:03-Feb-09-2023 == [rescue_reads] Rescuing reads using reliable back-splice-junctions done!
== 14:58:03-Feb-09-2023 == [uniq_isoform_with_unsorted_coors] Generating isoform-wise evaluation result ...
== 14:58:03-Feb-09-2023 == [uniq_isoform_with_unsorted_coors] Generating isoform-wise evaluation result done!
== 14:58:13-Feb-09-2023 == [bed2exonGtf] bed2exonGtf /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/isocirc.bed /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/isocirc.bed.exon.gtf
== 14:58:18-Feb-09-2023 == [exonGtf] awk -v OFS="\t" '($3=="exon"){print}' /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/annotation/hg38_with_maher_lab_lncrna.gtf > /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/hg38_with_maher_lab_lncrna.gtf.exon.gtf
== 14:58:29-Feb-09-2023 == [gtf2bed] awk -v OFS="\t" '($3=="gene"){print $1,$4-1,$5}' /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/annotation/hg38_with_maher_lab_lncrna.gtf > /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/hg38_with_maher_lab_lncrna.gtf.gene.bed
== 14:58:43-Feb-09-2023 == [gtf2bed] awk -v OFS="\t" '($3=="CDS"){print}' /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/annotation/hg38_with_maher_lab_lncrna.gtf > /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/hg38_with_maher_lab_lncrna.gtf.cds.gtf
== 14:58:58-Feb-09-2023 == [gtf2bed] awk -v OFS="\t" '($3=="UTR" || $3=="five_prime_utr" || $3=="three_prime_utr"){print}' /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/annotation/hg38_with_maher_lab_lncrna.gtf > /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/hg38_with_maher_lab_lncrna.gtf.utr.gtf
== 14:59:11-Feb-09-2023 == [gtf2bed] awk -v OFS="\t" '($3=="exon" && ($0 ~ /gene_biotype "lincRNA"/ || $0 ~ /gene_type "lincRNA"/)){print}' /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/annotation/hg38_with_maher_lab_lncrna.gtf > /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/hg38_with_maher_lab_lncrna.gtf.lincRNA.gtf
== 14:59:29-Feb-09-2023 == [gtf2bed] awk -v OFS="\t" '($3=="exon" && ($0 ~ /gene_biotype "antisense"/ || $0 ~ /gene_type "antisense"/)){print}' /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/annotation/hg38_with_maher_lab_lncrna.gtf > /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/hg38_with_maher_lab_lncrna.gtf.antisense.gtf
== 14:59:51-Feb-09-2023 == [gtf2bed] awk -v OFS="\t" '($3=="exon" && ($0 ~ /gene_biotype "rRNA"/ || $0 ~ /gene_type "rRNA"/)){print}' /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/annotation/hg38_with_maher_lab_lncrna.gtf > /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/hg38_with_maher_lab_lncrna.gtf.rRNA.gtf
== 15:01:14-Feb-09-2023 == [bed2exonGtf] bed2exonGtf /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/isocirc.bed.five.site.bed /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/isocirc.bed.five.site.exon.gtf
== 15:01:23-Feb-09-2023 == [bed2exonGtf] bed2exonGtf /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/isocirc.bed.three.site.bed /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/isocirc.bed.three.site.exon.gtf
== 15:01:26-Feb-09-2023 == [bed2exonGtf] bed2exonGtf /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/hg38_with_maher_lab_lncrna.gtf.five.site.bed /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/hg38_with_maher_lab_lncrna.gtf.five.site.exon.gtf
== 15:01:47-Feb-09-2023 == [bed2exonGtf] bed2exonGtf /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/hg38_with_maher_lab_lncrna.gtf.three.site.bed /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/hg38_with_maher_lab_lncrna.gtf.three.site.exon.gtf
== 15:02:13-Feb-09-2023 == [itst_gtf_gtf] itst_gtf_gtf /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/isocirc.bed.five.site.exon.gtf /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/hg38_with_maher_lab_lncrna.gtf.five.site.exon.gtf /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/isocirc.bed.five.site.gene.out
== 15:02:18-Feb-09-2023 == [itst_gtf_gtf] itst_gtf_gtf /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/isocirc.bed.three.site.exon.gtf /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/hg38_with_maher_lab_lncrna.gtf.three.site.exon.gtf /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/isocirc.bed.three.site.gene.out
== 15:02:22-Feb-09-2023 == [gtf2gene] gtf2gene /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/isocirc.bed.exon.gtf /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/annotation/hg38_with_maher_lab_lncrna.gtf /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/isocirc.bed.ovlp.gene.out
== 15:02:46-Feb-09-2023 == [itst_gtf_bed] itst_gtf_bed /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/isocirc.bed.exon.gtf /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/hg38_with_maher_lab_lncrna.gtf.cds.gtf /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/isocirc.bed.CDS.out
== 15:02:52-Feb-09-2023 == [itst_gtf_bed] itst_gtf_bed /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/isocirc.bed.exon.gtf /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/hg38_with_maher_lab_lncrna.gtf.utr.gtf /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/isocirc.bed.UTR.out
== 15:02:53-Feb-09-2023 == [itst_gtf_bed] itst_gtf_bed /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/isocirc.bed.exon.gtf /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/hg38_with_maher_lab_lncrna.gtf.lincRNA.gtf
/storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/isocirc.bed.lincRNA.out == 15:02:54-Feb-09-2023 == [itst_gtf_bed] itst_gtf_bed /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/isocirc.bed.exon.gtf /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/hg38_with_maher_lab_lncrna.gtf.antisense.gtf /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/isocirc.bed.antisense.out == 15:02:55-Feb-09-2023 == [itst_gtf_bed] itst_gtf_bed /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/isocirc.bed.exon.gtf /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/hg38_with_maher_lab_lncrna.gtf.rRNA.gtf /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/isocirc.bed.rRNA.out == 15:02:56-Feb-09-2023 == [itst_intron] bedtools intersect -v -a /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/isocirc.bed.exon.gtf -b /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/hg38_with_maher_lab_lncrna.gtf.bed -split > 
/storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/isocirc.bed.intron.out == 15:02:58-Feb-09-2023 == [itst_intergenic] bedtools intersect -v -a /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/isocirc.bed.exon.gtf -b /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/hg38_with_maher_lab_lncrna.gtf.gene.bed > /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/isocirc.bed.intergenic.out == 15:02:59-Feb-09-2023 == [itst_exon] bedtools intersect -a /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/isocirc.bed.exon.gtf -b /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/hg38_with_maher_lab_lncrna.gtf.exon.gtf -wa -wb > /storage1/fs1/christophermaher/Active/maherlab/sidizhao/circ_rna/long_read/isocirc_rerun/crc_matched_patients/722_primary_pacbio/isocirc_output_short_read/output_0/isocirc.bed.exon.out == 15:03:08-Feb-09-2023 == [get_block_anno] No "exon_number" found in record.
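Since the failing step comes right after [itst_exon] writes isocirc.bed.exon.out, one way to narrow this down is to scan that file for rows where either record lacks exon_number. The following is a minimal Python sketch, not isoCirc's own code: the function name is hypothetical, and it assumes each row holds two 9-column GTF records joined by tabs, which is how `bedtools intersect -wa -wb` writes its output for two GTF inputs.

```python
import re

# Matches a GTF attribute such as: exon_number "20";  (unquoted values tolerated)
EXON_NUMBER = re.compile(r'exon_number "?[^";\s]+"?')


def records_missing_exon_number(lines):
    """Yield (line_no, side) for intersect rows that lack exon_number.

    side is "A" (first record, columns 1-9), "B" (second record,
    columns 10-18), or "malformed" when the row has fewer than 18 columns.
    Assumption: tab-separated rows made of two 9-column GTF records.
    """
    for line_no, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # skip blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 18:
            yield line_no, "malformed"
            continue
        if not EXON_NUMBER.search(fields[8]):
            yield line_no, "A"
        if not EXON_NUMBER.search(fields[17]):
            yield line_no, "B"
```

Passing an open file handle for isocirc.bed.exon.out to `records_missing_exon_number` and printing what it yields should point at the first offending row, if there is one. Rows reported as "malformed" may instead mean the file is not tab-separated in the way assumed here.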
I looked back at the error logs from previous successful runs, and they contain no [get_block_anno] step; they go straight from [itst_exon] to [output_isoform_eval]. My GTF file does have exon_number and exon_id for most of the transcripts. Do I need to clean up my GTF further? It used to work fine.
Here are a few lines from it:
chr1 ensGene exon 11869 12227 . + . gene_id "ENSG00000223972"; transcript_id "ENST00000456328"; exon_number "1"; exon_id "ENST00000456328.1"; gene_name "ENSG00000223972";
chr1 ensGene exon 12613 12721 . + . gene_id "ENSG00000223972"; transcript_id "ENST00000456328"; exon_number "2"; exon_id "ENST00000456328.2"; gene_name "ENSG00000223972";
chr1 ensGene exon 13221 14409 . + . gene_id "ENSG00000223972"; transcript_id "ENST00000456328"; exon_number "3"; exon_id "ENST00000456328.3"; gene_name "ENSG00000223972";
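Because only most of the transcripts are known to carry exon_number, it may help to list the ones that do not. This is a rough Python sketch under stated assumptions: the helper names are made up for illustration, the GTF is assumed to be tab-separated with quoted attribute values, and only `exon` features are inspected.

```python
import re
from collections import Counter

# One `key "value";` pair from the GTF attribute column (quoted values assumed).
ATTR = re.compile(r'(\S+) "([^"]*)";')


def parse_attributes(column9):
    """Turn the attribute column into a dict; the first occurrence of a key wins."""
    attrs = {}
    for key, value in ATTR.findall(column9):
        attrs.setdefault(key, value)
    return attrs


def transcripts_missing_exon_number(gtf_lines):
    """Count exon records without exon_number, keyed by transcript_id."""
    missing = Counter()
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header comments and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue  # only exon features are checked
        attrs = parse_attributes(fields[8])
        if "exon_number" not in attrs:
            missing[attrs.get("transcript_id", "<no transcript_id>")] += 1
    return missing
```

An empty result would suggest the annotation itself is fine and the problem lies elsewhere; a non-empty one gives the transcript IDs whose exon records would need exon_number added.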
Also, I noticed that some of the [gtf2bed] steps filter on "gene_type" or "gene_biotype" (the lincRNA, antisense, and rRNA ones). Should the GTF file include biotypes as well?
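To see which biotype values, if any, the annotation carries, a small tally like the one below could be used. It is only a sketch: the function name is hypothetical, and it checks the attribute column rather than the whole line as the awk commands in the log do, which should give the same answer for a well-formed GTF.

```python
import re
from collections import Counter

# Either attribute name is accepted, mirroring the awk patterns in the log.
BIOTYPE = re.compile(r'gene_(?:biotype|type) "([^"]*)"')


def biotype_counts(gtf_lines):
    """Tally exon records by gene_biotype/gene_type; "<none>" means neither is present."""
    counts = Counter()
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # skip header comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue  # the awk filters only look at exon features
        match = BIOTYPE.search(fields[8])
        counts[match.group(1) if match else "<none>"] += 1
    return counts
```

If every exon record lands under "<none>", the lincRNA, antisense, and rRNA files written by [gtf2bed] will simply be empty, since those awk filters only print lines that match.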