Closed ceOSU closed 1 year ago
That sounds strange. Could you post the first 10 lines from the GTF and the first few lines when using head()
on your AnalyzedSwitch
object?
I ended up getting it to work by changing ignoreAfterPeriod to TRUE. But for posterity here is the head of the GTF and of my analyzed switch obejct.
head(GTFhead)
iso_ref gene_ref isoform_id gene_id condition_1 condition_2 gene_name
106168 isoComp_00000001 geneComp_00000001 ENST00000373020.9 ENSG00000000003.15 plaseholder1 plaseholder2 TSPAN6
106172 isoComp_00000002 geneComp_00000001 ENST00000494424.1 ENSG00000000003.15 plaseholder1 plaseholder2 TSPAN6
106171 isoComp_00000003 geneComp_00000001 ENST00000496771.5 ENSG00000000003.15 plaseholder1 plaseholder2 TSPAN6
106169 isoComp_00000004 geneComp_00000001 ENST00000612152.4 ENSG00000000003.15 plaseholder1 plaseholder2 TSPAN6
106170 isoComp_00000005 geneComp_00000001 ENST00000614008.4 ENSG00000000003.15 plaseholder1 plaseholder2 TSPAN6
106166 isoComp_00000006 geneComp_00000002 ENST00000373031.5 ENSG00000000005.6 plaseholder1 plaseholder2 TNMD
ref_gene_id gene_biotype iso_biotype class_code gene_overall_mean gene_value_1 gene_value_2
106168 TSPAN6 protein_coding protein_coding = 0 0 0
106172 TSPAN6 protein_coding protein_coding_CDS_not_defined = 0 0 0
106171 TSPAN6 protein_coding protein_coding_CDS_not_defined = 0 0 0
106169 TSPAN6 protein_coding protein_coding = 0 0 0
106170 TSPAN6 protein_coding protein_coding = 0 0 0
106166 TNMD protein_coding protein_coding = 0 0 0
gene_stderr_1 gene_stderr_2 gene_log2_fold_change gene_p_value gene_q_value iso_overall_mean iso_value_1
106168 NA NA 0 1 1 0 0
106172 NA NA 0 1 1 0 0
106171 NA NA 0 1 1 0 0
106169 NA NA 0 1 1 0 0
106170 NA NA 0 1 1 0 0
106166 NA NA 0 1 1 0 0
iso_value_2 iso_stderr_1 iso_stderr_2 iso_log2_fold_change iso_p_value iso_q_value IF_overall IF1 IF2 dIF
106168 0 NA NA 0 1 1 NA NA NA NA
106172 0 NA NA 0 1 1 NA NA NA NA
106171 0 NA NA 0 1 1 NA NA NA NA
106169 0 NA NA 0 1 1 NA NA NA NA
106170 0 NA NA 0 1 1 NA NA NA NA
106166 0 NA NA 0 1 1 NA NA NA NA
isoform_switch_q_value gene_switch_q_value PTC
106168 NA NA FALSE
106172 NA NA NA
106171 NA NA NA
106169 NA NA FALSE
106170 NA NA FALSE
106166 NA NA FALSE
head(AnalyzedSwitch)
iso_ref gene_ref isoform_id gene_id condition_1 condition_2 gene_name gene_biotype
85624 isoComp_00000020 geneComp_00000004 ENST00000367770 ENSG00000000457 KD WT SCYL3 NA
85625 isoComp_00000021 geneComp_00000004 ENST00000367771 ENSG00000000457 KD WT SCYL3 NA
85626 isoComp_00000022 geneComp_00000004 ENST00000367772 ENSG00000000457 KD WT SCYL3 NA
85627 isoComp_00000023 geneComp_00000004 ENST00000423670 ENSG00000000457 KD WT SCYL3 NA
85629 isoComp_00000025 geneComp_00000004 MSTRG.2315.2 ENSG00000000457 KD WT SCYL3 NA
85630 isoComp_00000026 geneComp_00000004 MSTRG.2315.3 ENSG00000000457 KD WT SCYL3 NA
iso_biotype gene_overall_mean gene_value_1 gene_value_2 gene_stderr_1 gene_stderr_2 gene_log2_fold_change
85624 NA 3.249689 2.953907 3.545471 0.0184389 0.1800441 0.2625402
85625 NA 3.249689 2.953907 3.545471 0.0184389 0.1800441 0.2625402
85626 NA 3.249689 2.953907 3.545471 0.0184389 0.1800441 0.2625402
85627 NA 3.249689 2.953907 3.545471 0.0184389 0.1800441 0.2625402
85629 NA 3.249689 2.953907 3.545471 0.0184389 0.1800441 0.2625402
85630 NA 3.249689 2.953907 3.545471 0.0184389 0.1800441 0.2625402
gene_q_value iso_overall_mean iso_value_1 iso_value_2 iso_stderr_1 iso_stderr_2 iso_log2_fold_change iso_q_value
85624 NA 0.1869044 0.37380874 0.000000000 0.37380874 0.000000000 -5.262315644 NA
85625 NA 0.8153609 0.81336773 0.817353996 0.45779004 0.068351602 0.006967842 NA
85626 NA 0.2935263 0.48205034 0.105002350 0.05046671 0.013941773 -2.097142589 NA
85627 NA 0.3537878 0.01020881 0.697366802 0.01020881 0.031190193 5.129402086 NA
85629 NA 0.5025713 0.99984745 0.005295231 0.98019742 0.005295231 -6.044911621 NA
85630 NA 1.0912258 0.27405838 1.908393210 0.20672966 0.302522729 2.755639083 NA
IF_overall IF1 IF2 dIF isoform_switch_q_value gene_switch_q_value
85624 0.062875 0.12575 0.00000 -0.12575 8.009326e-01 2.62214e-08
85625 0.253250 0.27440 0.23210 -0.04230 9.937467e-01 2.62214e-08
85626 0.096575 0.16330 0.02985 -0.13345 9.210517e-04 2.62214e-08
85627 0.100550 0.00345 0.19765 0.19420 2.622140e-08 2.62214e-08
85629 0.171050 0.34055 0.00155 -0.33900 5.943468e-01 2.62214e-08
85630 0.313850 0.09235 0.53535 0.44300 6.028529e-02 2.62214e-08
Thanks for letting me know. I'll add that info to the error message :)
I have been working with data that I first mapped with HISAT2 then did my alignment, merge, and quantification with StringTie. When I try to use addORFfromGTF I get an error saying there isn't any CDS information in the GTF. I'm using isoformSwitchAnalyzeR version 2.1.3.
Here's everything I've run in R up to this point:
The error message I get is:
Step 1 of 2: importing GTF (this may take a while)... Step 2 of 2: Adding ORF... Error in addORFfromGTF(AnalyzedSwitch, pathToGTF = "C:/Users/Caleb/OneDrive - The Ohio State University/BioinfoData/IsoformSwitch/Homo_sapiens.GRCh38.108.gtf") : No ORFs could be added to the switchAnalyzeRlist. Please ensure GTF file have CDS info (and that isoform ids match). Also note that ORF/CDS information cannot be found in the output from asemblers such as StringTie and Cufflinks.Instead use the official GTF that you downloaded from an official source such as Ensembl, Gencode, etc.
The GTF I'm trying to use is what I used for StringTie initially, and I got it straight from Ensembl. I've also tried the new version on Ensembl (Homo_sapiens.GRCH38.109.gtf), as well as the alternate versions on Ensembl and I get the same error.