gpertea / gffcompare

classify, merge, tracking and annotation of GFF files by comparing to a reference annotation GFF
MIT License

Comparing two different mouse strain gtf files #67

Open RNAseqer opened 2 years ago

RNAseqer commented 2 years ago

Hello,

I'm trying to look at differences in annotation between two mouse strains by comparing their GTF files. I pulled both files directly from the Ensembl site.

Unfortunately the results look like the following:

= Summary for dataset: Mus_musculus_casteij.CAST_EiJ_v1.103.gff3

Query mRNAs : 102111 in 36818 loci (87154 multi-exon transcripts)

(14823 multi-transcript loci, ~2.8 transcripts per locus)

Reference mRNAs : 100958 in 37945 loci (85196 multi-exon)

Super-loci w/ reference transcripts: 4432

-----------------| Sensitivity | Precision |
        Base level:     2.3     |     2.3    |
        Exon level:     0.0     |     0.0    |
      Intron level:     0.0     |     0.0    |
Intron chain level:     0.0     |     0.0    |
  Transcript level:     0.0     |     0.0    |
       Locus level:     0.0     |     0.0    |

 Matching intron chains:       0
   Matching transcripts:       6
          Matching loci:       6

      Missed exons:  346812/361584  ( 95.9%)
       Novel exons:  350935/365703  ( 96.0%)
    Missed introns:  209896/244471  ( 85.9%)
     Novel introns:  214789/248810  ( 86.3%)
       Missed loci:   32672/37945   ( 86.1%)
        Novel loci:   31683/36818   ( 86.1%)

Total union super-loci across all input datasets: 36115

102111 out of 102111 consensus transcripts written in cast_attempt.annotated.gtf (0 discarded as redundant)
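To make the Missed/Novel figures easier to read, here is a minimal Python sketch that recomputes the percentages from the counts pasted above. It assumes each percentage is simply the first count divided by the second, which is consistent with the numbers shown; the `counts` dictionary and the `percent` helper are names I made up for illustration and are not part of gffcompare.

```python
# Counts copied from the summary above as (count, total) pairs.
# Assumption: each percentage is count / total, rounded to one decimal.
counts = {
    "Missed exons": (346812, 361584),
    "Novel exons": (350935, 365703),
    "Missed introns": (209896, 244471),
    "Novel introns": (214789, 248810),
    "Missed loci": (32672, 37945),
    "Novel loci": (31683, 36818),
}


def percent(part, total):
    """Return part/total as a percentage rounded to one decimal place."""
    return round(100.0 * part / total, 1)


for label, (part, total) in counts.items():
    print(f"{label:>15}: {part}/{total} ({percent(part, total):5.1f}%)")
```

In other words, roughly 86% of the loci on each side have no counterpart in the other file, and about 96% of exons do not overlap anything at the same coordinates.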

Any suggestions would be much appreciated!