nf-core / genomeqc

Compare the quality of multiple genomes, along with their annotations.
https://nf-co.re/genomeqc
MIT License
3 stars 8 forks

New feature: Check for overlapping genes in GFF (bedtools intersect) #93

Open chriswyatt1 opened 3 days ago

chriswyatt1 commented 3 days ago

Another way we discussed to assess annotation quality is to look at overlapping genes, both on the same strand (sense) and on opposite strands (antisense).

You would expect similar genomes to have similar rates of these two classes. If they don't, there could have been a technical issue while building the gene annotation, so it is useful to know.
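To make the idea concrete, here is a rough Python sketch of how the two classes could be counted directly from a single GFF3 file. It is only an illustration, not what the pipeline does: the function names `read_genes` and `count_overlaps` are made up, it only looks at features whose type is exactly `gene`, and it treats any shared base (GFF coordinates are 1-based and inclusive) as an overlap.

```python
from collections import defaultdict


def read_genes(gff_lines):
    """Collect (start, end, strand, gene_id) per sequence for 'gene' features.

    Hypothetical helper: assumes tab-separated GFF3 with 9 columns.
    """
    genes = defaultdict(list)
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments/directives
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != "gene":
            continue
        attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
        genes[cols[0]].append(
            (int(cols[3]), int(cols[4]), cols[6], attrs.get("ID", "?"))
        )
    return genes


def count_overlaps(genes):
    """Count overlapping gene pairs, split into sense and antisense.

    Assumption: genes with an unknown strand ('.' or '?') are compared
    by plain string equality, which is a simplification.
    """
    counts = {"sense": 0, "antisense": 0}
    for feats in genes.values():
        feats = sorted(feats)  # sort by start coordinate
        for i, (_, end_i, strand_i, _) in enumerate(feats):
            for start_j, _, strand_j, _ in feats[i + 1:]:
                if start_j > end_i:
                    break  # sorted by start: no later gene can overlap gene i
                key = "sense" if strand_i == strand_j else "antisense"
                counts[key] += 1
    return counts
```

With a file on disk this would be called as `count_overlaps(read_genes(open("annotation.gff")))`, where `annotation.gff` is a placeholder name. The counts are pairs, not genes, so they would still need to be divided by the gene total before comparing genomes.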

chriswyatt1 commented 3 days ago

Or we could use https://agat.readthedocs.io/en/latest/tools/agat_sp_fix_overlaping_genes.html

chriswyatt1 commented 3 days ago

bedtools intersect requires two GFF files, so I don't think it will work in our case.

chriswyatt1 commented 3 days ago

I think agat_sp_statistics may already provide this number:

Number of gene                               830
Number of pseudogene                         6
Number of mrna                               836
Number of cds                                836
Number of exon                               836
Number of exon in cds                        836
Number gene..pseudogene overlapping          107
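
If we go this route, the number could be pulled out of the summary text with something like the sketch below. It assumes the report keeps the "label, run of spaces, number" layout shown above. The function name `parse_agat_summary` is hypothetical and not part of AGAT.

```python
import re


def parse_agat_summary(text):
    """Turn 'label   value' lines of a statistics report into a dict.

    Assumption: each line of interest ends in a number separated from
    its label by at least two spaces; all other lines are ignored.
    """
    stats = {}
    for line in text.splitlines():
        match = re.match(r"^\s*(.+?)\s{2,}(\d+(?:\.\d+)?)\s*$", line)
        if match:
            label, value = match.groups()
            stats[label] = float(value) if "." in value else int(value)
    return stats
```

The overlap count is then available as `stats["Number gene..pseudogene overlapping"]`. Dividing it by `stats["Number of gene"]` would be one possible way to get a rate that can be compared across genomes, though which denominator is appropriate is still open.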