KamilSJaron / genomic-features-of-parthenogenetic-animals

A project for gathering and re-analysing all published asexual genomes

palindromes analysis - MCScanX #2

Closed KamilSJaron closed 5 years ago

KamilSJaron commented 6 years ago

Alternatively, or in addition, we could also get them from the MUMmer output, i.e. palindromes at the nucleotide level.

KamilSJaron commented 6 years ago

This analysis is done for all the species with an assembly and an annotation.

However, the outputs still need to be parsed.
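A minimal sketch of what that parsing could look like, not the script actually used in this repo. It assumes the MCScanX `.collinearity` output marks each block with a header line of the form `## Alignment 0: score=... e_value=... N=... scaffoldA&scaffoldB plus|minus`, and it assumes that a palindrome candidate is a block where both sides sit on the same scaffold in reverse (`minus`) orientation. The function name and the example lines are made up for illustration.

```python
import re
from typing import Iterable

# Assumed header shape of a collinear block in a .collinearity file, e.g.
# "## Alignment 0: score=500.0 e_value=1e-20 N=10 sc1&sc1 minus"
HEADER = re.compile(
    r"^## Alignment (\d+): score=(\S+) e_value=(\S+) N=(\d+) (\S+)&(\S+) (plus|minus)"
)


def candidate_palindromes(lines: Iterable[str]) -> list[dict]:
    """Return blocks whose two sides are on the same scaffold in reverse orientation."""
    blocks = []
    for line in lines:
        match = HEADER.match(line)
        if match is None:
            continue  # parameter lines, statistics, and gene-pair lines are skipped
        block_id, score, _evalue, n_genes, seq_a, seq_b, strand = match.groups()
        if seq_a == seq_b and strand == "minus":
            blocks.append(
                {
                    "block": int(block_id),
                    "scaffold": seq_a,
                    "genes": int(n_genes),
                    "score": float(score),
                }
            )
    return blocks


# Hypothetical input lines, only to show the expected shape
example = [
    "## Alignment 0: score=500.0 e_value=1e-20 N=10 sc1&sc1 minus",
    "  0-  0:\tg1\tg20\t  1e-50",
    "## Alignment 1: score=300.0 e_value=1e-10 N=6 sc1&sc2 plus",
]
palindromes = candidate_palindromes(example)
```

Only the block headers are read here; the gene pairs inside each block would still need to be collected if the positions of the palindrome arms matter.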

KamilSJaron commented 6 years ago

I found that a lot of the annotation files do not contain gene annotations.

I am not sure why the _genomic.gff.gz file sometimes contains only contigs and sometimes gene annotations as well. Also, some genomes have a separate FTP directory with the annotation at /genomes//GFF/ containing a .gff3 file. What makes these genomes special, and why is there more than one way to store the annotation? No idea, but I have to curate the files and check that each annotation file actually contains what we need.

Examples: gff = contigs: Lcla1; gff = contigs + genes: Dcor1; separate gff3: Obir1
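The check described above could be sketched roughly like this. It counts the feature types in column 3 of a GFF file and reports whether any gene-level features are present. Treating `gene`, `mRNA`, and `CDS` as the types that count as "gene annotation" is an assumption, and both function names are hypothetical.

```python
import gzip
from collections import Counter

# Assumption: any of these feature types means the file carries gene models
GENE_TYPES = {"gene", "mRNA", "CDS"}


def feature_type_counts(path: str) -> Counter:
    """Count the values of GFF column 3, skipping comments and blank lines."""
    opener = gzip.open if path.endswith(".gz") else open
    counts = Counter()
    with opener(path, "rt") as handle:
        for line in handle:
            if line.startswith("#") or not line.strip():
                continue
            fields = line.rstrip("\n").split("\t")
            if len(fields) >= 3:  # lines without tabs (e.g. embedded FASTA) are ignored
                counts[fields[2]] += 1
    return counts


def has_gene_annotation(path: str) -> bool:
    """True if at least one feature type in the file is in GENE_TYPES."""
    counts = feature_type_counts(path)
    return any(counts[feature] > 0 for feature in GENE_TYPES)
```

Running `has_gene_annotation` on each downloaded `_genomic.gff.gz` would separate the contigs-only files from the ones with genes, and `feature_type_counts` shows what is in a file before deciding.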

KamilSJaron commented 6 years ago

Species with (some sort of) result: Avag1 Dcor1 Fcan1 Lcla1

For the rest: TODO get protein sequences

KamilSJaron commented 6 years ago

Pvir1 annotation added; it may be incompatible with the NCBI annotation.

This is a mess. Most of the genomes have no annotation in NCBI, and the scaffold IDs in the NCBI genome do not correspond to those in the annotation that sits somewhere on FTP servers.
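One way to see how bad the ID mismatch is for a given genome is to compare the sequence names in the assembly FASTA with column 1 of the annotation. The sketch below does only that. It is an illustration with hypothetical function names and made-up example lines, not one of the scripts in this repo.

```python
from typing import Iterable


def fasta_ids(lines: Iterable[str]) -> set[str]:
    """Sequence IDs: the first whitespace-delimited token of each '>' header."""
    return {
        line[1:].split()[0]
        for line in lines
        if line.startswith(">") and line[1:].strip()
    }


def gff_seqids(lines: Iterable[str]) -> set[str]:
    """Values of GFF column 1, skipping comments and lines without tabs."""
    return {
        line.split("\t", 1)[0]
        for line in lines
        if line.strip() and not line.startswith("#") and "\t" in line
    }


def seqid_overlap(fasta_lines: Iterable[str], gff_lines: Iterable[str]) -> dict:
    """Report which IDs are shared and which appear on one side only."""
    assembly = fasta_ids(fasta_lines)
    annotation = gff_seqids(gff_lines)
    return {
        "shared": sorted(assembly & annotation),
        "only_in_assembly": sorted(assembly - annotation),
        "only_in_annotation": sorted(annotation - assembly),
    }


# Hypothetical example: the annotation uses a different name for the second scaffold
fasta_example = [">sc1 some description", "ACGT", ">sc2", "GGCC"]
gff_example = [
    "##gff-version 3",
    "sc1\tsrc\tgene\t1\t4\t.\t+\t.\tID=g1",
    "scaffold_2\tsrc\tgene\t1\t4\t.\t-\t.\tID=g2",
]
report = seqid_overlap(fasta_example, gff_example)
```

An empty `shared` list would mean the annotation cannot be used with that assembly without renaming the scaffolds first.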

To finish this task we would have to:

Note that I originally wrote these scripts for all the genomes, but then I figured out that they are not generally usable, so they are more of a "collection of copy-paste commands".

KamilSJaron commented 6 years ago

Missing annotations:

jensbast commented 6 years ago

Anan1 - Philipp gave me this dropbox link to the GFF https://www.dropbox.com/s/nrx6ccq3d3eaepd/Acrobeloides_nanus_v1.gff3.gz?dl=0

Mare3 - the annotation should be on WormBase within the next 3 weeks, he said. He said it is better if we get it from there, because they reformat it and so on.

KamilSJaron commented 5 years ago

MCScanX is done for all but the three mentioned above (Anan1 is done).

KamilSJaron commented 5 years ago

TODO: the last one is Mare3

KamilSJaron commented 5 years ago

c89fdd0a82d7eb80e58518f766321c674dadcc5a