Closed nmatthews323 closed 5 years ago
Nice.
More on (3): We used this rather than miRBase because the lab apparently disagreed with others' definitions of miRNAs...
Nice. As for point 4), I've found this file on the cluster:
/data/public_data/chlamydomonas/20140726_phytozomeV10_Creinhardtii_281_v5.5.annotation/Creinhardtii_281_v5.5.repeatmasked_assembly_v5.0.gff3
The first few lines are
##gff-version 3
##date 2012-01-18
##sequence-region c4058c6ad52899e4141d968721bc69e713269381531m1M33
chromosome_5 RepeatMasker similarity 998186 998222 16.2 + . ID=330541.1;Name=(CCG)n;Target=(CCG)n 3 39
chromosome_5 RepeatMasker similarity 1000515 1000651 20.4 - . ID=330541.2;Name=rnd-1_family-12;Target=rnd-1_family-12 59 207
These seem to be the transposons, but the same directory also contains gene annotations:
==> Creinhardtii_281_v5.5.gene.gff <==
##gff-version 3
##annot-version v5.5
chromosome_1 phytozomev10 gene 18766 20237 . + . ID=Cre01.g000017.v5.5;Name=Cre01.g000017
chromosome_1 phytozomev10 mRNA 18766 20237 . + . ID=Cre01.g000017.t1.1.v5.5;Name=Cre01.g000017.t1.1;pacid=30789166;longest=1;Parent=Cre01.g000017.v5.5
chromosome_1 phytozomev10 five_prime_UTR 18766 19162 . + . ID=Cre01.g000017.t1.1.v5.5.five_prime_UTR.1;Parent=Cre01.g000017.t1.1.v5.5;pacid=30789166
chromosome_1 phytozomev10 CDS 19163 19178 . + 0 ID=Cre01.g000017.t1.1.v5.5.CDS.1;Parent=Cre01.g000017.t1.1.v5.5;pacid=30789166
chromosome_1 phytozomev10 CDS 19329 19948 . + 2 ID=Cre01.g000017.t1.1.v5.5.CDS.2;Parent=Cre01.g000017.t1.1.v5.5;pacid=30789166
chromosome_1 phytozomev10 three_prime_UTR 19949 20237 . + . ID=Cre01.g000017.t1.1.v5.5.three_prime_UTR.1;Parent=Cre01.g000017.t1.1.v5.5;pacid=30789166
chromosome_1 phytozomev10 gene 20356 23957 . + . ID=Cre01.g000033.v5.5;Name=Cre01.g000033
I suppose those are the ones you based your introns etc. on?
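A quick way to check what each of these GFF3 files actually contains is to tally the source and type columns (columns 2 and 3). This is only a minimal sketch: `tally_gff3_features` is a made-up helper name, and it assumes the files on disk are tab-separated as the GFF3 format requires, even though the excerpts pasted here show spaces.

```python
from collections import Counter


def tally_gff3_features(lines):
    """Count (source, type) pairs over the feature lines of a GFF3 file."""
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines, directives (##...) and comments (#...).
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != 9:
            continue  # not a well-formed GFF3 feature line
        counts[(fields[1], fields[2])] += 1
    return counts
```

Called on an open file handle, a repeat annotation should come out dominated by `("RepeatMasker", "similarity")`, while the gene file should show `phytozomev10` with `gene`, `mRNA`, `CDS` and the UTR types.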
Key lines in the ChlamydomonasTranscriptNameConversionBetweenReleases.Mch12b.txt:
4/10/2014
The next line of the file contains column headings, starting with a comment character
('#'). Columns are space-padded to 25 characters.
These are the column headings, in order, together with an explanation of what version they correspond to
5.5 JGI v5.5 in Phytozome v10
3.1 JGI v3.1 (published in genome paper Merchant et al., 2007)
Genbank Genbank submission of genome and annotations from Merchant et al. (2007)
4 JGI v4 annotations
4.3 JGI v4.3 (based on Augustus u10.2 annotations)
u5 Augustus u5 annotations
u9 Augustus u9 annotations
5.3.1 JGI v5.3.1 in Phytozome v9.1
...
JGI v5.5 (Phytozome 10) Augustus update 11.6 (u11.6)-based annotations on v5 assembly, released as JGI v5.5 in Phytozome 10
So I guess this settles which annotation we are using: the JGI v5.5 (Phytozome 10) annotations on the v5 assembly.
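In case it helps with mapping older transcript names onto v5.5, here is a rough sketch of reading that conversion table using the 25-character column width the file describes. The function names are made up, and the layout details are assumptions: that the first line starting with '#' is the heading line, that the '#' occupies the first character of the first column, and that every later non-blank line is a data row.

```python
WIDTH = 25  # column width stated in the file's own notes


def split_fixed(line, width=WIDTH):
    """Cut a line into consecutive fixed-width cells and strip the padding."""
    line = line.rstrip("\n")
    return [line[i:i + width].strip() for i in range(0, len(line), width)]


def read_conversion_table(lines, width=WIDTH):
    """Return one dict per data row, keyed by the column headings."""
    header = None
    rows = []
    for line in lines:
        if not line.strip():
            continue
        if header is None:
            # Assumption: everything before the first '#' line is preamble.
            if line.startswith("#"):
                # Swap '#' for a space so the column alignment is preserved.
                header = split_fixed((" " + line[1:]).rstrip(), width)
            continue
        cells = split_fixed(line, width)
        cells += [""] * (len(header) - len(cells))  # pad short rows
        rows.append(dict(zip(header, cells)))
    return rows
```

Slicing by width rather than splitting on whitespace keeps empty cells in place, which matters if a transcript has no counterpart in some release.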
Just to update on this:
- I had a look at Phytozome; it looks like most of the annotations haven't changed.
- I've written a script which calculates introns from the Phytozome annotation, and uploaded the resulting GFF3 file.
- I still need to look at whether the transposon and methylation annotations have changed.
- I also need to go through the specific miRNAs identified in Adrian's paper and in the literature.
Anything else?
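For anyone wanting to reproduce the intron step, here is a minimal sketch of one way to derive introns from a gene GFF3 like the one quoted earlier. It is not the uploaded script, just an illustration: it assumes introns are the gaps between the UTR/CDS (or exon) blocks that share a `Parent` transcript, that each feature has a single parent, and that the input is tab-separated. The `sketch` source label and the `.intron.N` ID scheme are invented for the example, and introns are numbered by genomic position, not transcript order.

```python
from collections import defaultdict

# Feature types treated as exonic sequence when looking for gaps.
EXONIC_TYPES = {"five_prime_UTR", "CDS", "three_prime_UTR", "exon"}


def parse_attributes(column9):
    """Turn 'ID=a;Parent=b' into {'ID': 'a', 'Parent': 'b'}."""
    attrs = {}
    for item in column9.strip().split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            attrs[key] = value
    return attrs


def introns_from_gff3(lines):
    """Yield one GFF3 intron line per gap between a transcript's exonic blocks."""
    blocks = defaultdict(list)  # (seqid, strand, parent) -> [(start, end), ...]
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9 or fields[2] not in EXONIC_TYPES:
            continue
        parent = parse_attributes(fields[8]).get("Parent")
        if parent is None:
            continue
        key = (fields[0], fields[6], parent)
        blocks[key].append((int(fields[3]), int(fields[4])))

    for (seqid, strand, parent), intervals in blocks.items():
        intervals.sort()
        prev_end = intervals[0][1]
        number = 0
        for start, end in intervals[1:]:
            # Touching or overlapping blocks (e.g. UTR next to CDS) leave no gap.
            if start > prev_end + 1:
                number += 1
                attrs = f"ID={parent}.intron.{number};Parent={parent}"
                yield "\t".join([seqid, "sketch", "intron", str(prev_end + 1),
                                 str(start - 1), ".", strand, ".", attrs])
            prev_end = max(prev_end, end)
```

On the Cre01.g000017.t1.1.v5.5 lines quoted earlier this gives a single intron spanning 19179 to 19328, the gap between the two CDS features.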