hputnam closed this issue 4 years ago
KO is listed under phylogenomic databases
Montipora predicted protein - http://cyanophora.rutgers.edu/montipora/Mcap.protein.fa.gz
Montipora nucleotide CDS - http://cyanophora.rutgers.edu/montipora/Mcap.mRNA.fa.gz
Here is a product for Mcap https://github.com/sr320/nb-2020/blob/master/M_capitata/analyses/Mcap-GO-KO-Kegg.tab
Pact protein fasta?
Working plan to get the Pact protein fasta: use the Structural_annotation_abintio.gff to pull T1 only (longest transcript called by AUGUSTUS), then pull the coding sequence and the protein sequence from that gff for all predicted genes. This seems most comparable to the gene CDS and protein files available for Mcap, which was also called with AUGUSTUS. @sr320 @shellytrigg @yaaminiv @mgavery @kubu4 any objections?
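A minimal sketch of the T1 filtering step in the plan above, not the script actually used (that one is linked in issue #71). It assumes a tab-delimited, 9-column AUGUSTUS-style GFF in which the attribute column names the transcript with an ID ending in ".t1" (for example g1.t1). The function name and the regular expression are hypothetical and would need checking against the real file.

```python
import re

# Assumption: AUGUSTUS-style transcript IDs such as "g1.t1". The lookahead
# stops ".t1" from also matching ".t10", ".t11", and so on.
T1_PATTERN = re.compile(r"[\w\-]+\.t1(?![0-9])")


def keep_t1_features(gff_lines, feature_types=("CDS",)):
    """Return GFF lines of the requested feature types that belong to a .t1 transcript."""
    kept = []
    for line in gff_lines:
        # Skip comments and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        # A GFF feature line has 9 tab-separated columns.
        if len(cols) < 9:
            continue
        # Column 3 is the feature type, column 9 holds the attributes.
        if cols[2] in feature_types and T1_PATTERN.search(cols[8]):
            kept.append(line.rstrip("\n"))
    return kept
```

The retained CDS coordinates could then be handed to a sequence extraction tool together with the genome fasta to produce the CDS and protein files.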
I guess I am surprised there is a genome and no gene / protein fasta already.
@sr320 genome and shell script for getting protein fasta are linked in issue #71
I have been double coding all morning :) I give in. @hputnam can I just have the fasta?
Link to Pacuta Augustus predicted protein fasta https://osf.io/shqjx/
MD5 has been checked from local computer to OSF
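For anyone repeating the check after downloading, a small sketch of an MD5 comparison follows. The function names are hypothetical, and the expected checksum has to come from wherever the original was recorded; none is given in this thread.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large fasta files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksums_match(path, expected_md5):
    """Return True if the file's MD5 equals the expected hex digest (case-insensitive)."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

Usage would look like checksums_match("downloaded.fasta", "<md5 recorded for the local copy>"), where both arguments are placeholders.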
Now have a blast annotation: https://gannet.fish.washington.edu/seashell/bu-mox/scrubbed/061620-Pact-annotation/Pact_blastp_sp.tab Moving on to GO, KEGG, etc.
Here is the product for Pact: https://github.com/sr320/nb-2020/blob/master/P_acuta/analyses/Pact-GO-KO-Kegg.tab
@sr320 can you please add the uniprot download or version date?
download date 2019-0109
blast to swissprot and use uniprot to obtain GO and KO
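A rough sketch of that two-step approach, not the notebook code actually used. It assumes the blast output is in the standard 12-column tabular layout (query in column 1, subject in column 2, bit score in column 12), that Swiss-Prot subject IDs look like sp|ACCESSION|ENTRY_NAME, and that the UniProt download has already been loaded into a dictionary keyed by accession. All function names and the dictionary layout are hypothetical.

```python
def best_hits(blast_lines):
    """Map each query to the accession of its highest bit score hit."""
    best = {}
    for line in blast_lines:
        cols = line.rstrip("\n").split("\t")
        # Assumption: standard 12-column tabular blast output.
        if len(cols) < 12:
            continue
        query, subject, bitscore = cols[0], cols[1], float(cols[11])
        # Assumption: subject IDs look like "sp|ACCESSION|ENTRY_NAME".
        parts = subject.split("|")
        accession = parts[1] if len(parts) >= 3 else subject
        if query not in best or bitscore > best[query][1]:
            best[query] = (accession, bitscore)
    return {query: accession for query, (accession, _) in best.items()}


def annotate(hits, uniprot_table):
    """Attach GO and KO terms to each query via its best-hit accession.

    uniprot_table is a hypothetical dict: accession -> {"GO": ..., "KO": ...}.
    Queries whose accession is missing from the table get empty annotations.
    """
    empty = {"GO": "", "KO": ""}
    return {
        query: {"accession": acc, **uniprot_table.get(acc, empty)}
        for query, acc in hits.items()
    }
```

Keeping only the best hit per query is a design choice in this sketch; the thread does not say how multiple hits were handled.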