Open sekhwal opened 3 years ago
Yes! You can definitely use the package for non-model organisms; that's what it was designed for!
To get GO annotations, I strongly suggest using emapper (http://eggnog-mapper.embl.de/), then converting the results into files suitable for GO_MWU (and KOGMWU) following the instructions here: https://github.com/z0on/emapper_to_GOMWU_KOGMWU
Emapper will return nice annotations if you upload your translated protein sequences or protein-coding DNA sequences. It also works with a raw assembly that might be fragmented and contain frameshifts, but it is less sensitive that way. If you have a de novo transcriptome, you can generate your protein sequences based on blastx results using this script: https://github.com/z0on/annotatingTranscriptomes/blob/master/CDS_extractor_v2.pl
Misha
On Jun 2, 2021, at 3:19 PM, sekhwal wrote:
Hi, I am trying to perform GO enrichment analysis for non-model plants. Please let me know if I can use GO_MWU with non-model species for GO enrichment analysis.
I tried to download a GAF file (annotation file) from the GO database to use as a reference. Please let me know how to adapt it for use with GO_MWU.
Thank you!
Thank you for the information. Do the output results contain a network graph and statistical values (e.g. p-values)? I already have GO annotations.
Here is my input file format. Please let me know how to improve the input file.
Gene | Similar search | annotation | GO:MF | GO:CC | GO:BP |
---|---|---|---|---|---|
SE_04086 | XP_006845785.1 | probable_inactive_purple_acid_phosphatase_28_isoform_X4 | GO:0008152-metabolic_process(L=1),GO:0009987-cellular_process(L=1), | GO:0005576-extracellular_region(L=1), | GO:0003824-catalytic_activity(L=1),GO:0005488-binding(L=1), |
SE_02598 | XP_025797078.1 | eukaryotic_peptide_chain_release_factor_subunit_1-3 | GO:0008152-metabolic_process(L=1),GO:0009987-cellular_process(L=1),GO:0044699-single-organism_process(L=1),GO:0071840-cellular_component_organization_or_biogenesis(L=1), | GO:0005623-cell(L=1),GO:0016020-membrane(L=1), | GO:0005488-binding(L=1), |
SE_21286 | XP_019108059.1 | uncharacterized_protein | GO:0008152-metabolic_process(L=1),GO:0009987-cellular_process(L=1),GO:0044699-single-organism_process(L=1),GO:0051179-localization(L=1), | GO:0005623-cell(L=1),GO:0016020-membrane(L=1),GO:0043226-organelle(L=1), | GO:0003824-catalytic_activity(L=1),GO:0005215-transporter_activity(L=1),GO:0005488-binding(L=1), |
SE_21477 | XP_027940045.1 | probable LRR receptor-like serine/threonine-protein kinase | GO:0008152-metabolic_process(L=1),GO:0009987-cellular_process(L=1), | GO:0005623-cell(L=1),GO:0016020-membrane(L=1),GO:0043226-organelle(L=1), | GO:0003824-catalytic_activity(L=1),GO:0005488-binding(L=1), |
SE_22395 | XP_021830538.1 | dolichyl-diphosphooligosaccharide--protein_glycosyltransferase | GO:0008152-metabolic_process(L=1),GO:0009987-cellular_process(L=1),GO:0044699-single-organism_process(L=1),GO:0065007-biological_regulation(L=1), | GO:0005623-cell(L=1),GO:0016020-membrane(L=1),GO:0032991-macromolecular_complex(L=1),GO:0043226-organelle(L=1), | GO:0003824-catalytic_activity(L=1), |
Yes - the typical output is the graph at the very beginning of the README.md file, which shows similarities among GO terms in the form of a tree, with p-values indicated by the font.
You would need to format your GO annotations into a simpler format that would be understood by GO_MWU (as described in the README.md, Details on the input format.)
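As a rough illustration of that reformatting step, here is a minimal Python sketch that collapses a pipe-delimited table like the one posted above into one line per gene. It assumes the GO_MWU annotation table is two tab-delimited columns (gene ID, then GO terms separated by semicolons); that is my reading of the README and should be checked against it. The function name `table_to_go_lines` is hypothetical, not part of GO_MWU.

```python
import re

# GO accessions are "GO:" followed by seven digits.
GO_ID = re.compile(r"GO:\d{7}")


def table_to_go_lines(rows):
    """Convert pipe-delimited rows (gene | hit | annotation | GO columns...)
    into 'gene<TAB>GO:xxxxxxx;GO:yyyyyyy' strings.

    Assumption: the target format is two tab-delimited columns with
    semicolon-separated GO terms. Rows without any GO accession
    (header, separator, unannotated genes) are skipped.
    """
    lines = []
    for row in rows:
        cells = [cell.strip() for cell in row.split("|")]
        gene = cells[0]
        terms = []
        # Everything after the first three columns is treated as GO columns.
        for cell in cells[3:]:
            for go in GO_ID.findall(cell):
                if go not in terms:  # keep first occurrence, preserve order
                    terms.append(go)
        if gene and terms:
            lines.append(gene + "\t" + ";".join(terms))
    return lines


if __name__ == "__main__":
    example = [
        "Gene | Similar search | annotation | GO:MF | GO:CC | GO:BP |",
        "---|---|---|---|---|---|",
        "SE_04086 | XP_006845785.1 | probable_inactive_purple_acid_phosphatase_28_isoform_X4 | "
        "GO:0008152-metabolic_process(L=1),GO:0009987-cellular_process(L=1), | "
        "GO:0005576-extracellular_region(L=1), | "
        "GO:0003824-catalytic_activity(L=1),GO:0005488-binding(L=1), |",
    ]
    for line in table_to_go_lines(example):
        print(line)
```

The sketch pools all three GO columns into one list per gene and drops the term names and `(L=1)` suffixes, keeping only the accessions.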
On Jun 2, 2021, at 5:22 PM, sekhwal wrote:
Thank you for the information. Do the output results contain a network graph and statistical values (e.g. p-values)? I already have GO annotations.
Here is my input file format.
Gene | Similar search | annotation | GO:MF | GO:CC | GO:BP |
---|---|---|---|---|---|
SEGI_04086 | XP_006845785.1 | probable_inactive_purple_acid_phosphatase_28_isoform_X4 | GO:0008152-metabolic_process(L=1),GO:0009987-cellular_process(L=1), | GO:0005576-extracellular_region(L=1), | GO:0003824-catalytic_activity(L=1),GO:0005488-binding(L=1), |
SEGI_02591 | XP_025797078.1 | eukaryotic_peptide_chain_release_factor_subunit_1-3 | GO:0008152-metabolic_process(L=1),GO:0009987-cellular_process(L=1),GO:0044699-single-organism_process(L=1),GO:0071840-cellular_component_organization_or_biogenesis(L=1), | GO:0005623-cell(L=1),GO:0016020-membrane(L=1), | GO:0005488-binding(L=1), |
SEGI_21288 | XP_019108059.1 | uncharacterized_protein | GO:0008152-metabolic_process(L=1),GO:0009987-cellular_process(L=1),GO:0044699-single-organism_process(L=1),GO:0051179-localization(L=1), | GO:0005623-cell(L=1),GO:0016020-membrane(L=1),GO:0043226-organelle(L=1), | GO:0003824-catalytic_activity(L=1),GO:0005215-transporter_activity(L=1),GO:0005488-binding(L=1), |
SEGI_21478 | XP_027940045.1 | probable LRR receptor-like serine/threonine-protein kinase | GO:0008152-metabolic_process(L=1),GO:0009987-cellular_process(L=1), | GO:0005623-cell(L=1),GO:0016020-membrane(L=1),GO:0043226-organelle(L=1), | GO:0003824-catalytic_activity(L=1),GO:0005488-binding(L=1), |
SEGI_22391 | XP_021830538.1 | dolichyl-diphosphooligosaccharide--protein_glycosyltransferase | GO:0008152-metabolic_process(L=1),GO:0009987-cellular_process(L=1),GO:0044699-single-organism_process(L=1),GO:0065007-biological_regulation(L=1), | GO:0005623-cell(L=1),GO:0016020-membrane(L=1),GO:0032991-macromolecular_complex(L=1),GO:0043226-organelle(L=1), | GO:0003824-catalytic_activity(L=1), |