z0on / GO_MWU

Rank-based Gene Ontology analysis of gene expression data

GO enrichment analysis for non-model species #9

Open sekhwal opened 3 years ago

sekhwal commented 3 years ago

Hi, I am trying to perform GO enrichment analysis for non-model plants. Please let me know if I can use GO_MWU with non-model species for GO enrichment analysis.

I tried downloading a GAF annotation file from the GO database to use as a reference. Please let me know how to adapt it for use with GO_MWU.

Thank you!

z0on commented 3 years ago

Yes! You can definitely use the package for non-model organisms; that's what it was designed for!

To get GO annotations, I strongly suggest using emapper (http://eggnog-mapper.embl.de/), then converting the results into files suitable for GO_MWU (and KOGMWU) following the instructions here: https://github.com/z0on/emapper_to_GOMWU_KOGMWU

Emapper will return nice annotations if you upload your translated protein sequences or protein-coding DNA sequences. It also works with a raw assembly that might be fragmented and contain frameshifts, but it is less sensitive that way. If you have a de novo transcriptome, you can generate your protein sequences based on blastx results, using this script: https://github.com/z0on/annotatingTranscriptomes/blob/master/CDS_extractor_v2.pl
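For readers who want a feel for what the conversion step involves, here is a minimal Python sketch. It is not the procedure from the emapper_to_GOMWU_KOGMWU repository, which remains the reference. It assumes the emapper annotations file is tab-delimited, that its column header is a comment line starting with `#query`, and that GO terms sit comma-separated in a column named `GOs`; these details differ between emapper versions, so check your own file. The function name and the sample rows are made up for illustration. The output layout (gene id, tab, semicolon-separated GO terms, `unknown` for genes without terms) follows the GO_MWU README as I read it.

```python
import io


def emapper_to_go_table(handle, go_column="GOs"):
    """Yield 'gene<TAB>GO:..;GO:..' lines from an emapper-style annotations table.

    Assumption: the header is the comment line starting with '#query' and
    contains a column called `go_column`. Adjust for your emapper version.
    """
    go_index = None
    for line in handle:
        fields = line.rstrip("\r\n").split("\t")
        if line.startswith("#"):
            # Comment lines are skipped, except the one that names the columns.
            if line.startswith("#query"):
                if go_column not in fields:
                    raise ValueError(f"no column named {go_column!r} in header")
                go_index = fields.index(go_column)
            continue
        if not line.strip():
            continue
        if go_index is None:
            raise ValueError("no '#query' header line found before the data rows")
        raw = fields[go_index] if len(fields) > go_index else "-"
        # Keep only real GO ids; a placeholder such as '-' means no annotation.
        terms = [term for term in raw.split(",") if term.startswith("GO:")]
        yield fields[0] + "\t" + (";".join(terms) if terms else "unknown")


# Made-up sample rows, only to show the shape of the input and output.
sample = (
    "## emapper sample (hypothetical rows)\n"
    "#query\tseed_ortholog\tGOs\n"
    "gene1\t0000.XP_1\tGO:0003824,GO:0005488\n"
    "gene2\t0000.XP_2\t-\n"
)
table = list(emapper_to_go_table(io.StringIO(sample)))
print("\n".join(table))
```

With a real file, the same function can be fed an open file handle instead of the `io.StringIO` sample, and the yielded lines written out one per gene.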

Misha


sekhwal commented 3 years ago

Thank you for the information. Does the output contain a network graph and statistical values (e.g. p-values)? I already have GO annotations.

Here is my input file format. Please let me know how to improve the input file.

Gene Similar search annotation GO:MF GO:CC GO:BP
SE_04086 XP_006845785.1 probable_inactive_purple_acid_phosphatase_28_isoform_X4 GO:0008152-metabolic_process(L=1),GO:0009987-cellular_process(L=1), GO:0005576-extracellular_region(L=1), GO:0003824-catalytic_activity(L=1),GO:0005488-binding(L=1),
SE_02598 XP_025797078.1 eukaryotic_peptide_chain_release_factor_subunit_1-3 GO:0008152-metabolic_process(L=1),GO:0009987-cellular_process(L=1),GO:0044699-single-organism_process(L=1),GO:0071840-cellular_component_organization_or_biogenesis(L=1), GO:0005623-cell(L=1),GO:0016020-membrane(L=1), GO:0005488-binding(L=1),
SE_21286 XP_019108059.1 uncharacterized_protein GO:0008152-metabolic_process(L=1),GO:0009987-cellular_process(L=1),GO:0044699-single-organism_process(L=1),GO:0051179-localization(L=1), GO:0005623-cell(L=1),GO:0016020-membrane(L=1),GO:0043226-organelle(L=1), GO:0003824-catalytic_activity(L=1),GO:0005215-transporter_activity(L=1),GO:0005488-binding(L=1),
SE_21477 XP_027940045.1 probable LRR receptor-like serine/threonine-protein kinase GO:0008152-metabolic_process(L=1),GO:0009987-cellular_process(L=1), GO:0005623-cell(L=1),GO:0016020-membrane(L=1),GO:0043226-organelle(L=1), GO:0003824-catalytic_activity(L=1),GO:0005488-binding(L=1),
SE_22395 XP_021830538.1 dolichyl-diphosphooligosaccharide--protein_glycosyltransferase GO:0008152-metabolic_process(L=1),GO:0009987-cellular_process(L=1),GO:0044699-single-organism_process(L=1),GO:0065007-biological_regulation(L=1), GO:0005623-cell(L=1),GO:0016020-membrane(L=1),GO:0032991-macromolecular_complex(L=1),GO:0043226-organelle(L=1), GO:0003824-catalytic_activity(L=1),

z0on commented 3 years ago

Yes, the typical output is the graph at the very beginning of the README.md file, showing similarities of GO terms in the form of a tree, with p-values indicated by the font.

You would need to convert your GO annotations into a simpler format that GO_MWU understands (as described in the README.md, Details on the input format).
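As a rough illustration of that reformatting, the sketch below takes rows shaped like the ones posted above and reduces each to a gene id plus its GO ids. It assumes the target layout is the one I understand the README to describe: two tab-delimited columns, one line per gene, GO terms separated by semicolons, and `unknown` for genes without terms. The function names are hypothetical, and the three GO columns are simply pooled, so treat this as a starting point rather than the package's own tooling.

```python
import re

# A GO identifier is 'GO:' followed by seven digits; the term name and the
# '(L=1)' level suffix that follow it in the posted table are discarded.
GO_ID = re.compile(r"GO:\d{7}")


def to_go_mwu_line(row):
    """Turn one whitespace-delimited annotation row into 'gene<TAB>GO:..;GO:..'."""
    gene = row.split()[0]
    # Keep ids in order of first appearance and drop repeats.
    terms = list(dict.fromkeys(GO_ID.findall(row)))
    return gene + "\t" + (";".join(terms) if terms else "unknown")


def convert(lines):
    """Convert all data rows, skipping blank lines and the 'Gene ...' header."""
    out = []
    for line in lines:
        line = line.strip()
        if not line or line.split()[0] == "Gene":
            continue
        out.append(to_go_mwu_line(line))
    return out


example = [
    "Gene Similar search annotation GO:MF GO:CC GO:BP",
    "SE_04086 XP_006845785.1 probable_inactive_purple_acid_phosphatase_28_isoform_X4 "
    "GO:0008152-metabolic_process(L=1),GO:0009987-cellular_process(L=1), "
    "GO:0005576-extracellular_region(L=1), "
    "GO:0003824-catalytic_activity(L=1),GO:0005488-binding(L=1),",
]
converted = convert(example)
print("\n".join(converted))
```

For the first row of the posted table this prints `SE_04086`, a tab, and then `GO:0008152;GO:0009987;GO:0005576;GO:0003824;GO:0005488`.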
