tanghaibao / goatools

Python library to handle Gene Ontology (GO) terms
BSD 2-Clause "Simplified" License

is it possible to plot a list of GO terms? #102

Closed: ghost closed this issue 6 years ago

ghost commented 6 years ago

Hi,

I want to visualize the structural (spatial) relationships among a list of GO terms in the GO hierarchy (for example, MF). Is it possible to use your code for that purpose, e.g. python plot_go_term.py --term_list_file term_list.txt?

Thanks!

dvklopfenstein commented 6 years ago

Thank you for your interest in GOATOOLS and for taking the time to write to us about this important feature.

Yes! We will soon add the ability to read a list of GO IDs from a file to a GO plotting script, which will take advantage of new, (hopefully) soon-to-be-published functionality in GOATOOLS.

For example, to visually confirm that we correctly set up a test to read up and down the GO hierarchy using both the standard is_a attribute (black arrows) and the optional relationship attributes (dashed magenta arrows), we make a plot using this text file containing a list of GO IDs and colors:

#-------------------------------------------
# Source for tests up GO hierarchy
GO:0007389#fffdda # Yellow

# Ancestors up hierarchy through is_a attribute
GO:0008150#e0ffdc # Green
GO:0032501#e0ffdc # Green

# Ancestors up hierarchy through relationship attributes
GO:0007275#deddff # Purple 
GO:0048856#deddff # Purple 
GO:0032502#deddff # Purple

#-------------------------------------------
# Source for tests down GO hierarchy
GO:0003143#fffdda # Yellow  embryonic heart tube morphogenesis

# Descendants down hierarchy through is_a attribute
GO:0003146#e0ffdc # Green

# Descendants down hierarchy through relationship attributes
GO:0003304#deddff # Purple

And run this command:

$ go_plot -i test_gos.txt --obo=../goatools/tests/data/heartjogging.obo -r -o heartjogging_test.png

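And see this plot: heartjogging_test (https://user-images.githubusercontent.com/7278188/40271267-e639fdac-5b68-11e8-9727-4de136540787.png)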
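For readers who want to inspect such a GO ID and color file before plotting, here is a minimal sketch of a reader for the layout shown above. It is not the reader used by go_plot; the function name `parse_go_colors` is hypothetical, and the format rules (lines starting with `#` are comments, the first token is a GO ID optionally followed by `#` and a hex color) are inferred only from the example file.

```python
def parse_go_colors(lines):
    """Map each GO ID to its hex color string, or None if no color is given.

    Hypothetical helper: the format is inferred from the example file,
    not taken from the GOATOOLS source.
    """
    go2color = {}
    for raw in lines:
        line = raw.strip()
        # Skip blank lines and full-line comments such as '# Source for tests ...'
        if not line or line.startswith("#"):
            continue
        # First whitespace-separated token, e.g. 'GO:0007389#fffdda'
        token = line.split()[0]
        goid, _, color = token.partition("#")
        go2color[goid] = "#" + color if color else None
    return go2color


example = [
    "#-------------------------------------------",
    "# Source for tests up GO hierarchy",
    "GO:0007389#fffdda # Yellow",
    "",
    "GO:0003143#fffdda # Yellow  embryonic heart tube morphogenesis",
    "GO:0008150",
]
for goid, color in parse_go_colors(example).items():
    print(goid, color)
```

With a real file, the same function could be called as `parse_go_colors(open("test_gos.txt"))`, since iterating over a file object yields its lines.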

Here is the help doc for the script using new functionality:

Command-line script to create GO term diagrams

Usage:
  go_plot.py [GO ...] [options]

Options:
  -h --help                            show this help message and exit
  -i --go_file=<file.txt>              GO IDs in an ASCII file
  -o <file.png>, --outfile=<file.png>  Plot file name [default: go_plot.png]
  -r --relationship                    Plot all relationships
  -s <sections.txt> --sections=<sections.txt>  Sections file for grouping
  -S <sections module str>             Sections file for grouping

  --gaf=<file.gaf>                     Annotations from a gaf file
  --gene2go=<gene2go>                  Annotations from a gene2go file downloaded from NCBI

  --obo=<file.obo>                     Ontologies in obo file [default: go-basic.obo].

  -t <title>, --title=<title>          Title string to place in image
  -p --parentcnt                       Include parent count in each GO term
  -c --childcnt                        Include child count in each GO term
  --shorten                            Shorten the GO name on plots
  --mark_alt_id                        Add 'a' if GO ID is an alternate ID: GO:0007582a
  --draw-children                      Draw children. By default, they are not drawn.
  --go_aliases=<go_aliases.txt>        ASCII file containing letter alias

  --norel                              Don't load relationship from the GO DAG

This code will be released soon.

Thank you very much for your interest in GOATOOLS.
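If the plots need to be generated from a Python pipeline rather than by hand, one option is to assemble the command line shown above programmatically. The sketch below is an assumption-laden illustration: `build_go_plot_cmd` is a hypothetical helper, it uses only the `-i`, `--obo`, `-r` and `-o` options listed in the help text, and the executable name (`go_plot` here, `go_plot.py` in the usage line) depends on how the script is installed.

```python
import shlex


def build_go_plot_cmd(go_file, obo, outfile, relationship=True, script="go_plot"):
    """Build an argument list for the plotting script (hypothetical helper).

    Only options that appear in the help text above are used.
    """
    cmd = [script, "-i", go_file, "--obo=" + obo]
    if relationship:
        # -r: also plot the optional relationship attributes
        cmd.append("-r")
    cmd += ["-o", outfile]
    return cmd


cmd = build_go_plot_cmd(
    "test_gos.txt",
    "../goatools/tests/data/heartjogging.obo",
    "heartjogging_test.png",
)
# Show the command as it would be typed in a shell
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` would then run the script, provided it is on the `PATH`.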

dvklopfenstein commented 6 years ago

FYI, we can also compute the Resnik and Lin similarities between pairs of these GO terms:

human Information content 5.682202 GO:0015318 inorganic molecular entity transmembrane transporter activity
human Information content 4.660596 GO:0140096 catalytic activity, acting on a protein
human Information content 7.125990 GO:0140097 catalytic activity, acting on DNA
human Information content 6.479823 GO:0140098 catalytic activity, acting on RNA
human Information content 7.622427 GO:0140101 catalytic activity, acting on a tRNA
human Information content 4.787814 GO:0140110 transcription regulator activity

The results look like this:

GO #1      GO #2      Resnik Lin
---------- ---------- ------ -------
GO:0015318 GO:0140096 2.6690 -0.5120
GO:0015318 GO:0140097 2.6690 -0.4628
GO:0015318 GO:0140098 2.6690 -0.4815
GO:0015318 GO:0140101 2.6690 -0.4369
GO:0015318 GO:0140110 2.6690 -0.4828
GO:0140096 GO:0140097 3.3710 -0.6298
GO:0140096 GO:0140098 3.3710 -0.6574
GO:0140096 GO:0140101 3.3710 -0.5921
GO:0140096 GO:0140110 2.6690 -0.5220
GO:0140097 GO:0140098 3.3710 -0.5933
GO:0140097 GO:0140101 3.3710 -0.5395
GO:0140097 GO:0140110 2.6690 -0.4709
GO:0140098 GO:0140101 5.4578 -0.9061
GO:0140098 GO:0140110 2.6690 -0.4903
GO:0140101 GO:0140110 2.6690 -0.4442

See new test: ./tests/test_semantic_similarity_best4lex.py
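As background on the two measures, here is a small self-contained sketch of their textbook definitions: Resnik similarity is the information content (IC) of the most informative common ancestor, and Lin similarity is twice that value divided by the sum of the two terms' ICs. This is not the GOATOOLS implementation. The hierarchy, the term names and the annotation counts are made up for illustration, and the sketch does not try to reproduce the table above, whose values depend on the annotation set and GOATOOLS version used.

```python
import math

# Hypothetical toy hierarchy: term -> set of direct parents (not real GO data)
PARENTS = {
    "root": set(),
    "A": {"root"},
    "B": {"root"},
    "A1": {"A"},
    "A2": {"A"},
}

# Hypothetical annotation counts, already propagated up to ancestors,
# so the root count is the total number of annotations
COUNTS = {"root": 100, "A": 40, "B": 60, "A1": 10, "A2": 30}


def ancestors(term):
    """Return the term itself plus all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def info_content(term):
    """IC(t) = -log p(t), written as log(total / count) so the root gets 0.0."""
    return math.log(COUNTS["root"] / COUNTS[term])


def resnik(term_a, term_b):
    """Resnik similarity: IC of the most informative common ancestor."""
    common = ancestors(term_a) & ancestors(term_b)
    return max(info_content(t) for t in common)


def lin(term_a, term_b):
    """Lin similarity: 2 * Resnik / (IC(a) + IC(b)); 0.0 if the denominator is 0."""
    denom = info_content(term_a) + info_content(term_b)
    return 2.0 * resnik(term_a, term_b) / denom if denom else 0.0


for pair in [("A1", "A2"), ("A1", "B"), ("A1", "A1")]:
    print(pair, round(resnik(*pair), 4), round(lin(*pair), 4))
```

In this toy example the siblings A1 and A2 share the ancestor A, so their Resnik similarity equals the IC of A, while A1 and B share only the root and score zero on both measures.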

dvklopfenstein commented 6 years ago

I will close this issue when our publication is out and our unpublished code is released.

ghost commented 6 years ago

Okay, I am looking forward to the new functionality! Thank you!

--

Linhua (Alex) Wang

Bioinformatician

Department of Genomic Science/Genetics

Icahn School of Medicine at Mount Sinai, New York

On Sat, May 19, 2018 at 1:37 PM, DV Klopfenstein notifications@github.com wrote:

Thank you for your interest in GOATOOLS and for taking the time to write to us about this important feature.

camiloaruiz commented 6 years ago

Hi! Thanks for all of your hard work with goatools!

I just wanted to see if there's any time estimate for when the go_plot function will be released? Alternatively, is there a way to get access to that code early (even if it's in rough shape)?

Thanks!

dvklopfenstein commented 6 years ago

I added the code needed to do the new plotting. Please try it out. It should be running well, as it has been around for a while and has been used to compare other tools' GOEA results to GOATOOLS and for work in a PhD thesis. But it is newly public, so please open an issue if needed, even for enhancement requests.

The script is:

scripts/go_plot.py

This is half of our new code. The other half will be added soon.

camiloaruiz commented 6 years ago

Thank you!

dvklopfenstein commented 6 years ago

I added some example usages at: https://github.com/tanghaibao/goatools/blob/master/doc/md/README_plot_go.md

dvklopfenstein commented 6 years ago

Thank you for your interest in GOATOOLS and for taking the time to write to us.

I am going to close this issue now because the plotting script, scripts/go_plot.py, can plot multiple user-specified GO terms and their ancestors in a single plot.

Please open a new issue if needed or if you would like new functionality added to the plotting script.

dadarenedo commented 3 years ago

> I added some example usages at: https://github.com/tanghaibao/goatools/blob/master/doc/md/README_plot_go.md

Hi, is there any other link to view this? It would be really helpful to look at how to implement this new function. Thanks!