magikker / TreeHouse-Private

TreeHouse development.
GNU General Public License v3.0

Identify a general graphical system for TreeHouse visualizations #43

Open magikker opened 11 years ago

magikker commented 11 years ago

Currently we use Gnuplot and Nwutils. It'd be nice to identify something now that will be robust for future use, so as to limit future re-coding.

Maybe we continue with Gnuplot. Maybe we pipe things to python and let matplotlib do the work? Maybe we use a C++ package?

Maybe this? http://mathgl.sourceforge.net/

or this? http://root.cern.ch/drupal/

Open to ideas.
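To make the matplotlib option concrete, here is a rough sketch of what the Python end of such a pipe could look like. It assumes TreeHouse would write plain whitespace-separated `x y` pairs to stdout; that format, the function names, and the `plot.png` output path are all made up for illustration and are not anything TreeHouse does today.

```python
import sys


def parse_points(stream):
    """Read whitespace-separated 'x y' pairs, skipping blank and '#' lines."""
    xs, ys = [], []
    for line in stream:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        x, y = line.split()[:2]
        xs.append(float(x))
        ys.append(float(y))
    return xs, ys


def plot_points(xs, ys, out_path="plot.png"):
    """Scatter-plot the points and save them to out_path (hypothetical name)."""
    # Imported here so the parser stays usable where matplotlib is missing.
    import matplotlib
    matplotlib.use("Agg")  # render straight to a file; no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.scatter(xs, ys)
    fig.savefig(out_path)


def main():
    # Intended to sit at the receiving end of a pipe from TreeHouse.
    plot_points(*parse_points(sys.stdin))
```

Saved as a script that calls `main()`, this would sit on the right-hand side of a shell pipe. The catch is that the user then needs Python and matplotlib installed alongside TreeHouse.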

jaHoltz commented 11 years ago

One way or another we're continuing to use Nwutils, correct? There isn't any other program floating around that displays trees from Newick strings, is there?

magikker commented 11 years ago

I think for visualization stuff in general, I'd just like to get around having intermediate files where possible.

For Trees, I know there's a host of options but I'm not sure how many could be compiled along with TreeHouse. I think that's the nice thing about Nwutils.
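On the intermediate-files point, one possible approach is to send gnuplot a script that carries its data inline (gnuplot's `'-'` pseudo-file, terminated by a lone `e`) over stdin. The sketch below is in Python only to keep it short; the function names and the default title are hypothetical, and the same idea should carry over to a pipe opened from C++.

```python
import shutil
import subprocess


def gnuplot_script(points, title="TreeHouse data"):
    """Build a gnuplot script that carries its own data inline."""
    lines = [
        # Note: a title containing a single quote would need escaping.
        "set title '{}'".format(title),
        # '-' tells gnuplot to read the data from the script stream itself.
        "plot '-' using 1:2 with points notitle",
    ]
    lines += ["{} {}".format(x, y) for x, y in points]
    lines.append("e")  # a lone 'e' ends the inline data block
    return "\n".join(lines) + "\n"


def plot_without_temp_file(points, title="TreeHouse data"):
    """Pipe the script to gnuplot's stdin; return False if gnuplot is absent."""
    if shutil.which("gnuplot") is None:
        return False
    subprocess.run(
        ["gnuplot", "-persist"],  # -persist keeps the plot window open
        input=gnuplot_script(points, title),
        text=True,
        check=True,
    )
    return True
```

Nothing touches the disk here: the data and the plot commands travel down the same pipe, so there is no temporary file to name or clean up.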

jaHoltz commented 11 years ago

It's not exactly a graphical system but I used R for an alternative MDS, and plotted with R. I also know it can work with heatmaps, the other suggested visualization here. I'm not sure if that's along the lines of what you intended or not. I just thought I'd mention it.

magikker commented 11 years ago

R is really nasty to work with in general and has hard caps on the amount of memory it can use at one time. I've used it to check answers, but it wouldn't be something we'd want to integrate.

marclsmith commented 11 years ago

This all comes as news to me. What I've heard and read about R led me to believe just the opposite. A quick Google search on R memory limits turns up this reference page: http://stat.ethz.ch/R-manual/R-devel/library/base/html/Memory-limits.html It looks to me like running R on 64-bit Linux wouldn't lead to memory limitations for our purposes. I had suggested it to Jarrett based on its wide availability, its popularity in statistical analysis of large data sets, and the beautiful graphs it's capable of producing. I also thought that, as an established language for this purpose, it had little chance of going away, unlike other smaller, more niche packages. What has been your experience with R nastiness and memory limitations?

In reply to magikker's comment of Friday, July 5, 2013 (above).

Marc L. Smith Associate Professor Undergraduate Research Summer Institute (URSI) Director Committee on Academic Technologies (CAT) Chair

Computer Science Department Vassar College, Box 399 124 Raymond Avenue Poughkeepsie, NY 12604

e-mail: mlsmith@cs.vassar.edu web: http://www.cs.vassar.edu/people/mlsmith/top

magikker commented 11 years ago

R's main function isn't plotting or visualization, so using it for our plotting is overkill... assuming that we're not going to use its statistical functions. Also, R is a framework, so if we want to use certain functions (maybe some of the plots) we'd need the user to have R and those packages.

Getting things into and out of R can be kinda nasty, as it uses its own formatting, which can vary from package to package. Almost everything is returned as an R object and it's up to the user to figure out how to interact with it. I wouldn't class I/O as a strength.

The cap on R memory usage was 2 GB for a long time. It looks like that's been resolved on many systems. I was at a very R-heavy conference just a couple of years ago and no one was doing anything on the scale that we were, because R just couldn't handle it in terms of either speed or memory limits.

If we're just going for plotting, let's stick with Gnuplot. If we need more than plotting, we might want to consider R.

marclsmith commented 11 years ago

Okay! Thanks for the further elaboration. :-)

In reply to magikker's comment of Tuesday, July 9, 2013 (above).
