magikker / TreeHouse-Private

TreeHouse development.
GNU General Public License v3.0

Need way to pass bipartitions to and from the commandline #17

Open magikker opened 11 years ago

magikker commented 11 years ago

Currently we can search based on bipartitions, but there's no obvious way to pass a bipartition to and from the commandline. This is because a bipartition is handled as two sets of taxa; bipartitions don't have their own data structure in the grammar. Ideally the method chosen to solve this would also work for quartets/k-tets.

The first step in implementation is to come up with a mental model for how a specific bipartition should be referenced. This would then be added to the grammar and the code base.

This would enable work on functions that let the user interact directly with bipartitions, such as asking, "Which quartets do these two bipartitions have in common?" or "What distinguishing bipartitions are in these sets of trees?"
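As a rough illustration of the first question, here is a minimal Python sketch. It assumes a bipartition is simply a pair of taxon sets and that a quartet ab|cd is "in" a bipartition when a and b come from one side and c and d from the other. The names `induced_quartets` and `shared_quartets` are made up for this example and are not TreeHouse functions.

```python
from itertools import combinations

def induced_quartets(side_a, side_b):
    """Return every quartet ab|cd implied by the bipartition side_a | side_b."""
    quartets = set()
    for left in combinations(sorted(side_a), 2):
        for right in combinations(sorted(side_b), 2):
            # A quartet is stored as a frozenset of two pairs, so ab|cd == cd|ab.
            quartets.add(frozenset([frozenset(left), frozenset(right)]))
    return quartets

def shared_quartets(bip1, bip2):
    """Quartets implied by both bipartitions (each a pair of taxon sets)."""
    return induced_quartets(*bip1) & induced_quartets(*bip2)

# Hypothetical taxa, used only to exercise the functions.
bip1 = ({"A", "B", "C"}, {"D", "E", "F"})
bip2 = ({"A", "B"}, {"C", "D", "E", "F"})
common = shared_quartets(bip1, bip2)
```

Here `common` holds the quartets AB|DE, AB|DF and AB|EF, since those are the only ones both bipartitions imply.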

macember commented 11 years ago

Could we not just reference the bipartitions by index? Perhaps the print_biparttable user function could number each bipartition, and then the user could pass in an int that represents the bipartition.

magikker commented 11 years ago

From Marc Smith:

it seems the best way from the command line to get at bipartitions would be to have a get_bipartitions() command. This could be in different forms:

get_bipartitions(tid) // supply a tree_id
get_bipartitions(all) // gets all bipartitions in tree collection
get_bipartitions( [[a,b,c], [d,e,f]] ) // where [[a,b,c], [d,e,f]] is a bipartition literal
or
get_bipartitions( [[a,b,c], [d,e,f]], tid ) // where you specify a bipartition literal and tree id
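To make the four forms concrete, here is a small Python sketch of how one command could dispatch on its arguments. Everything in it is an assumption for illustration: the toy `COLLECTION`, the `ALL` sentinel, and especially the meaning of the literal forms, which the proposal leaves open. Here a literal is treated as a lookup that returns the matching bipartition if it is present.

```python
ALL = "all"  # stand-in for the bare `all` keyword in the proposal

def make_bip(side_a, side_b):
    """Canonical bipartition: an unordered pair of taxon sets."""
    return frozenset([frozenset(side_a), frozenset(side_b)])

# Hypothetical tree collection: tree id -> bipartitions found in that tree.
COLLECTION = {
    0: {make_bip("AB", "CDEF"), make_bip("ABC", "DEF")},
    1: {make_bip("AB", "CDEF"), make_bip("ABD", "CEF")},
}

def get_bipartitions(selector=ALL, tid=None):
    """Dispatch on the argument shape, mirroring the four proposed forms."""
    if isinstance(selector, int):
        # get_bipartitions(tid)
        return set(COLLECTION[selector])
    if selector == ALL:
        # get_bipartitions(all)
        return set().union(*COLLECTION.values())
    # get_bipartitions(literal) or get_bipartitions(literal, tid)
    wanted = make_bip(*selector)
    pool = COLLECTION[tid] if tid is not None else set().union(*COLLECTION.values())
    return {wanted} & pool

literal = [["A", "B", "C"], ["D", "E", "F"]]
in_any_tree = get_bipartitions(literal)
in_tree_1 = get_bipartitions(literal, 1)
```

With this toy data the literal is found in the collection as a whole but not in tree 1, so `in_tree_1` is empty.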

just thinking out loud...

magikker commented 11 years ago

We can reference them by index, and internally we do. But once on the commandline, it'd be nice to have a way for the users to keep them straight. Trees are also referenced by index, and I wouldn't want things to get confused.

Which means that we'd probably want an index, but want to annotate it such that it's known to be a bipartition. Like B1 or Bid1 instead of just 1. Currently, if someone uses a variable on the commandline it prints what the variable contains. I could see wanting to have the command line return the nicely printed version of the bipartition when someone types in Bid1. Also we'd need the support functions like Marc suggested so we've got a reasonable way to get the bipartition.
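A minimal Python sketch of the tagged-index idea follows. It assumes handles are spelled `B1` or `Bid1`, that the number indexes a bipartition table directly, and that "nicely printed" means taxa separated by spaces with `|` between the two sides. The names `HANDLE`, `format_bipartition` and `echo` are hypothetical.

```python
import re

# Accepts "B1" or "Bid1"; a bare "1" is rejected so it cannot be mistaken
# for a tree index.
HANDLE = re.compile(r"B(?:id)?(\d+)")

def format_bipartition(bip):
    """Render a (left, right) pair of taxon collections as 'A B | C D'."""
    left, right = bip
    return " ".join(sorted(left)) + " | " + " ".join(sorted(right))

def echo(token, bipartitions):
    """Return the printed form of the bipartition a handle refers to."""
    match = HANDLE.fullmatch(token)
    if match is None:
        raise ValueError(f"not a bipartition handle: {token!r}")
    return format_bipartition(bipartitions[int(match.group(1))])

# Hypothetical bipartition table, indexed from 0 for this example.
table = [
    (("A", "B", "C"), ("D", "E", "F")),
    (("A", "B"), ("C", "D", "E", "F")),
]
printed = echo("Bid1", table)
```

Typing `Bid1` (or `B1`) would then yield `A B | C D E F`, while a bare `1` raises an error instead of silently being read as a bipartition.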

That's surely one way to solve the problem, but what it doesn't do is provide a good way for us to return a quartet or other non-bipartition relationship to the command line... So if I get some quartets as a result I might want to pass them to a search function.... not sure how the bipartition indexes help with that. Ideally we'd go with a solution for working with all types of relationships. And I think that might be a little trickier.... but I also tend to overthink things.

magikker commented 11 years ago

I think that as we move into exploring quartets and k-tets we'll need a generalized way to handle different relationships.

I'm thinking that we might need to add a new "relationship" type to the language. Something like <A, B, C | D, E, F> is a k-tet.

I like using the | as an edge, but I need to wrap the whole construct in some sort of brace without making the code too confusing.... () are function calls, {} are sets, [] is list addressing. So.... <>? I hate those things.... Maybe I should free up [] as my list addressing, and use them for relationships.... I'm open to ideas on how to avoid making the language too ugly or confusing.
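For reference, here is a small Python sketch of how the angle-bracket form could be read into two taxon sets. It assumes exactly one `|` per literal and comma-separated taxon names, and `parse_relationship` is a hypothetical helper, not part of the existing grammar.

```python
def parse_relationship(text):
    """Parse '<A, B, C | D, E, F>' into a pair of frozensets of taxon names."""
    text = text.strip()
    if not (text.startswith("<") and text.endswith(">")):
        raise ValueError("relationship literals must be wrapped in < >")
    sides = text[1:-1].split("|")
    if len(sides) != 2:
        raise ValueError("expected exactly one '|' edge")
    parsed = []
    for side in sides:
        taxa = [name.strip() for name in side.split(",")]
        if any(not name for name in taxa):
            raise ValueError("empty taxon name")
        parsed.append(frozenset(taxa))
    return tuple(parsed)

ktet = parse_relationship("<A, B, C | D, E, F>")
quartet = parse_relationship("<A, B | C, D>")
```

The same reader handles a bipartition, a quartet, or any k-tet, because the only thing that varies is how many taxa sit on each side of the `|`.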

marclsmith commented 11 years ago

you could also go with double braces: [[ ]] or maybe tag your brackets with tag t to indicate what kind of relationship you're bracketing: t[ ] or even [t: ]
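To compare the three spellings side by side, here is a hedged Python sketch that recognizes each one and reduces it to a (tag, body) pair. It assumes the tag is a single letter and that the body contains no nested brackets. `PATTERNS` and `split_tagged` are illustrative names only.

```python
import re

PATTERNS = [
    re.compile(r"\[\[(?P<body>[^\[\]]*)\]\]"),                # [[ ... ]]
    re.compile(r"(?P<tag>[A-Za-z])\[(?P<body>[^\[\]]*)\]"),   # t[ ... ]
    re.compile(r"\[(?P<tag>[A-Za-z]):(?P<body>[^\[\]]*)\]"),  # [t: ... ]
]

def split_tagged(text):
    """Return (tag, body); tag is None for the untagged [[ ... ]] form."""
    for pattern in PATTERNS:
        match = pattern.fullmatch(text.strip())
        if match:
            return match.groupdict().get("tag"), match.group("body").strip()
    raise ValueError(f"unrecognized relationship syntax: {text!r}")

double = split_tagged("[[A, B | C, D]]")
prefixed = split_tagged("q[A, B | C, D]")
inline = split_tagged("[q: A, B | C, D]")
```

All three reduce to the same body, so the choice between them is mostly about readability and about whether the kind of relationship should be stated explicitly through the tag.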

