theosanderson / taxonium

A tool for exploring very large trees in the browser
http://taxonium.org
GNU General Public License v3.0

ENH: Selection-based search macros #593

Open J-Wall opened 1 month ago

J-Wall commented 1 month ago

Thank you very much Theo and contributors for this amazing tool.

I would like to suggest a feature where searches could be constructed based on the metadata values associated with the selected node (or perhaps even its children, although I expect that would be a bit more complicated).

I'm not 100% sure on what the best design would be for this, but here's my suggestion:

This would enable the following workflow:

With a tree like this:

 ┌┬sample_1
─┤└sample_2
 └┬sample_3
  └sample_4
and an associated metadata table like this:

seqid     field1  field2
sample_1  A       A
sample_2  B       A
sample_3  A       B
sample_4  B       B

The user could set a search as field1 "Matches selection" field2. Then, by selecting a sample in the tree (e.g. sample_3)

 ┌┬sample_1
─┤└sample_2
 └┮**sample_3**  <- selection
  └sample_4

all samples whose field1 value matches the field2 value of the selected node (in this case "B") would be circled:

 ┌┬sample_1
─┤└○︎sample_2  <- highlighted by search
 └┮**sample_3**  <- selection
  └○︎sample_4  <- highlighted by search

And importantly, to change the search, all the user has to do is select a different node (e.g. if they select sample_1, whose field2 value is "A"):

 ┌┮○︎**sample_1**  <- selected and highlighted by search
─┤└sample_2
 └┬○︎sample_3  <- highlighted by search
  └sample_4
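
To make the intended matching rule concrete, here is a minimal Python sketch of it using the example table above. It is only an illustration of the behaviour I have in mind, not a proposal for how Taxonium should implement it; the function name `matches_selection` and the dictionary layout are made up for this example.

```python
# Hypothetical sketch of a "Matches selection" search.
# None of these names come from the Taxonium codebase.

# The example metadata table, keyed by seqid.
metadata = {
    "sample_1": {"field1": "A", "field2": "A"},
    "sample_2": {"field1": "B", "field2": "A"},
    "sample_3": {"field1": "A", "field2": "B"},
    "sample_4": {"field1": "B", "field2": "B"},
}


def matches_selection(metadata, search_field, selection_field, selected):
    """Return the ids whose `search_field` value equals the
    `selection_field` value of the currently selected node."""
    target = metadata[selected][selection_field]
    return sorted(
        seqid for seqid, row in metadata.items() if row[search_field] == target
    )


# Search defined once as: field1 "Matches selection" field2.
# Selecting sample_3 (field2 == "B") highlights sample_2 and sample_4.
print(matches_selection(metadata, "field1", "field2", "sample_3"))

# Selecting sample_1 or sample_2 (both have field2 == "A")
# highlights sample_1 and sample_3 instead.
print(matches_selection(metadata, "field1", "field2", "sample_1"))
```

The search definition stays fixed; only the `selected` argument changes when the user clicks a different node.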

This would reduce the amount of copy/pasting in my workflow, and I think it would be a pretty useful feature generally.

theosanderson commented 1 month ago

Thank you for raising this issue. Are you able to share the specific use case that you would find this useful for? (I can already think of some, but it would be great to understand if they are the same ones you are thinking about.)

J-Wall commented 1 month ago

There are a couple of use cases.

The most obvious is the special case where you are just operating on one field (i.e. search for everything with the same value for a particular field). Colouring already goes some way to solving this (one can scan the tree for groups of colours, for example) but it has some pitfalls (e.g. if you have more values than perceptibly different colours, or you need the visualisation to be accessible to colour-blind people). In general, this could be used as a practical alternative to colouring when you are just interested in one value at a time. It's hard to list all the reasons that could be useful (the incredible thing about taxonium is its flexibility).

As a more concrete example, suppose you are looking at a (large) phylogeny (maybe you're doing a taxonomic revision or something) and you want to assess monophyly at the family level. Instead of clicking on a node, copying its family and pasting it into a search, you could have the search predefined, click on a node of interest and immediately assess whether its family is monophyletic by looking at the tree/minimap. You could take this even further and have a set of searches set up at different taxonomic levels (say you have class, order, family and genus as metadata fields).
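
As a rough illustration of the "set of searches at different taxonomic levels" idea, the sketch below runs one single-field macro per level against the same selected node. Everything in it is hypothetical: the tip ids, the placeholder taxon labels and the `run_macros` helper are invented for this example and are not part of Taxonium.

```python
# Hypothetical sketch: one predefined "matches selection" search per
# taxonomic level, all driven by a single selected node.
# The metadata values are placeholders, not real taxa.

metadata = {
    "tip_1": {"order": "O1", "family": "F1", "genus": "G1"},
    "tip_2": {"order": "O1", "family": "F1", "genus": "G2"},
    "tip_3": {"order": "O1", "family": "F2", "genus": "G3"},
    "tip_4": {"order": "O2", "family": "F3", "genus": "G4"},
}

# The predefined searches: each level is matched against itself.
LEVELS = ["order", "family", "genus"]


def run_macros(metadata, levels, selected):
    """For each level, return the tips sharing the selected tip's value."""
    chosen = metadata[selected]
    return {
        level: sorted(
            tip for tip, row in metadata.items() if row[level] == chosen[level]
        )
        for level in levels
    }


# Clicking tip_1 updates all three searches at once.
hits = run_macros(metadata, LEVELS, "tip_1")
for level in LEVELS:
    print(level, hits[level])
```

Each level's hit list is what would be circled on the tree, so whether the selected node's family (or order, or genus) forms a single clade could then be judged by eye from the tree/minimap.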

Now, the reason I suggested the generalisation of splitting this into two fields (i.e. one for the search query and one for the value from the selection) is a bit harder to come up with examples for, although it would certainly be useful for my workflow. I'd be happy to share privately, but it might be a bit long-winded for a GitHub issue.

theosanderson commented 4 weeks ago

Thank you for the reply and for capturing the issue. I can see this being useful. It's unlikely to be implemented in the immediate future due to resource constraints, but we hope to get to it in a bit.

J-Wall commented 4 weeks ago

Thanks. Would you be open to a PR if I find time to give it a crack? (Not promising anything.)