gamcil / clinker

Gene cluster comparison figure generator
MIT License

Feature request(s) - On-gene label, autosource group names, legend titles editing, support for manually adding text boxes #32

Open UriNeri opened 3 years ago

UriNeri commented 3 years ago

First off, wow! Well done @gamcil, well done. I've gone over so many tools and packages and whatnot lately trying to generate such figures, and clinker is by far the easiest and most visually appealing.
Here are some feature requests I would love to see added, sorry if any of them are already possible:

  1. On-gene labeling - placing the label within the gene arrow (instead of above it).
  2. Auto source group names - i.e. instead of "cluster_n" or "group_n" in the legend, automatically use the most frequent value of a user-chosen attribute (e.g. "product") among that group's members. Alternatively, add an option in the gene label editor (the one that pops up when clicking a gene) to assign that gene's attribute to the entire group.
  3. Legend titles editing, support for manually adding text boxes (and editing their font size via an added section)
  4. Support for a non-grey color gradient for the identity (e.g. yellow to red).
  5. Option to display gene labels only for selected genomes.
  6. Add "reset" option.
  7. Enable support for sequences that contain no CDS, and for segmented sequences on the same track.
  8. Support for circularity - i.e. setting a new coordinate as the 0 position and moving what was before it to the end of the track.
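To make request 2 concrete, here is a minimal Python sketch of picking a legend name from the most frequent attribute value in a group. It is an illustration only, not clinker's code: the function name `name_group`, the dict-based gene records and the `fallback` parameter are all assumptions made for this example.

```python
from collections import Counter


def name_group(genes, attribute="product", fallback="group"):
    """Return the most frequent value of `attribute` among `genes`.

    `genes` is a list of dicts (a hypothetical record layout). Genes that
    lack the attribute are ignored; if none carry it, `fallback` is returned.
    """
    values = [gene[attribute] for gene in genes if gene.get(attribute)]
    if not values:
        return fallback
    # most_common(1) returns [(value, count)]; ties go to the value seen first
    return Counter(values).most_common(1)[0][0]


genes = [
    {"product": "polyketide synthase"},
    {"product": "polyketide synthase"},
    {"product": "hypothetical protein"},
    {},  # a gene with no annotation at all
]
label = name_group(genes)
```

A real implementation would probably also want to skip uninformative values such as "hypothetical protein" when a more specific one is available.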

Thanks again for this great tool, and please let me know if you need someone to beta test anything :-)

xvazquezc commented 3 years ago

Related to point 4, it would be great if there was an option to change the range of the colour gradient. If you compare close relatives it is hard to distinguish differences in the identity values.

gamcil commented 3 years ago

Some good suggestions - sorry for being so late responding, been pretty busy lately with non-clinker stuff :)

As of v0.0.17 (just released), number 1 is implemented.

Other points:

3: Legend titles can be edited (click to open input box). Dynamically adding extra boxes is difficult though, would need to think about how to do that.
4: This will come eventually. @xvazquezc by range, do you mean just displaying e.g. 40-70% instead of 0-100%, but the start/stop colours are still 0/100%? Or actually setting e.g. 40% to white and 70% to black? The former is easy, the latter I'm not sure how best then to handle values outside the range.
5: I'm hoping to put in some more granular per-gene/locus/cluster settings eventually, but not a priority at the moment.
6: I did have one at one point, but refreshing the page in the browser should already do that.
7: Yeah I've had some requests for this, but it would require a pretty big rework since currently clinker only reads in genes. Need to think about this one.
8: I think I know how this could be done, but again will have to think.
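For point 8, one way to think about re-setting the origin of a circular sequence is as a modular shift of the gene coordinates. The sketch below is only an illustration of that idea, not how clinker does or would do it; `rotate_genes` and the `(name, start, end)` tuple layout (0-based, half-open) are assumptions.

```python
def rotate_genes(genes, length, origin):
    """Shift gene intervals on a circular sequence so `origin` becomes 0.

    `genes` holds (name, start, end) tuples with 0-based, half-open
    coordinates. A gene that straddles the new origin comes back with
    end > length, so the caller can decide whether to split or wrap it.
    """
    rotated = []
    for name, start, end in genes:
        new_start = (start - origin) % length
        new_end = new_start + (end - start)  # gene length is preserved
        rotated.append((name, new_start, new_end))
    # Re-sort so genes that were before the origin now sit at the track's end
    return sorted(rotated, key=lambda gene: gene[1])


genes = [("a", 0, 100), ("b", 150, 300), ("c", 350, 480)]
rotated = rotate_genes(genes, length=500, origin=200)
```

Here gene "b" spans the new origin (150-300 with the origin at 200), so it ends up last with an end coordinate past the sequence length. Deciding how to draw such a gene is the part that still needs thought.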

xvazquezc commented 3 years ago

Hi @gamcil, my idea is to adjust the colour scale to the range of interest, i.e. the second option you mention. I'd usually only be interested in moving the lower threshold - it could actually be linked to the identity threshold value... no idea how it is coded, but if you have a range of identity values and you set the colour range dynamically to be within that range - not based on the static 0-100 range - it should work. anvi'o does something like this in its interactive interface - not sure if helpful but... https://github.com/merenlab/anvio/blob/master/anvio/data/interactive/js/charts.js

The problem with the 1st approach you mention is that the colours wouldn't really change, so if most of the identities are centred around a given value, restricting the displayed range won't give any finer-grained distinction. E.g. if most values are between 90-100%, with the current approach they will pretty much all look black - even if you restrict the range to only those values.
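The difference between the two approaches can be sketched in a few lines of Python. This is an illustration under assumptions, not clinker's colour code: `grey_level` and `to_hex` are hypothetical helpers. Clamping values that fall outside the chosen range is just one possible answer to the "values outside the range" question.

```python
def grey_level(identity, lo=0.0, hi=100.0):
    """Map an identity (%) to 0.0 (white) .. 1.0 (black) over [lo, hi].

    Values below `lo` clamp to white and values above `hi` clamp to black.
    """
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    t = (identity - lo) / (hi - lo)
    return min(1.0, max(0.0, t))


def to_hex(level):
    """Convert a 0..1 grey level into a CSS hex colour string."""
    v = round(255 * (1.0 - level))
    return f"#{v:02x}{v:02x}{v:02x}"


# Static 0-100 scale: 90% and 95% identity are both very dark.
static = [grey_level(x) for x in (90, 95, 100)]

# Scale stretched over 90-100: the same values now span white to black.
stretched = [grey_level(x, lo=90, hi=100) for x in (90, 95, 100)]
```

With the static scale the three values map to levels 0.9, 0.95 and 1.0. Stretched over 90-100 they map to 0.0, 0.5 and 1.0, which is the finer distinction asked for above.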

UriNeri commented 3 years ago

Thanks for taking the time and replying in such detail!

> 7: Yeah I've had some requests for this, but it would require a pretty big rework since currently clinker only reads in genes. Need to think about this one

It would also be nice if, instead of dealing with alignments and clustering internally, clinker could accept as input a blast/diamond generated all-vs-all hit table, or even just a list of clusters and their members (something like what cd-hit output looks like). The blastn option would be able to handle non-CDS seqs, though I guess you could also split all sequence features by their alphabet, then do the clustering for AA, DNA, RNA... separately.
To be clear, I'm not complaining about clinker, rather I just think biopython's aligners are too slow =\ which kinda restricts clinker. Thanks again and keep up the great work!
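As a rough illustration of the hit-table idea, the sketch below groups sequences from a BLAST-style tabular file (query, subject and percent identity in the first three columns, as in BLAST `-outfmt 6`) using single-linkage clustering with a union-find. This is not an input clinker is known to accept; `cluster_hits` and the `min_identity` threshold are assumptions made for the example.

```python
import csv
import io


def cluster_hits(handle, min_identity=30.0):
    """Single-linkage clustering of an all-vs-all tabular hit table.

    Only the first three tab-separated columns are read: query ID,
    subject ID and percent identity. Returns a sorted list of sorted
    member lists.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)  # register unseen IDs as their own root
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank and comment lines
        query, subject, pident = row[0], row[1], float(row[2])
        root_q, root_s = find(query), find(subject)
        if pident >= min_identity and root_q != root_s:
            parent[root_s] = root_q  # merge the two groups

    groups = {}
    for member in list(parent):
        groups.setdefault(find(member), set()).add(member)
    return sorted(sorted(members) for members in groups.values())


table = io.StringIO(
    "g1\tg2\t85.0\t200\t30\t0\t1\t200\t1\t200\t1e-50\t300\n"
    "g2\tg3\t40.0\t180\t108\t0\t1\t180\t1\t180\t1e-20\t120\n"
    "g4\tg5\t20.0\t90\t72\t0\t1\t90\t1\t90\t1e-02\t30\n"
)
clusters = cluster_hits(table)
```

In this toy table g1, g2 and g3 end up in one group because each link passes the threshold, while g4 and g5 stay separate. Single linkage is the simplest choice here; a real implementation might want something stricter.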