sairum / tcsBU

a TCS network beautifier
MIT License

Re-connect nodes? #5

Open nickdwaters opened 5 years ago

nickdwaters commented 5 years ago

Greetings,

I have been exploring your program and noticed a couple of things, probably related to my ignorance. I have high hopes that your system's capabilities grow.

1) You can remove edges on either side of a node, which then floats away. I was expecting it to delete itself.
2) One cannot reconnect nodes, ergo, move connections or trim excessive nodes.
3) The taxa I am working with exhibit IBD / IBT and the number of nodes needed to connect clades is on the order of 50-100. This looks like a horrible mess of spaghetti on screen. The interface lags noticeably.
4) Support collapsing edges/nodes ala hash marks, or numeric?
5) The detangling process doesn't go to completion; it reaches a stable state where charges balance out in loops. What I did figure out is decreasing charge to ~ -350, decreasing edge length, decreasing elasticity to ~80, and increasing gravity to 1.4. Then it detangled pretty well. It worked so well it might become a singularity if I increased gravity any further.
6) Can you add gis support? :)

Warm Regards, Nick

sairum commented 5 years ago

Hi Nick, tcsBU was a side project, started when one of my MSc students needed to produce TCS networks as fast as possible, given the rate at which new (and highly dissimilar) sequences were being obtained for what we thought was a single species! At the time it did its job. I decided to open-source the code precisely because it may be useful to others. People can use it as it is, or take the code and modify it to fit their own needs. All this to say that currently I only have time to fix minor bugs or to make relevant changes (like the last one, suggested by EriFa a few days ago, which changed how haplotypes are scaled). Regarding your questions:

  1. You can remove edges on either side of a node, which then floats away. I was expecting it to delete itself.

     Removal of nodes and edges is working as intended. It is meant as a quick way to "fix" the network when you have double pathways towards a haplotype. But you are on your own, and it should be used with caution. Double or multiple pathways are produced by TCS itself, meaning that it had no way to find out which one was more parsimonious. I remember reading a discussion about this issue somewhere, but I cannot locate the references right now...

  2. One cannot reconnect nodes, ergo, move connections or trim excessive nodes.

     Reconnecting loose nodes can be implemented. However, it is not easy for me to do this at the moment, as it involves the d3.js library, which has a rather non-intuitive API (at least for me). Remember that many of these problems can be corrected later by editing the final SVG graph in a vector drawing program (such as Inkscape).

  3. The taxa I am working with exhibit IBD / IBT and the number of nodes needed to connect clades is on the order of 50-100. This looks like a horrible mess of spaghetti on screen. The interface lags noticeably.

     This may be a problem, and I would appreciate it if you could provide more details. A lagging interface may be a consequence of many things... What do you mean by "lagging"? Is it the overall UI that does not respond? Does it take a while to arrange the haplotypes in 2D space? What browser are you using? How many different haplotypes do you have? In our case, with 250 different haplotypes, the overall response was quite good. Note that the force-directed graph is computed by d3.js, a third-party library which I cannot modify or change (of course, if we spot a bug there, we should contact the developers).

  4. Support collapsing edges/nodes ala hash marks, or numeric?

     Ah, this would be a nice improvement, I agree. However, it is not trivial to implement, as it would mean changing the internal structure of the graph and storing the hidden parts elsewhere so that they can be reopened later on. It is something I will implement if I find the time. If anyone knows JavaScript and wants to code this, please do so! Everybody will appreciate the help.

  5. The detangling process doesn't go to completion; it reaches a stable state where charges balance out in loops. What I did figure out is decreasing charge to ~ -350, decreasing edge length, decreasing elasticity to ~80, and increasing gravity to 1.4. Then it detangled pretty well. It worked so well it might become a singularity if I increased gravity any further.

     I suppose that this is very case-specific. As far as I remember, there is no one-size-fits-all algorithm for network layout optimization. I opted for the force-directed graphs implemented in d3.js over the other solutions offered by the same library, and explicitly exposed the "parameters" of the algorithm (gravity, charge, etc.) because I knew that in some cases people would need to tweak them for their very specific scenarios! Apparently I made the right choice. For many of the networks we produce here, the "default" parameters work quite well.

  6. Can you add gis support? :)

     Please enlighten me! What do you mean by GIS (Geographic Information Systems) support?
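Points 1 and 2 of the list can be illustrated with a minimal JavaScript sketch. It is not taken from the tcsBU source: the data shape (nodes as `{id, intermediate}`, edges as `{source, target}` holding node ids) and the function names `findLoopEdges` and `removeEdgeAndPrune` are assumptions made for illustration only. The first function reports one edge per independent loop (a "double pathway"), and the second removes an edge and then drops any intermediate node left without edges, so that nothing floats away.

```javascript
// Hypothetical data shape, not tcsBU's internal structure:
// nodes are {id, intermediate}, edges are {source, target} holding node ids.

// Return the edges that close a loop, using a union-find over node ids.
// Which edge of a loop is reported depends on the order of `edges`.
function findLoopEdges(nodes, edges) {
  const parent = new Map(nodes.map((n) => [n.id, n.id]));
  const find = (id) => {
    while (parent.get(id) !== id) {
      parent.set(id, parent.get(parent.get(id))); // path halving
      id = parent.get(id);
    }
    return id;
  };
  const loopEdges = [];
  for (const edge of edges) {
    const a = find(edge.source);
    const b = find(edge.target);
    if (a === b) {
      loopEdges.push(edge); // both ends already connected: a second pathway
    } else {
      parent.set(a, b);
    }
  }
  return loopEdges;
}

// Remove one edge, then drop intermediate nodes that have no edges left.
// Sampled haplotypes (intermediate === false) are always kept.
function removeEdgeAndPrune(nodes, edges, edgeToRemove) {
  const keptEdges = edges.filter((e) => e !== edgeToRemove);
  const connected = new Set();
  for (const e of keptEdges) {
    connected.add(e.source);
    connected.add(e.target);
  }
  return {
    nodes: nodes.filter((n) => !n.intermediate || connected.has(n.id)),
    edges: keptEdges,
  };
}

// Two haplotypes joined by two alternative pathways (via m1 and via m2).
const demoNodes = [
  { id: "A", intermediate: false },
  { id: "B", intermediate: false },
  { id: "m1", intermediate: true },
  { id: "m2", intermediate: true },
];
const demoEdges = [
  { source: "A", target: "m1" },
  { source: "m1", target: "B" },
  { source: "A", target: "m2" },
  { source: "m2", target: "B" },
];

const loopEdges = findLoopEdges(demoNodes, demoEdges);

// Remove the second pathway edge by edge; m2 ends up isolated and is pruned.
let pruned = removeEdgeAndPrune(demoNodes, demoEdges, demoEdges[3]);
pruned = removeEdgeAndPrune(pruned.nodes, pruned.edges, demoEdges[2]);
console.log(pruned.nodes.map((n) => n.id));
```

The sketch only detects that a loop exists. Deciding which pathway to remove remains a judgment call, since TCS itself could not tell which one is more parsimonious.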

Best regards

Antonio
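As a rough illustration of the collapsing feature discussed in point 4, here is a minimal JavaScript sketch that replaces each chain of intermediate nodes with a single edge carrying a numeric step count, which could then be drawn as a label or as hash marks. It is not part of tcsBU: the data shape (nodes as `{id, intermediate}`, edges as `{source, target}` holding node ids) and the name `collapseChains` are assumptions. It keeps the original arrays untouched, so the hidden parts remain available for reopening later, and it assumes the network has no self-loops and no loop made purely of intermediate nodes.

```javascript
// Hypothetical data shape, not tcsBU's internal structure:
// nodes are {id, intermediate}, edges are {source, target} holding node ids.

// Collapse every chain of intermediate nodes that have exactly two edges
// into one edge with a `steps` count (number of original edges it replaces).
function collapseChains(nodes, edges) {
  const adjacency = new Map(nodes.map((n) => [n.id, []]));
  edges.forEach((e, index) => {
    adjacency.get(e.source).push({ neighbor: e.target, index });
    adjacency.get(e.target).push({ neighbor: e.source, index });
  });

  const byId = new Map(nodes.map((n) => [n.id, n]));
  // A node can be hidden only if it is an intermediate with exactly two edges.
  const isCollapsible = (id) =>
    byId.get(id).intermediate && adjacency.get(id).length === 2;

  const keptNodes = nodes.filter((n) => !isCollapsible(n.id));
  const usedEdges = new Set();
  const collapsedEdges = [];

  for (const start of keptNodes) {
    for (const first of adjacency.get(start.id)) {
      if (usedEdges.has(first.index)) continue; // already walked from the other end
      let current = first.neighbor;
      let via = first.index;
      let steps = 1;
      usedEdges.add(via);
      while (isCollapsible(current)) {
        // Leave through the edge we did not arrive on.
        const next = adjacency.get(current).find((link) => link.index !== via);
        via = next.index;
        current = next.neighbor;
        usedEdges.add(via);
        steps += 1;
      }
      collapsedEdges.push({ source: start.id, target: current, steps });
    }
  }
  return { nodes: keptNodes, edges: collapsedEdges };
}

// A is four steps from B (through m1, m2, m3); B is one step from C.
const chainNodes = [
  { id: "A", intermediate: false },
  { id: "m1", intermediate: true },
  { id: "m2", intermediate: true },
  { id: "m3", intermediate: true },
  { id: "B", intermediate: false },
  { id: "C", intermediate: false },
];
const chainEdges = [
  { source: "A", target: "m1" },
  { source: "m1", target: "m2" },
  { source: "m2", target: "m3" },
  { source: "m3", target: "B" },
  { source: "B", target: "C" },
];

const collapsed = collapseChains(chainNodes, chainEdges);
console.log(collapsed.edges);
```

Intermediate nodes with three or more edges are kept, because they are branch points and removing them would change the topology of the network.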