talgalili / dendextend

Extending R's Dendrogram Functionality
153 stars 29 forks source link

Unbranch keep original ordering #87

Closed samperochkin closed 5 years ago

samperochkin commented 6 years ago

For sub2 = (sub21, ..., sub24)

unbranch( dend=(sub1, sub2, sub3) , 2 ) = (sub21, ..., sub24, sub1, sub3)

would now be

unbranch( dend=(sub1, sub2, sub3) , 2 ) = (sub1, sub21, ..., sub24, sub3).

samperochkin commented 6 years ago

When we face the situation dend = (leaf1, leaf2, subdend, leaf3) and unbranch(dend,3) = (leaf1, leaf2, new.leaf1, new.leaf2, leaf3)

midpoint = mean(midpoints) returns NA and plotting the unbranched dend produces something weird. In this case, I simply defined the midpoint as the original. Again, not sure you (@talgalili) want this. Maybe I should have separated this concern from the "mixed branches" issue.

codecov-io commented 6 years ago

Codecov Report

Merging #87 into master will decrease coverage by 0.01%. The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #87      +/-   ##
==========================================
- Coverage   37.46%   37.44%   -0.02%     
==========================================
  Files          51       51              
  Lines        3614     3613       -1     
==========================================
- Hits         1354     1353       -1     
  Misses       2260     2260
Impacted Files Coverage Δ
R/unbranch.R 81.72% <100%> (-0.2%) :arrow_down:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update f331c51...0ff2c5c. Read the comment docs.

talgalili commented 5 years ago

Hi @samperochkin Sorry for taking this time to respond! Can you please provide a minimal example showing the impact of your code? (if this could be added as a test, that would even be better)

This would help me understand your change and approve it.

Thanks!

talgalili commented 5 years ago

Hi again @samperochkin , any chance you could add code demonstrating your changes and/or tests?

samperochkin commented 5 years ago

Yes of course, I provide that very soon. Can you please explain me what you mean by "if this could be added as a test, that would even be better"?

talgalili commented 5 years ago

You'll need to add examples here: https://github.com/talgalili/dendextend/blob/master/tests/testthat/test-unbranch.R Read here to learn about testthat: https://testthat.r-lib.org/

samperochkin commented 5 years ago

I've done just that. It's a shame I didn't know about testthat, great tool.

The "problem" I encountered is that the new roots (unbranched) always get placed in slot dend[[1]] and dend[[2]] rather "staying where they were". I think this explains what I mean:

hc <- hclust(dist(USArrests[10:13,]), "ave")
dend <- as.dendrogram(hc)
unbranched_dend <- unbranch(dend,1)
unbranched_dend_2 <- unbranch(unbranched_dend,3)

get_leaves_attr(dend,"label")
get_leaves_attr(unbranched_dend_2,"label") # order of leaves is not preserved

The new code leaves the order unchanged.

Note that this happens only when branch_becoming_root != 1.

talgalili commented 5 years ago

Perfect. Now I understand what you did, thanks :) Last request before merging, could you please update:

1) https://github.com/talgalili/dendextend/blob/master/NEWS (bump the version to 1.11.0, use current date, and explain your recent change. I'd consider this a bug fix)

2) https://github.com/talgalili/dendextend/blob/master/DESCRIPTION (add your details at the end as a contributor, and bump the version to 1.11.0 and update the date)

Thanks :)