YuLab-SMU / treeio

:seedling: Base Classes and Functions for Phylogenetic Tree Input and Output
https://yulab-smu.top/treedata-book/
94 stars 24 forks source link

as.phylo C stack overflow caused by deep trees #78

Open teng-gao opened 2 years ago

teng-gao commented 2 years ago

Hi,

When converting from igraph to phylo, I run into the issue of C stack overflow for very deep trees.

Reproducible example: tree_final_1.rds.zip

> tree = readRDS('tree_final_1.rds')
> treeio::as.phylo(tree)
Error: C stack usage  7971156 is too close to the limit

This is probably similar to the ape issue that is address here: https://github.com/emmanuelparadis/ape/issues/54

Would really appreciate your help!

Thanks, Teng

evanbiederstedt commented 2 years ago

Relevant code:

https://github.com/YuLab-SMU/treeio/blob/master/R/method-as-phylo.R#L1-L3

> tree = readRDS("tree_final_1")
> ape::as.phylo(tree)
Error in UseMethod("as.phylo") : 
  no applicable method for 'as.phylo' applied to an object of class "c('tbl_graph', 'igraph')"
> treeio::as.phylo(tree)
Error: C stack usage  7969664 is too close to the limit
In addition: Warning message:
The `x` argument of `as_tibble.matrix()` must have unique column names if `.name_repair` is omitted as of tibble 2.0.0.
Using compatibility `.name_repair`.
This warning is displayed once every 8 hours.
Call `lifecycle::last_lifecycle_warnings()` to see where this warning was generated.

Somewhere there is very deep recursion happening to be fixed.

teng-gao commented 2 years ago

I remember tracing the error back to this line https://github.com/YuLab-SMU/treeio/blob/master/R/method-as-phylo.R#L28-L31 Most likely occurred in treeio:::.write.tree4.

evanbiederstedt commented 2 years ago

@xiangpin

Relevant issue here: https://github.com/kharchenkolab/numbat/issues/30

At some point, there's a recursive function which recurses too deeply, creating a stack overflow.

It's possible the fix belongs in ape...I haven't explored this properly yet.

teng-gao commented 2 years ago

Just following up on this - do you think this is an ape issue or treeio issue? This fix in ape last week that seems relevant: https://github.com/emmanuelparadis/ape/issues/53

evanbiederstedt commented 2 years ago

Yes, I'm working with Teng above---we are working on this package: https://github.com/kharchenkolab/numbat

The dependencies are ape and treeio. We're trying to figure out where the error is and how to best fix it. We should really CC the relevant ape authors: @KlausVigo @emmanuelparadis

I'm happy to put in time for a PR to fix these issues, but (like Teng) I'm trying to figure out where to begin.

Everyone's insight here would be really helpful.

Thanks, Evan

AntoniaChroni commented 2 years ago

Hey,

I am getting an error when trying to plot the heatmap while running Numbat: Mem used: 5.84Gb INFO [2022-06-17 00:24:06] Using 3 CNVs to construct phylogeny INFO [2022-06-17 00:25:14] Using UPGMA tree as seed.. INFO [2022-06-17 00:25:16] Mem used: 5.84Gb INFO [2022-06-17 00:25:25] Iter 2 -2575.81224322793 9.6s INFO [2022-06-17 00:25:36] opt_move:17a->9d, cost=886 INFO [2022-06-17 00:25:46] Found 2111 normal cells.. WARN [2022-06-17 00:25:50] Plotting phylo-heatmap failed, continuing.. INFO [2022-06-17 00:25:50] All done!

BTW I managed to infer the heatmap and phylogeny in another dataset with 2 CNVs:

Mem used: 9.9Gb INFO [2022-06-02 05:09:48] Using 2 CNVs to construct phylogeny INFO [2022-06-02 06:13:03] Using UPGMA tree as seed.. INFO [2022-06-02 06:13:05] Mem used: 9.9Gb INFO [2022-06-02 06:14:27] Iter 2 -3793.82482404086 82s INFO [2022-06-02 06:15:45] opt_move:5b->8a, cost=2820 INFO [2022-06-02 06:17:05] Found 2359 normal cells.. INFO [2022-06-02 06:17:42] All done!

Please notice that the cost value is significantly lower in the first case. Is this causing the problem?'

I'd appreciate any insights on this one!

Best, Tonia

teng-gao commented 2 years ago

Hi @GuangchuangYu,

I think the above error is caused by as.phylo (as reported in my original issue). Please let me know if there would be a fix soon!

Thanks, Teng

GuangchuangYu commented 2 years ago

@xiangpin do you have some times to look into this?

xiangpin commented 2 years ago

I think the problem is caused by the internal function check_edgelist and .write.tree4. I have modified the check_edgelist, but the .write.tree4 (seems to be caused by the recursive) might need some time.

teng-gao commented 2 years ago

I think the problem is caused by the internal function check_edgelist and .write.tree4. I have modified the check_edgelist, but the .write.tree4 (seems to be caused by the recursive) might need some time.

Thanks! A similar issue may have been fixed in ape (although haven't looked too closely): https://github.com/emmanuelparadis/ape/commit/defb6d2d1300e4173b22d4ea73d3eaab6ea06fb8

teng-gao commented 2 years ago

Hi @xiangpin,

Just wanted to follow up on this. Thanks in advance!

Best, Teng