Open JikaiShao opened 1 year ago
Hi Jack,
50k+ cells is a record :) Looking at the error, I think the input size broke ape::nj. You can try setting skip_nj = TRUE to skip the neighbor-joining tree and always use UPGMA to initialize the phylogeny.
Best, Teng
Hi Teng,
Thanks for your suggestions! I have run it using skip_nj = TRUE and hope it will successfully go into the 2nd iteration :). Is there any potential difference between the default output and the output when neighbor-joining tree construction is skipped?
Thank you very much!
Yours, Jack
NJ and UPGMA are two alternative ways to initialize the maximum likelihood phylogeny search via NNI. The default tries both and uses the tree with the higher likelihood as the initial tree. The final phylogeny may differ with different starting points, but should be similar in most cases. For large datasets (>10k cells), we recommend using only UPGMA because it is faster than NJ, and NJ doesn't necessarily yield a better tree.
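For reference, a call with NJ skipped might look roughly like the fragment below. This is only a sketch: count_mat, ref_hca and df_allele are placeholder names for inputs you have already prepared, the output directory is made up, and the core counts simply mirror the values mentioned in this thread.

```r
library(numbat)

# Placeholders (assumed to exist already, not defined here):
#   count_mat : gene x cell count matrix
#   ref_hca   : reference expression profile
#   df_allele : allele count data frame
out <- run_numbat(
  count_mat, ref_hca, df_allele,
  ncores = 4,                 # core counts taken from the thread
  ncores_nni = 4,
  skip_nj = TRUE,             # skip NJ, initialize the phylogeny with UPGMA only
  out_dir = "./numbat_out"    # hypothetical output directory
)
```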
Hi!
Thanks again for developing this tool!
I have successfully run numbat on many tumor samples, including several large samples (20,000 - 30,000 cells), by reducing the number of cores (thanks for your suggestions!). However, I failed to run numbat on two extremely large samples (more than 50,000 cells each), even though I set 'ncores' and 'ncores_nni' to 4 and ran numbat on a workstation with more than 1 TB of RAM.
Here is the output:
Currently I am trying to split the input DGE into several subsets (each with 10,000 to 20,000 cells) and then re-run numbat on each subset. I was wondering if you would be willing to give me some suggestions on this.
Thanks, Jack
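The splitting step described above could be sketched in R as follows. This is a minimal illustration, not part of numbat: split_cells is a hypothetical helper, the barcodes are toy values, and the random assignment of cells to subsets is an assumption rather than a recommended strategy.

```r
# Hypothetical helper: randomly split cell barcodes into subsets
# of at most `max_size` cells each.
split_cells <- function(cells, max_size = 20000, seed = 1) {
  set.seed(seed)
  n_chunks <- ceiling(length(cells) / max_size)
  shuffled <- sample(cells)
  # Assign shuffled cells to chunks in round-robin order
  chunk_id <- rep(seq_len(n_chunks), length.out = length(shuffled))
  unname(split(shuffled, chunk_id))
}

# Toy barcodes standing in for the columns of the real count matrix
cells <- paste0("cell_", seq_len(50000))
subsets <- split_cells(cells, max_size = 20000)
lengths(subsets)
```

Each subset could then be used to select the matching columns of the count matrix (for example count_mat[, subsets[[1]]]) and the matching rows of the allele data frame before running numbat on it, assuming the allele data frame stores the barcode in a column named cell.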