morrislab / pairtree

Pairtree is a method for reconstructing cancer evolutionary history in individual patients, and analyzing intratumor genetic heterogeneity. Pairtree focuses on scaling to many more cancer samples and cancer cell subpopulations than other algorithms, and on producing concise and informative interactive characterizations of posterior uncertainty.
MIT License

Get trunk and branch clusters #53

Closed itigupta2429 closed 1 month ago

itigupta2429 commented 1 month ago

Hi Team, I would like to extract the trunk-wise and branch-wise cluster information from the .npz file. For example:

[image]

In this case:

Trunk = clusters 0 to 8
branch1 = cluster 9
branch2 = cluster 10

The tree structure in the .npz file looks like:

tree_0
Out[2]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 8])

Could you please help me with the same?

ethanumn commented 1 month ago

It seems like you're looking for something like the answer in #35.

Alternatively, Orchard (https://github.com/morrislab/orchard/) has some built-in functionality to do this type of analysis. There's a function that will split apart a tree into all of its branching events.

You could either install the package or just copy and paste the function that does this, which is located in /orchard/lib/cluster/utils.py:

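import os, sys
import numpy as np

# Make Orchard's cluster utilities importable (adjust the path to your checkout)
sys.path.append(os.path.join("/path/to/orchard", "lib", "cluster"))
from utils import dfs_find_lineages

# Parents vector taken from the .npz file
parents = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 8])
branches, _ = dfs_find_lineages(parents)

# The first entry is the trunk; the remaining entries are the branches
print("Trunk", branches[0])
for i in range(1, len(branches)):
    print(f"Branch {i} = ", branches[i])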

This will output the following:

Trunk [1, 2, 3, 4, 5, 6, 7, 8]
Branch 1 =  [10]
Branch 2 =  [9]
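
If you would rather not depend on Orchard, the same split can be sketched in plain Python. This is a minimal illustration, not Orchard's `dfs_find_lineages` implementation: the function name `split_trunk_and_branches` is made up for this example, and it assumes the parents-vector convention implied by the output above, namely that `parents[i]` is the parent of node `i + 1` and node 0 is the root.

```python
def split_trunk_and_branches(parents):
    """Split a parents vector into a trunk and a list of branches.

    Assumption: parents[i] is the parent of node i + 1, and node 0 is the root.
    The trunk is the unbranched chain below the root; every child of a
    branching node starts a new branch.
    """
    # Build a parent -> children lookup.
    children = {}
    for idx, parent in enumerate(parents):
        children.setdefault(int(parent), []).append(idx + 1)

    def follow_chain(start):
        # Walk down from `start` for as long as each node has exactly one child.
        chain = [start]
        while len(children.get(chain[-1], [])) == 1:
            chain.append(children[chain[-1]][0])
        return chain

    root_chain = follow_chain(0)
    trunk = root_chain[1:]  # drop the root node itself

    branches = []
    pending = list(children.get(root_chain[-1], []))
    while pending:
        chain = follow_chain(pending.pop(0))
        branches.append(chain)
        # If this chain ends at another branching node, queue its children.
        pending.extend(children.get(chain[-1], []))
    return trunk, branches


trunk, branches = split_trunk_and_branches([0, 1, 2, 3, 4, 5, 6, 7, 8, 8])
print("Trunk", trunk)
for i, branch in enumerate(branches, start=1):
    print(f"Branch {i} = ", branch)
```

For the tree in this issue, this gives a trunk of clusters 1 to 8 and two single-cluster branches, 9 and 10. The branches may be listed in a different order than in Orchard's output.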

itigupta2429 commented 1 month ago

@ethanumn Thanks a lot for your quick response. It worked!