Finding trunk and branch clusters from json files

Hi there --

Assuming you've run Pairtree and obtained a results .npz file, that file contains every tree structure Pairtree found, along with the cluster information. This section of the Pairtree documentation describes the format: https://github.com/morrislab/pairtree#description-of-the-resultsnpz-file-format
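
If you want to check what a given file actually holds before digging in, NumPy exposes the stored array names through `.files`. Here's a rough sketch; it writes a tiny stand-in archive so it runs on its own, and the file name and the `llh` entry are placeholders I made up for illustration, not a statement about what Pairtree writes:

```python
import os
import tempfile
import numpy as np

# Write a tiny stand-in archive so the snippet is self-contained.
# With real output, skip this and point np.load at your own results file.
tmp_dir = tempfile.mkdtemp()
toy_path = os.path.join(tmp_dir, "toy_results.npz")  # hypothetical file name
np.savez(toy_path, struct=np.array([[0, 1, 2, 2]]), llh=np.array([-1.0]))

with np.load(toy_path) as data:
    keys = sorted(data.files)           # names of the arrays stored in the archive
    n_trees = data["struct"].shape[0]   # one row per tree in this toy layout

print(keys)
print(n_trees)
```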

Here's a quick example in Python for how you could use the results in the .npz file to find all of the clusters between the root of the tree and the first branching event:

import json
import numpy as np

data = np.load("path/to/npz") # load the npz file
tree_0 = data["struct"][0] # extract the parents vector for the best tree found
clusters = json.loads(data["clusters.json"]) # extract the clustering information for later use

n = 0 # start at the root node
while np.sum(tree_0 == n) == 1: # while the current node has exactly one child
    n = np.flatnonzero(tree_0 == n)[0] + 1 # entry i holds the parent of node i + 1, so the child is that index + 1
    print("Cluster %d occurred before the first branching event" % (n-1))

It should be easy enough to translate this to R and modify it for your purposes.
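
If it helps, here's the same walk wrapped in a function and run on made-up parents vectors, so you can check the logic without a results file. The function name, the toy trees, and the toy cluster lists are all hypothetical; the only thing assumed about Pairtree is that entry i of a parents vector is the parent of node i + 1, with node 0 as the root:

```python
import numpy as np

def trunk_nodes(parents):
    """Nodes on the path from the root down to (and including) the first
    node that branches or has no children. The root itself is not listed."""
    parents = np.asarray(parents)
    trunk = []
    n = 0  # node 0 is the root
    while np.sum(parents == n) == 1:  # exactly one child: still on the trunk
        # The position of the match, plus 1, is the child's node number.
        n = int(np.flatnonzero(parents == n)[0]) + 1
        trunk.append(n)
    return trunk

# Toy tree: 0 -> 1 -> 2, then node 2 splits into nodes 3 and 4.
toy_parents = [0, 1, 2, 2]
toy_trunk = trunk_nodes(toy_parents)
print(toy_trunk)

# Toy linear tree whose node numbers are not in path order: 0 -> 2 -> 1 -> 3.
shuffled_trunk = trunk_nodes([2, 0, 1])
print(shuffled_trunk)

# Hypothetical cluster lists; node n maps to index n - 1, as in the print above.
toy_clusters = [["s0", "s1"], ["s2"], ["s3"], ["s4"]]
trunk_variants = [v for n in toy_trunk for v in toy_clusters[n - 1]]
print(trunk_variants)
```

The second toy tree is there because node numbers don't have to follow the order of the path, so the child has to be looked up by its position in the parents vector rather than by adding 1 to the current node.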
