YosefLab / Cassiopeia

A Package for Cas9-Enabled Single Cell Lineage Tracing Tree Reconstruction
https://cassiopeia-lineage.readthedocs.io/en/latest/
MIT License

Does Cassiopeia have the Function to directly get fitness score? #244

Closed tenlives closed 2 months ago

tenlives commented 2 months ago

Hi, thanks for developing this very cool tool. Does Cassiopeia have a function to get fitness scores from the tree?

colganwi commented 2 months ago

Cassiopeia does not implement single-cell fitness estimation. We recommend using infer_fitness from the jungle package

mattjones315 commented 2 months ago

Hi @tenlives,

Sorry for the confusion. We do actually have this implemented in Cassiopeia, under the tools module. You can invoke it like so:

import cassiopeia as cas

fitness_estimator = cas.tools.fitness_estimator.LBIJungle()
fitness_estimator.estimate_fitness(tree)  # stores a 'fitness' attribute, read back below with get_attribute

You can then normalize the fitnesses to the maximum like so:

import numpy as np

fitnesses = np.array([tree.get_attribute(cell, 'fitness') for cell in tree.leaves])
fitnesses /= np.max(fitnesses)

Please note that the algorithm utilizes the branch lengths stored in the tree. The approach we've used in our work is to set edge lengths without any mutations to a length of 0, and any edge with mutations to a length of 1. You can also check out the branch length estimators we've implemented.

Hope this helps, Matt

tenlives commented 2 months ago

Thanks for your kind reply. I have a tree with branch lengths calculated by other tools. With your approach ("fitness_estimator = cas.tools.fitness_estimator.LBIJungle()" followed by "fitness_estimator.estimate_fitness(tree)"), do these commands set the edge lengths to 0 or 1 themselves, so that I don't need to set them again? Also, I have many tree files; can a list of trees be passed as input to "fitness_estimator.estimate_fitness"?

mattjones315 commented 2 months ago

Hi @tenlives,

No, this function does not assume edge lengths are 0 or 1. You must set them explicitly; otherwise the procedure uses whatever branch lengths are stored in the tree.

And no, you cannot pass a list of trees to this function; it takes exactly one tree as input. If you want to process a list, here's some pseudocode to get you started:

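import cassiopeia as cas
import numpy as np

fitness_estimator = cas.tools.fitness_estimator.LBIJungle()
for _tree in trees:  # `trees` is a list of CassiopeiaTree objects
    fitness_estimator.estimate_fitness(_tree)
    fitnesses = np.array([_tree.get_attribute(cell, 'fitness') for cell in _tree.leaves])
    fitnesses /= np.max(fitnesses)
    # assumes the rows of cell_meta are in the same order as _tree.leaves
    _tree.cell_meta['fitness'] = fitnesses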

Best, Matt
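A minimal sketch of the same per-tree normalization as a reusable helper, keyed by leaf name so the values can be matched to cells regardless of row order. It relies only on `tree.leaves` and `tree.get_attribute(cell, 'fitness')` as used in this thread; the helper name `normalized_leaf_fitness` is hypothetical, and it assumes `estimate_fitness` has already been run on the tree.

```python
def normalized_leaf_fitness(tree, attribute="fitness"):
    """Return {leaf: fitness / max fitness} for one tree.

    Assumes `tree` exposes `leaves` and `get_attribute(node, name)`,
    as in the snippets above, and that fitness was already estimated.
    """
    raw = {leaf: float(tree.get_attribute(leaf, attribute)) for leaf in tree.leaves}
    if not raw:
        return {}
    top = max(raw.values())
    if top == 0:
        # Avoid dividing by zero; leave the values untouched in that case.
        return raw
    return {leaf: value / top for leaf, value in raw.items()}
```

With a list of trees, `[normalized_leaf_fitness(t) for t in trees]` would give one dictionary per tree, and each dictionary can then be aligned to `cell_meta` by cell name rather than by position.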

tenlives commented 2 months ago

Hi @mattjones315, my data is also barcode mutation data. If I want to set edge lengths to 0 or 1 following your approach, could you tell me how to do that with Cassiopeia?

mattjones315 commented 2 months ago

Hi @tenlives,

Here is a code snippet that I've used; it could be helpful:

tree.reconstruct_ancestral_characters()
tree.set_character_states(tree.root, [0] * tree.n_character)

for edge in tree.depth_first_traverse_edges():
    branch_length = len(tree.get_mutations_along_edge(edge[0], edge[1]))
    # cap at 1; remove this line to keep branch lengths equal to the number of mutations
    branch_length = min(1, branch_length)
    tree.set_branch_length(edge[0], edge[1], branch_length)

tenlives commented 2 months ago

Thanks, I have tried it. I found that I need to run "tree.set_character_states_at_leaves(character_matrix=character_matrix)" before your code snippet, and then it works!
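Putting the steps in that order, a sketch under the assumptions of this thread: the character matrix is attached to the leaves first, then ancestral states are reconstructed and each edge gets a length of 0 or 1. The function name `set_binary_branch_lengths` and its `cap` parameter are hypothetical; every method called on `tree` is taken from the snippets above, and `tree` is assumed to be a `CassiopeiaTree`.

```python
def set_binary_branch_lengths(tree, character_matrix, cap=1):
    """Set each edge length to min(cap, number of mutations on the edge).

    `tree` is assumed to be a CassiopeiaTree; the calls below are the
    ones quoted in this thread. Pass cap=None to keep raw mutation counts.
    """
    # Leaves need character states before ancestors can be reconstructed.
    tree.set_character_states_at_leaves(character_matrix=character_matrix)
    tree.reconstruct_ancestral_characters()
    # Treat the root as fully unedited (state 0 at every character).
    tree.set_character_states(tree.root, [0] * tree.n_character)

    for parent, child in tree.depth_first_traverse_edges():
        n_mutations = len(tree.get_mutations_along_edge(parent, child))
        length = n_mutations if cap is None else min(cap, n_mutations)
        tree.set_branch_length(parent, child, length)
    return tree
```

Calling `set_binary_branch_lengths(tree, character_matrix)` before `estimate_fitness(tree)` would then give the estimator the 0/1 branch lengths described earlier in the thread.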

tenlives commented 2 months ago

Does it make sense to set the branch lengths to 1 directly?

mattjones315 commented 2 months ago

I think this will introduce artifacts into your analysis, namely giving weight to edges that don't carry any mutations.
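A toy illustration of that artifact, using a hand-written tree rather than Cassiopeia: with every edge set to 1, a leaf reached only through mutation-free edges looks as far from the root as one that accumulated mutations. The tree, the mutation counts, and the helper `root_to_leaf_depths` are all invented for the example.

```python
# Toy tree as child -> (parent, number of mutations on the edge into child).
# All values are made up for illustration.
EDGES = {
    "n1": ("root", 0),
    "n2": ("root", 2),
    "a": ("n1", 0),
    "b": ("n1", 1),
    "c": ("n2", 3),
}
LEAVES = ["a", "b", "c"]


def root_to_leaf_depths(edge_length):
    """Sum edge lengths from the root to each leaf.

    `edge_length` maps a mutation count to a branch length.
    """
    depths = {}
    for leaf in LEAVES:
        node, total = leaf, 0
        while node in EDGES:
            parent, n_mutations = EDGES[node]
            total += edge_length(n_mutations)
            node = parent
        depths[leaf] = total
    return depths


binary = root_to_leaf_depths(lambda n: min(1, n))  # 0 if no mutations, else 1
all_ones = root_to_leaf_depths(lambda n: 1)        # every edge counts

print(binary)    # {'a': 0, 'b': 1, 'c': 2}
print(all_ones)  # {'a': 2, 'b': 2, 'c': 2}
```

Under the 0/1 convention the three leaves are separated by how many mutated edges lie above them, whereas with all edges at 1 they are indistinguishable by depth.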
