single-cell-genetics / XClone

Detection of allele-specific subclonal copy number alterations from single-cell transcriptomic data.
https://xclone-cnv.readthedocs.io/en/latest/
Apache License 2.0
29 stars 3 forks source link

The final copy number calls #15

Closed Mashuaiii closed 1 month ago

Mashuaiii commented 1 month ago

@Rongtingting Hi,Rongtingting

Good job and I am very interested in your research, but I have some questions. I have known how to extracte the final CNV probability matrix (cellgene) for each CNV state in #7 . Howeve the methods mentioned above are separate and only the probability of CNV state, is there a way to directly get the final CNV state matrix(cell gene or cell * bins)?

I am looking forward to hearing from you and I would appreciate it very much. Best regards!

Rongtingting commented 1 month ago

Hi Moey,

Appreciate that you are interested in XClone.

As mentioned in #7, Xlayer = "prob1_merge" in combined module final output contains probability matrix for all CNV states (cell*gene*cnv state), the cnv state dimension in order of copy loss, loh, copy neutral and copy gain (in default). If you wanna directly get the final CNV state matrix, I would recommend you use the following codes:

import xclone
import scanpy as sc

import numpy as np
def assign_cna_states(cna_probs):
    """Assigns CNA states to cells based on the highest probability in a cell*gene*cnv state array.

    Args:
        cna_probs: A NumPy array of shape (num_cells, num_genes, num_cna_states)
                   representing the probabilities of each CNA state for each cell and gene.

    Returns:
        A NumPy array of shape (num_cells, num_genes) containing the assigned CNA states 
        for each cell and gene.
    """

    num_cells, num_genes, _ = cna_probs.shape
    cna_states = np.argmax(cna_probs, axis=2)  # Find the index of the highest probability state

    return cna_states

# Example usage:
data_dir = "xxxxxx/BCH869_scRNA_trials/"
combine_final_file = data_dir + "data/combined_final.h5ad"

combine_Xdata = sc.read(combine_final_file)
combine_Xdata.layers 
use_layer = "prob1_merge"

cna_probs = combine_Xdata.layers[user_layer]
assigned_states = assign_cna_states(cna_probs)
print(assigned_states)

assigned_states [0,1,2,3] stands for copy loss, loh, copy neutral and copy gain.

And the plotting for the CNA state also use the same idea:

xclone.pl.Combine_CNV_visualization(combine_Xdata, Xlayer = "prob1_merge", 
        cell_anno_key = plot_cell_anno_key,  
        title = fig_title,
        save_file = True, 
        out_file = combine_res_base_fig)

It is similar to getting independent RDR module or BAF module-derived states.

For the RDR module, use layer posterior_mtx in RDR_adata_KNN_HMM_post.h5ad to get the (cell * genes) CNA state index (copy loss, copy neutral, and copy gain).

For the BAF module, use layer posterior_mtx in BAF_merge_Xdata_KNN_HMM_post.h5ad to get (cell bins) CNA state index ("allele-A bias (++)", "allele-A bias (+)", "allele balance", "allele-B bias (+)" and "allele-B bias (++)") or layer add_posterior_mtx in BAF_merge_Xdata_KNN_HMM_post.h5ad to get (cell bins) CNA state index ("allele-A bias", "allele balance", and "allele-B bias").

For plotting, you can refer to plotting API

Thank you for your questions and feedback. We will incorporate this analysis in the next version of the release.

Mashuaiii commented 1 month ago

Hi Rongtingting, Thank you very much for your timely reply, and your answer is very detailed, I really appreciate it!