AntonioDeFalco / SCEVAN

R package that automatically classifies the cells in scRNA data by segregating non-malignant cells of the tumor microenvironment from malignant cells. It also infers the copy number profile of malignant cells, identifies subclonal structures, and analyses the specific and shared alterations of each subpopulation.
https://www.nature.com/articles/s41467-023-36790-9
GNU General Public License v3.0

Extracting CNV matrix for downstream analysis #7

Closed (grst closed this issue 2 years ago)

grst commented 2 years ago

Hi,

I would like to extract the data used for plotting the heatmap for some downstream analysis (custom visualization and computation of a heterogeneity score).

The only full cell x feature matrix I could find in the results is the mtx_vega.txt matrix. However, when I plot it as a heatmap, the results are rather inconsistent:

image

(tumor/normal classification is the result from SCEVAN)

Is there a way to retrieve the "final" SCEVAN results somehow?

Cheers, Gregor

CC @abyssum

AntonioDeFalco commented 2 years ago

Hi, yes, the CNV matrix was not saved in the output, and the mtx_vega.txt file is an intermediate pipeline file. In the latest commit, 8d91777, I added saving of the matrix; you will find it in the output folder in the file CNAmtx.RData.

Thanks for your interest in our tool. Regards

grst commented 2 years ago

That was amazingly quick! Thanks!

grst commented 2 years ago

Hi @AntonioDeFalco,

I now ended up with something that looks like this: image

which is a lot closer to the SCEVAN heatmap. However, the SCEVAN heatmap seems to contain some "segmentation" information on top. Is that available somewhere as well?

AntonioDeFalco commented 2 years ago

Hi @grst, if you mean the gray and black colors at the top, they indicate the different chromosomes, not segmentation information. You can find the segmentation information in the files in the output folder ending with vega output.txt: one file (only tumor) with the segmentation of all tumor cells, and one file for each subclone with its respective segmentation.

grst commented 2 years ago

I was actually referring to the segments indicated with blue lines here:

image

I suspect this is the same segmentation as in the vega output.txt you were mentioning. To reproduce the heatmap, do I simply need to calculate the mean of the CNA matrix for each cell in each vega segment?

Many thanks and enjoy the holidays! Gregor

grst commented 2 years ago

Hi @AntonioDeFalco,

could you please take a look at my last question? I suspect it got lost over the holidays.

Many thanks, Gregor

AntonioDeFalco commented 2 years ago

Hi @grst, exactly, the saved matrix was missing the step of calculating the mean within each segment. I have corrected it in the latest commit, ab634a5; the matrix saved in the CNAmtx.RData file now corresponds exactly to the SCEVAN heatmap.

Sorry for the late reply. Regards
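
The averaging step discussed above can be sketched as follows. This is a minimal illustration in plain Python, not SCEVAN's actual code: the function name `segment_means`, the genes-by-cells orientation, and the half-open `(start, end)` row ranges standing in for the vega segments are all assumptions made for the example.

```python
from statistics import fmean


def segment_means(cna, segments):
    """Replace each cell's values inside a segment with that cell's segment mean.

    cna      -- list of rows (genes, in genomic order), each a list of per-cell values
    segments -- list of half-open (start, end) row ranges; a hypothetical stand-in
                for the boundaries listed in a vega output file
    """
    n_cells = len(cna[0])
    smoothed = [row[:] for row in cna]  # copy so the input is left untouched
    for start, end in segments:
        for cell in range(n_cells):
            # Mean of this cell's values over all genes in the segment.
            mean = fmean(cna[gene][cell] for gene in range(start, end))
            for gene in range(start, end):
                smoothed[gene][cell] = mean
    return smoothed


# Toy data: 4 genes x 2 cells, two segments of two genes each.
toy = [
    [0.0, 1.0],
    [2.0, 3.0],
    [4.0, 4.0],
    [8.0, 0.0],
]
result = segment_means(toy, [(0, 2), (2, 4)])
```

After this step every gene within a segment carries the same value for a given cell, which is what produces the blocky look of a segmented heatmap.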

grst commented 2 years ago

Nice! I can now reproduce the tumor/normal heatmap exactly! image

However, when subsetting it to the subclones, it still looks slightly different. It looks like different segmentation information is used for the subclones heatmap: image

AntonioDeFalco commented 2 years ago

As explained in the preprint, after clustering the CNA matrix to find the subclones, each cluster is segmented independently and analysed separately to obtain the subclone heatmap. The latest commit, de6a350, saves this heatmap in CNAmtxSubclones.RData.
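
As a rough sketch of the per-subclone step described above: cells are grouped by subclone label, and each group is averaged over its own segment boundaries. The segmentation itself is not reproduced here. The labels and per-subclone `(start, end)` ranges are hypothetical inputs, and the function name `subclone_segment_means` and the genes-by-cells layout are assumptions for illustration rather than SCEVAN's implementation.

```python
from statistics import fmean


def subclone_segment_means(cna, labels, segments_by_clone):
    """Smooth each subclone with its own segmentation.

    cna               -- genes x cells matrix as a list of rows
    labels            -- one subclone label per cell (column)
    segments_by_clone -- {label: [(start, end), ...]} half-open gene ranges
    Returns {label: genes x cells-in-clone matrix}.
    """
    out = {}
    for clone, segments in segments_by_clone.items():
        # Columns (cells) that belong to this subclone.
        cols = [j for j, lab in enumerate(labels) if lab == clone]
        sub = [[row[j] for j in cols] for row in cna]
        for start, end in segments:
            for c in range(len(cols)):
                mean = fmean(sub[g][c] for g in range(start, end))
                for g in range(start, end):
                    sub[g][c] = mean
        out[clone] = sub
    return out


# Toy data: 3 genes x 3 cells; subclone "A" has one segment, "B" has two.
cna = [
    [0.0, 1.0, 2.0],
    [2.0, 3.0, 4.0],
    [4.0, 5.0, 6.0],
]
labels = ["A", "B", "A"]
segments_by_clone = {"A": [(0, 3)], "B": [(0, 1), (1, 3)]}
per_clone = subclone_segment_means(cna, labels, segments_by_clone)
```

Because the boundaries differ between subclones, the same gene can fall into differently sized segments in each group, which would explain why a subset of the tumor/normal matrix does not match the subclone heatmap.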

AntonioDeFalco commented 2 years ago

@grst I made a mistake in the last commit, which caused an error reported by another user. I have just corrected the problem in commit de6a350.

grst commented 2 years ago

Everything works as expected now! Thanks for your excellent support!

Pentayouth commented 6 months ago

Countless software packages stop being updated shortly after publication. It's very exciting to see the author promptly improving such an important and easy-to-use tool. All best wishes!