navinlabcode / copykat


Aneuploidy in all stromal cells #18

Closed by s849 2 years ago

s849 commented 3 years ago

Hi,

Thank you for developing this great tool. I ran copyKAT on my tumor samples and got the opposite of what I was expecting. All stromal cells (fibroblasts, immune cells, melanocytes) were labeled as aneuploid, and all epithelial cells were labeled as diploid. Do you know why this may be the case? How can I solve this problem? I really want to confidently identify the putative malignant cells using this method!

s849 commented 3 years ago

As a follow-up to the previous message, I included a vector of normal cells, but even with this approach, the stromal cells keep being labeled aneuploid and the epithelial cells diploid. This is how I am inputting the vector of normal cells:

norm.cells <- as.vector(normal)

copykat.test <- copykat(rawmat=exp.rawdata, id.type="S", ngene.chr=1, win.size=15, KS.cut=0.1, sam.name="test", cell.line = "no", distance="spearman", norm.cell.names="norm.cells", n.cores=2)

Any ideas?

Thank you!

guanxn90 commented 3 years ago

@s849 When I use all cells from a sample (prepared from whole tumor), I always get this error:

[1] "low confidence in classification"
Error in apply(rawmat[which(rownames(rawmat) %in% c("PTPRC", "LYZ", "PECAM")), : dim(X) must have a positive length

When I use only epithelial cells, it runs fine. There seems to be some difference between inferCNV and copyKAT. Based on the inferCNV heatmap, I would tentatively say cluster 0 is tumor and cluster 1 is normal epithelial.

[image: inferCNV heatmap]

gaobio commented 3 years ago

Hi s849 and guanxn90, when you see the warning message "low confidence in classification", copykat has actually failed to predict tumor and normal cells automatically. Most likely, your tumor cells do not have strong aneuploidy. As a last step, it tries to find some non-epithelial cells using the three marker genes as a normal cell control (this is not part of the copykat algorithm). Your error message suggests that your data might not contain more than one of these markers.
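To illustrate why matching only one marker could produce the `dim(X) must have a positive length` error quoted in this thread, here is a minimal base R sketch. It does not use copykat's code; the toy matrix, its gene names, and the use of `mean` are assumptions made for illustration only. The marker vector is copied from the error message. Note that the string "PECAM" would not match the HGNC symbol PECAM1, so in human data labeled with HGNC symbols at most two of the three strings could match.

```r
# Toy count matrix: genes in rows, cells in columns (values are made up).
set.seed(1)
rawmat <- matrix(rpois(12, 5), nrow = 4,
                 dimnames = list(c("PTPRC", "EPCAM", "KRT8", "PECAM1"),
                                 paste0("cell", 1:3)))

# Marker vector exactly as printed in the error message.
markers <- c("PTPRC", "LYZ", "PECAM")
hits <- which(rownames(rawmat) %in% markers)
rownames(rawmat)[hits]   # only "PTPRC" matches in this toy matrix

# With a single matching row, `[` drops the matrix to a plain vector ...
sub <- rawmat[hits, ]
is.null(dim(sub))        # TRUE

# ... and apply() on a vector fails with the message seen in the thread.
err <- tryCatch(apply(sub, 2, mean),
                error = function(e) conditionMessage(e))
err

# drop = FALSE keeps the matrix shape, so the same call succeeds.
ok <- apply(rawmat[hits, , drop = FALSE], 2, mean)
ok
```

A quick check on real data would be `intersect(c("PTPRC", "LYZ", "PECAM"), rownames(exp.rawdata))`, where `exp.rawdata` is the matrix name used earlier in this thread. If it returns exactly one gene, the situation above applies.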

guanxn90 commented 3 years ago

@gaobio Thank you so much. I ran the same sample (~10k cells, whole tumor) through both inferCNV and copyKAT. This is a prostate tumor sample, so the CNV burden will be low; however, the inferCNV plot does indicate that some clusters have CNVs. I checked the expression of the three genes in the data (~10k cells, all cell types), and there are cells expressing these markers (image). However, if I use all cells as input, I get the 'low confidence in classification' error. On the other hand, if I use only the epithelial cells (out of the ~10k cells) and run copyKAT, it runs through (the plot titled "copyKat run using only epithelial cells"). Thanks again.

s849 commented 3 years ago

Ok, thank you for your input. I want to try this once again, this time supplying non-epithelial cells as the reference. Can you please let me know how to obtain the norm.cell.names vector from a Seurat object and pass it in?

gaobio commented 3 years ago

You just put the cell barcodes of the normal cells into a vector; let's call it NameNorm. CopyKAT has an option to take this input, i.e. copykat(...., norm.cell.names=NameNorm, ...). Another parameter you can try is KS.cut. If you do not see many CNAs, lowering this value may help, though it is not guaranteed. You could try lowering KS.cut to 0.05.
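A minimal sketch of building such a vector, assuming the cell-type labels live in a metadata column. The data frame below is a stand-in for a Seurat object's `meta.data` slot (whose row names are cell barcodes); the column name `cell_type`, the labels, and the barcodes are hypothetical. One thing worth checking in the call posted earlier in this thread: `norm.cell.names="norm.cells"` passes the quoted string rather than the vector itself, which may be why the reference cells had no effect.

```r
# Stand-in for seurat_obj@meta.data: row names are the cell barcodes.
# Column name, labels, and barcodes are hypothetical.
meta <- data.frame(
  cell_type = c("Epithelial", "Fibroblast", "T cell", "Epithelial"),
  row.names = c("AAAC-1", "AAAG-1", "AACT-1", "AAGA-1"),
  stringsAsFactors = FALSE
)

# Barcodes of the cells to use as the diploid reference.
normal_types <- c("Fibroblast", "T cell")
NameNorm <- rownames(meta)[meta$cell_type %in% normal_types]
NameNorm

# A quoted variable name is a length-one string, not the barcode vector.
length("NameNorm")               # 1
"NameNorm" %in% rownames(meta)   # FALSE

# Call as posted in the thread, with the vector unquoted and the lower
# KS.cut suggested above (not run here; requires the copykat package):
# copykat.test <- copykat(rawmat = exp.rawdata, id.type = "S", ngene.chr = 1,
#                         win.size = 15, KS.cut = 0.05, sam.name = "test",
#                         cell.line = "no", distance = "spearman",
#                         norm.cell.names = NameNorm, n.cores = 2)
```

The barcodes in NameNorm presumably need to match the column names of the matrix passed as `rawmat`, so a check such as `all(NameNorm %in% colnames(exp.rawdata))` before running is a cheap safeguard.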