theMILOlab / SPATA

SPATA Package for spatial gene expression analysis

CNV analysis and reference data #12

Closed nesilin closed 3 years ago

nesilin commented 3 years ago

Hi!

First of all congratulations! SPATA2 seems like a very useful tool. In fact, I'm interested in performing a CNV analysis on my 10X Visium data. Here are some questions that I have:

1) What does the ref_mtr parameter of the runCnvAnalysis function expect? It is not clear to me from the documentation. In a cancer study scenario, would ref_mtr = "data-reference/mtr.RDS" be some normal-adjacent tissue that has also been analyzed with 10x Visium? Is it then okay to split the spots from the same slide into tumor-area ones and "normal" ones and use the latter as reference?

2) Can the reference matrix be from a matching scRNA-seq experiment of the same samples?

3) To what extent can the reference from your example be used as a general reference for other projects? I'm aware that the example data of the tutorials will not be available until you publish (as explained here), but could you please provide me with the reference (mtr.RDS)?

Thanks!!!

heilandd commented 3 years ago

Thank you for your interest. mtr.RDS can also be a count matrix with genes as rownames and spots as colnames. We recommend not mixing single-cell data and spots for CNV analysis. It is better to use non-malignant samples, which can either be a control sample or be extracted from non-malignant areas of your sample (using the segmentation tool).
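As a rough illustration of the second option, the sketch below builds a reference matrix from the non-malignant spots of a single sample. It uses toy data and base R only. The names `counts`, `spot_labels`, and the label value "non_malignant" are hypothetical, and how the labels are obtained from the segmentation tool is not shown here.

```r
# Toy count matrix: genes as rownames, spots (barcodes) as colnames.
set.seed(1)
counts <- matrix(
  rpois(5 * 6, lambda = 3),
  nrow = 5,
  dimnames = list(paste0("GENE", 1:5), paste0("spot_", 1:6))
)

# Hypothetical segmentation result: one label per spot, named by barcode.
spot_labels <- c(
  spot_1 = "tumor", spot_2 = "tumor", spot_3 = "non_malignant",
  spot_4 = "non_malignant", spot_5 = "tumor", spot_6 = "non_malignant"
)

# Keep only the non-malignant spots as the reference.
ref_barcodes <- names(spot_labels)[spot_labels == "non_malignant"]
ref_mtr <- counts[, ref_barcodes, drop = FALSE]

dim(ref_mtr)       # 5 genes x 3 reference spots
colnames(ref_mtr)
```

Using `drop = FALSE` keeps the result a matrix even if only one spot is labeled non-malignant, so the rownames and colnames survive.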

To question 1: Is it then okay to subset the spots from the same slide into tumor-area ones and "normal" ones and use the latter as reference?

To question 2: Can the reference matrix be from a matching scRNA-seq experiment of the same samples?

For brain, both approaches work similarly. If you work with brain tissue, we can provide a reference dataset: https://github.com/theMILOlab/SPATA2/blob/master/data/Ref.rda (temporal lobe non-malignant sample Epi-surgery)

To question 3: To what extent can your reference from the example be used as a general reference for other projects?

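    library(magrittr)  # provides the %>% pipe

    # mode = "wb" keeps the binary file intact on Windows
    download.file("https://github.com/theMILOlab/SPATA2/blob/master/data/Ref.rda?raw=true", "Ref.rda", mode = "wb")

    # load() returns the names of the restored objects, not the objects themselves;
    # this assumes Ref.rda holds a single list with the reference data
    obj_names <- load("Ref.rda")
    mat <- get(obj_names[1])

    mat$Counts %>% dim()
    mat$annotations_file %>% head(5)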

The call to runCnvAnalysis() would look like this:

    object <- SPATA2::runCnvAnalysis(
      object,
      directory_cnv_folder = getwd(),
      ref_mtr = mat$Counts,
      ref_annotation = mat$annotations_file
    )
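Before running the analysis with a custom reference, it may be worth checking how many gene names the sample and the reference share, since both matrices are expected to have genes as rownames. The following is only a sketch with toy matrices. `gene_overlap` is a hypothetical helper and not part of SPATA2.

```r
# Hypothetical helper: fraction of sample genes that also occur in the reference.
gene_overlap <- function(sample_mtr, ref_mtr) {
  shared <- intersect(rownames(sample_mtr), rownames(ref_mtr))
  length(shared) / nrow(sample_mtr)
}

# Toy matrices with genes as rownames and spots as colnames.
sample_mtr <- matrix(0, nrow = 4, ncol = 2,
                     dimnames = list(c("A", "B", "C", "D"), c("s1", "s2")))
ref_toy <- matrix(0, nrow = 3, ncol = 2,
                  dimnames = list(c("B", "C", "E"), c("r1", "r2")))

gene_overlap(sample_mtr, ref_toy)  # 2 of the 4 sample genes are shared: 0.5
```

A low value usually points to mismatched gene identifiers (for example, symbols versus Ensembl IDs) rather than to a biological difference.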

Hope I could help.

SPATA Team

kueckelj commented 3 years ago

Hello @nesilin, I have updated SPATA2. The reference data @heilandd referred to with the code chunk

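    library(magrittr)  # provides the %>% pipe

    # mode = "wb" keeps the binary file intact on Windows
    download.file("https://github.com/theMILOlab/SPATA2/blob/master/data/Ref.rda?raw=true", "Ref.rda", mode = "wb")

    # load() returns the names of the restored objects, not the objects themselves;
    # this assumes Ref.rda holds a single list with the reference data
    obj_names <- load("Ref.rda")
    mat <- get(obj_names[1])

    mat$Counts %>% dim()
    mat$annotations_file %>% head(5)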

is now part of the SPATA2 namespace. If you install SPATA2 again you should be able to access it via SPATA2::cnv_ref. It is a list that contains the reference data the function runCnvAnalysis() uses by default. Check out its content manually. This probably helps to understand what the reference arguments need.
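One way to look at the contents is sketched below. `describe_list` is a hypothetical helper written for this example, and `toy_ref` is a made-up stand-in so the code runs without SPATA2. The element names inside the real `SPATA2::cnv_ref` are not assumed here.

```r
# Hypothetical helper: one row per list element with its class and size.
describe_list <- function(x) {
  data.frame(
    element = names(x),
    class = vapply(x, function(el) class(el)[1], character(1), USE.NAMES = FALSE),
    dims = vapply(
      x,
      function(el) {
        if (is.null(dim(el))) as.character(length(el)) else paste(dim(el), collapse = " x ")
      },
      character(1),
      USE.NAMES = FALSE
    ),
    row.names = NULL,
    stringsAsFactors = FALSE
  )
}

# Toy stand-in for a reference list (structure invented for illustration).
toy_ref <- list(
  mtr = matrix(0, nrow = 3, ncol = 2),
  annotation = data.frame(sample = c("a", "b"))
)
describe_list(toy_ref)

# With an updated SPATA2 installed, apply the same helper to the real object.
if (requireNamespace("SPATA2", quietly = TRUE)) {
  print(describe_list(SPATA2::cnv_ref))
}
```

Base R's `str(SPATA2::cnv_ref, max.level = 1)` gives a similar overview without any helper.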

Additionally, I have updated the tutorial on CNV analysis as well as the documentation of the function runCnvAnalysis(). I hope that helps. The CNV analysis part of SPATA2 is still in development, so if you encounter any errors, please let us know.

Note: The webpage used to point mistakenly to this forum (the forum of SPATA) for opening SPATA2 issues. The link is now updated. Please open new issues here.

Best regards

Jan (SPATA Team)

hzongyao commented 3 years ago

Hi, I plan to use the reference data that ships with the package as the reference for inferCNV. How should I cite this data? Thank you!

heilandd commented 3 years ago

You can cite the reference dataset as part of our latest preprint: https://www.biorxiv.org/content/10.1101/2021.02.16.431475v1