I have data from two conditions, coming from different experiments, that I have processed with SCT (separately for each library) and integrated with harmony. I now have good mixing and clustering by cell type. Next, I want to find out whether either of the two conditions has more CNVs in a cell-type-specific manner. As the reference I decided to use the average of all cells, since CNVs have to be estimated against the same reference for the two conditions to be comparable.
So my questions are:
1) Since harmony does not alter the raw count matrix, I decided to use the SCT raw matrix (which corrects for sequencing depth) as input to inferCNV, assuming that this might alleviate some of the batch effect. What is your opinion on that?
2) If I instead run inferCNV on each run separately, does it make sense to construct a reference from a random set of cells from the integrated dataset and use it across all the inferCNV runs?
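To make question 2 concrete, here is a minimal, language-agnostic sketch (written in Python, standard library only) of how such a shared reference set could be drawn once and then reused. Everything in it is an illustrative assumption: the function name sample_shared_reference, the toy barcodes, and in particular the choice to sample the same number of cells from each (run, cell type) group rather than purely at random, which is only one way to keep a single run or cell type from dominating the reference. It does not call inferCNV; the returned barcodes would simply be the cells labelled as the reference group in every run.

```python
import random
from collections import defaultdict


def sample_shared_reference(cell_meta, n_per_group, seed=0):
    """Pick a reproducible set of reference barcodes.

    cell_meta: dict mapping barcode -> (run, cell_type); hypothetical layout.
    n_per_group: cells to draw from each (run, cell_type) group (assumption:
                 stratified sampling instead of one global random draw).
    """
    groups = defaultdict(list)
    for barcode, (run, cell_type) in cell_meta.items():
        groups[(run, cell_type)].append(barcode)

    rng = random.Random(seed)  # fixed seed so every run sees the same reference
    reference = []
    for key in sorted(groups):  # sorted for a deterministic iteration order
        barcodes = sorted(groups[key])
        k = min(n_per_group, len(barcodes))  # small groups contribute all cells
        reference.extend(rng.sample(barcodes, k))
    return sorted(reference)


# Toy metadata: two runs, ten cells each, alternating between two cell types.
cell_meta = {
    f"run{r}_cell{i}": (f"run{r}", "T" if i % 2 else "B")
    for r in (1, 2)
    for i in range(10)
}
reference = sample_shared_reference(cell_meta, n_per_group=3, seed=42)
```

Because the seed is fixed and the groups are sorted before sampling, calling the function again with the same metadata returns the same barcodes, which is the property needed if the reference has to be identical across separate runs.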
Thanks for the great tool.