I have a couple of questions about the statistics used in the method.
Is there a motivation to use the log-linear model for the mean parameter estimation?
You have reported before (here) that library size confounds the biology in spatial data. It was also reported here, and discussed in this paper, that normalization may actually reverse the correlation between certain genes. I have read the methods section repeatedly, but it's still unclear to me how you account for the region-specific effect, and why removing it altogether gives you the biology-specific effect, given that it was mentioned before that the region-specific effect might itself be due to the biology. Excuse me for the long question.
We use the negative binomial (NB) distribution to model gene counts within a GLM framework. The log is the most natural link function for the mean parameter and has been used previously by others, e.g. the edgeR package for modelling RNA-seq data.
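To make the log-link choice concrete, here is a minimal sketch of an NB GLM with a log link, fitted by iteratively reweighted least squares on simulated counts. It is not SpaNorm's or edgeR's code: the function name `fit_nb_log_link`, the fixed dispersion `alpha`, and the simulated covariate are all assumptions made for illustration. The log link keeps the fitted mean positive and makes covariate effects multiplicative on the count scale.

```python
import numpy as np

def fit_nb_log_link(X, y, alpha=0.1, n_iter=50):
    """Fit log(mu) = X @ beta for NB counts with Var = mu + alpha * mu**2.

    Hypothetical helper: dispersion alpha is treated as known, and the
    first column of X is assumed to be an intercept.
    """
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean() + 1e-8)        # start at the overall mean
    for _ in range(n_iter):
        eta = X @ beta                       # linear predictor
        mu = np.exp(eta)                     # inverse of the log link
        w = mu / (1.0 + alpha * mu)          # IRLS weights: (dmu/deta)^2 / Var
        z = eta + (y - mu) / mu              # working response
        XtW = X.T * w
        beta = np.linalg.solve(XtW @ X, XtW @ z)
    return beta

# Simulated data (assumed values, for illustration only)
rng = np.random.default_rng(0)
n, alpha = 2000, 0.1
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
true_beta = np.array([1.0, 0.5])
mu = np.exp(X @ true_beta)
# NumPy's NB parameterisation: mean = n_succ * (1 - p) / p = mu
y = rng.negative_binomial(1.0 / alpha, 1.0 / (1.0 + alpha * mu))

beta_hat = fit_nb_log_link(X, y, alpha=alpha)
print(beta_hat)
```

In practice the dispersion would be estimated as well; it is fixed here only to keep the sketch short.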
In SpaNorm, we have two region-specific effects, represented by two splines. The first spline, which is not related to log library size (LS), captures the biology, while the second spline captures the region-specific effect due to LS. When we adjust the data, we remove the second type of region-specific effect (due to LS) but keep the first type (due to biology).
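The two-component idea can be sketched on a toy example. This is only an illustration under strong assumptions and not SpaNorm's implementation: a small polynomial basis (`spatial_basis`, hypothetical) stands in for the splines, the log-means are noiseless rather than NB counts, and all coefficient values are made up. The log-mean is written as a spatial term plus log LS times a second spatial term, and the adjustment subtracts only the second.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
coords = rng.uniform(0.0, 1.0, size=(n, 2))   # spot coordinates (simulated)
log_ls = rng.normal(0.0, 0.5, size=n)         # centred log library size (simulated)

def spatial_basis(xy):
    """Hypothetical stand-in for a spline basis over space."""
    x, y = xy[:, 0], xy[:, 1]
    return np.column_stack([np.ones_like(x), x, y, x * y])

B = spatial_basis(coords)
beta_bio = np.array([1.0, 0.8, -0.5, 0.3])    # spatial effect not tied to LS
beta_ls = np.array([1.0, 0.2, -0.1, 0.0])     # spatially varying LS effect

# log(mu) = biology surface + log(LS) * LS-effect surface
log_mu = B @ beta_bio + log_ls * (B @ beta_ls)

# Fit both surfaces jointly using the design [B, log_ls * B]
design = np.hstack([B, log_ls[:, None] * B])
coef, *_ = np.linalg.lstsq(design, log_mu, rcond=None)
bio_hat, ls_hat = coef[:4], coef[4:]

# Adjustment: remove only the LS-related term, keep the biology surface
log_mu_adjusted = log_mu - log_ls * (B @ ls_hat)
```

In this toy setting `log_mu_adjusted` equals the biology surface `B @ beta_bio`, because the two terms are separately identifiable through their different dependence on log LS.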