What kind of feature would you like to request?
Additional function parameters / changed functionality / changed defaults?
Please describe your wishes
Hi,
Is there an equivalent function to multiBatchNorm in Python, or another method that can perform per-batch normalization?
My goal is to compute a pseudobulk per individual. Each individual has replicates that are processed across different libraries, which makes the two obvious approaches problematic:
a- Simply summing the raw counts across replicates would likely introduce bias due to library-specific batch effects.
b- Taking the mean of normalized counts across replicates (scranPY normalized counts) doesn’t account for differences in size factors across the libraries, making normalization inconsistent between batches.
Important note: replicates are distributed across different libraries.
Individual x might have replicate 1 in library 1 and replicate 2 in library 3, while individual y might have replicate 1 in library 1 but replicate 2 in library 4. That is why summing raw or normalized counts directly seems inaccurate.
I’d greatly appreciate any advice.
In R, I’ve previously used multiBatchNorm from the scran package, which normalizes and rescales the size factors within each batch to handle such batch effects. However, given the size of my current dataset, using R is not feasible.
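
To make the request concrete, here is a rough NumPy sketch of the kind of per-batch size factor rescaling being asked for. It is only loosely modeled on the general idea behind multiBatchNorm (centre size factors within each batch, then scale all batches down to the lowest-coverage one) and is not a port of its actual implementation. The function name `rescale_size_factors`, its parameters, and the simple `min_mean` gene filter are hypothetical choices made for illustration.

```python
import numpy as np


def rescale_size_factors(counts, size_factors, batches, min_mean=1.0):
    """Hypothetical helper: adjust per-cell size factors across batches.

    counts       : (cells x genes) array of raw counts
    size_factors : per-cell size factors computed beforehand
    batches      : per-cell batch (library) labels
    min_mean     : genes below this average in either batch are ignored
                   (a simplification, assumed for this sketch)
    """
    counts = np.asarray(counts, dtype=float)
    size_factors = np.asarray(size_factors, dtype=float)
    batches = np.asarray(batches)
    labels = np.unique(batches)

    centred = np.empty_like(size_factors)
    batch_means = []
    for b in labels:
        idx = batches == b
        # Centre the size factors to a mean of 1 within each batch.
        centred[idx] = size_factors[idx] / size_factors[idx].mean()
        # Per-gene average of the normalized counts in this batch.
        batch_means.append((counts[idx] / centred[idx][:, None]).mean(axis=0))
    batch_means = np.vstack(batch_means)

    # Median per-gene ratio of each batch to the first batch,
    # using only genes that are sufficiently expressed in both.
    ref = batch_means[0]
    scale = np.ones(len(labels))
    for i in range(1, len(labels)):
        keep = (ref >= min_mean) & (batch_means[i] >= min_mean)
        scale[i] = np.median(batch_means[i][keep] / ref[keep])

    # Express coverage relative to the lowest-coverage batch, so every
    # other batch is scaled down towards it.
    scale /= scale.min()

    adjusted = np.empty_like(size_factors)
    for i, b in enumerate(labels):
        idx = batches == b
        adjusted[idx] = centred[idx] * scale[i]
    return adjusted


# Toy data (made up): library B is sequenced twice as deeply as library A.
toy_counts = np.array([[5, 10, 15], [15, 30, 45], [20, 40, 60], [20, 40, 60]])
toy_sf = np.array([0.5, 1.5, 1.0, 1.0])
toy_batches = np.array(["A", "A", "B", "B"])

adjusted_sf = rescale_size_factors(toy_counts, toy_sf, toy_batches)
normalized = toy_counts / adjusted_sf[:, None]
```

With size factors adjusted this way, normalized counts from replicates in different libraries would be on a comparable scale before being aggregated per individual. Note that the sketch does not handle the case where no gene passes the `min_mean` filter.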