Open bontus opened 5 years ago
It depends on what exactly you call batch effects. Binless accounts for some of them by default. Could you explain how you see the batch effect, i.e. does it affect the diagonal decay, the biases, etc.?
The decay values are indeed different (i.e. smaller in batch 3 compared to batches 1 and 2), and I mainly noticed the differences in downstream calculations when looking at TAD borders and compartment strength. However, I realize that my question was somewhat arbitrary, as I am mostly interested in accounting for batch effects during the difference test implemented in binless. Basically, my question could be translated to: can detect_binless_differences() use pairing information (akin to a paired t-test)? read_and_prepare() does provide the replicate parameter, but I did not see any other function make use of it. Best
In general, detect_binless_differences pairs the samples, so it acts like a paired t-test, albeit more complicated because it takes into account the neighborhood of each pixel. In that sense, batch effects are already accounted for.
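To illustrate only the paired-test analogy (this is not binless's model, which also uses the neighborhood of each pixel), here is a minimal Python sketch. The numbers and the function names paired_differences and paired_t_statistic are made up for the example. It shows that a constant per-batch offset cancels out once samples are compared pair by pair.

```python
from math import sqrt
from statistics import mean, stdev


def paired_differences(control, treatment):
    """Return treatment[i] - control[i] for each matched pair."""
    return [t - c for c, t in zip(control, treatment)]


def paired_t_statistic(control, treatment):
    """t statistic computed on the per-pair differences."""
    diffs = paired_differences(control, treatment)
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


# Hypothetical signal for one pixel in three batches.
# Batch 3 carries an extra offset of 5.0 in both conditions.
control = [10.0, 10.0, 15.0]
treatment = [11.1, 10.9, 16.0]

# The batch offset is present in both members of a pair, so it
# disappears from the differences that the test is based on.
print(paired_differences(control, treatment))
print(paired_t_statistic(control, treatment))
```

The raw values of batch 3 look very different from batches 1 and 2, but the per-pair differences are all close to 1, which is why a paired comparison is insensitive to this kind of shift.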
The replicate parameter in read_and_prepare serves essentially to have a different name for each sample. If you want to model a different decay, you could adapt the condition or enzyme fields of read_and_prepare, and then play with the different.decays argument of merge_cs_norm_datasets.
Also, in difference detection, did you group your datasets before, or did you call differences in each dataset individually?
> Also, in difference detection, did you group your datasets before, or did you call differences in each dataset individually?
I grouped them after normalization and before calling detect_binless_interactions().
> The replicate parameter in read_and_prepare serves essentially to have a different name for each sample. If you want to model a different decay, you could adapt the condition or enzyme fields of read_and_prepare, and then play with the different.decays argument of merge_cs_norm_datasets
Alright, I will give that a try.
> In general, detect_binless_differences pairs the samples, so acts like a paired t-test, albeit more complicated because it takes into account the neighborhood of each pixel. In that sense, batch effects are already accounted for.
That's great to hear, but I am still wondering which information is used to pair the samples if it is not explicitly provided by the user?
> I am still wondering which information is used to pair the samples if it is not explicitly provided by the user?
For difference detection, data is grouped by square bins of size base.res, and compared two by two, taking into account neighbour information. That is done automatically, and does not require user input. A stricter pairing, in the sense of a patient before and after treatment, would not make sense anyway in this context.
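As a rough picture of the grouping step only, the following Python sketch assigns contacts to square bins and lines up two datasets bin by bin. It is not binless code: bin_contacts, compare_bins, and the example coordinates are hypothetical, and the neighbour information and statistical model are left out entirely.

```python
from collections import Counter


def bin_contacts(contacts, base_res):
    """Count contacts per square bin of side base_res (in bp).

    contacts is an iterable of (pos1, pos2) genomic coordinates.
    """
    counts = Counter()
    for pos1, pos2 in contacts:
        counts[(pos1 // base_res, pos2 // base_res)] += 1
    return counts


def compare_bins(counts_a, counts_b):
    """Line up two datasets bin by bin: {bin: (count_a, count_b)}."""
    bins = sorted(set(counts_a) | set(counts_b))
    # A Counter returns 0 for bins that one dataset never observed.
    return {b: (counts_a[b], counts_b[b]) for b in bins}


# Hypothetical contacts for two datasets, binned at 5 kb.
dataset_a = [(1000, 6000), (1200, 7000), (12000, 13000)]
dataset_b = [(300, 5500)]

paired = compare_bins(bin_contacts(dataset_a, 5000),
                      bin_contacts(dataset_b, 5000))
print(paired)
```

The point is that the pairing comes from the bin coordinates themselves, so no pairing information from the user is needed.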
Hi, I was wondering if there is a way to include batches in the binless analyses. I have 4 conditions, of which two have 3 replicates (control & treatment) and two have only 2 replicates (control + inhibitor as well as treatment + inhibitor). We are interested in detecting differences induced by treatment and dependent on the inhibitor but already noticed that one of our replicate batches clusters separately (globally the same changes are still visible though). Any advice is greatly appreciated! Best regards