Use for multiple sample differential analysis

Hi - my apologies for the delay in response.

I am not sure if I correctly understand your experimental design. In any design, you should expect to have biological and/or technical replicates in each experimental condition.

If you are interested in comparing multiple treated conditions vs. multiple control conditions in one large experiment, such as Drugs A/B/C vs. vehicles 1/2/3 or similar, then I would suggest finding the full set of peaks that are reproducibly found in at least 1 condition, and using that as your universe peak set for differential analysis. In this case, you should still have more than 1 observation (assay replicate) per experimental condition, since variance estimation, and therefore statistical testing, requires replication. As an example, if you have 3 treated conditions and 3 control conditions at n=4 per condition, then you would have 6 x 4 = 24 total assay samples.
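To make the "universe peak set" idea concrete, here is a minimal Python sketch that pools the reproducible peaks of every condition and merges overlapping intervals. It is only an illustration under assumptions: the function name `build_universe`, the condition labels, and the toy coordinates are hypothetical, and it presumes you have already called reproducible peaks per condition by whatever strategy you prefer.

```python
from typing import Dict, List, Tuple

Peak = Tuple[str, int, int]  # (chrom, start, end), half-open like BED


def build_universe(peaks_by_condition: Dict[str, List[Peak]]) -> List[Peak]:
    """Pool the peaks of all conditions and merge overlapping intervals.

    A peak only needs to be present in one condition to enter the universe.
    Book-ended intervals (end == next start) are merged as well.
    """
    pooled = sorted(p for peaks in peaks_by_condition.values() for p in peaks)
    universe: List[Peak] = []
    for chrom, start, end in pooled:
        if universe and universe[-1][0] == chrom and start <= universe[-1][2]:
            # Overlaps the previous merged interval: extend it.
            prev = universe[-1]
            universe[-1] = (chrom, prev[1], max(prev[2], end))
        else:
            universe.append((chrom, start, end))
    return universe


# Hypothetical toy input: reproducible peaks already called per condition.
peaks_by_condition = {
    "drug_A": [("chr1", 100, 200), ("chr1", 500, 600)],
    "drug_B": [("chr1", 150, 250)],
    "vehicle_1": [("chr2", 10, 50)],
}
universe = build_universe(peaks_by_condition)

# Sample count for the example design: 3 treated + 3 control conditions, n=4.
n_samples = (3 + 3) * 4
```

Reads from every one of the `n_samples` libraries would then be counted over `universe`, so that all samples share the same feature set going into the differential test.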

If you are interested in comparing multiple treated samples vs. multiple control samples, my interpretation is that this could mean one treated group vs. one control group. This is how I have described our chief mouse ATAC-seq data set in the paper: two treatment groups with n=2 biological replicates per group, so 4 samples in total.

In any case, I think it is general practice to take all samples in your core experimental design and identify a set of candidate peaks that are found in at least 1 of the experimental conditions. This could be done through the naive overlap peak set strategy outlined in the repo, or through other related strategies (e.g. intersecting replicate peak coordinates).
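As a rough illustration of the "intersecting replicate peak coordinates" alternative, the sketch below keeps only the coordinates shared by two replicates of one condition. It is not the naive overlap strategy from the repo; the function name `intersect_replicates` and the toy peaks are hypothetical, and the nested loop is kept simple rather than fast.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Peak = Tuple[str, int, int]  # (chrom, start, end), half-open like BED


def intersect_replicates(rep1: List[Peak], rep2: List[Peak]) -> List[Peak]:
    """Return the coordinates covered by a peak in both replicates."""
    by_chrom: Dict[str, List[Tuple[int, int]]] = defaultdict(list)
    for chrom, start, end in rep2:
        by_chrom[chrom].append((start, end))

    shared: List[Peak] = []
    for chrom, start, end in sorted(rep1):
        for s2, e2 in sorted(by_chrom.get(chrom, [])):
            lo, hi = max(start, s2), min(end, e2)
            if lo < hi:  # non-empty overlap
                shared.append((chrom, lo, hi))
    return shared


# Hypothetical toy peaks for two replicates of the same condition.
rep1 = [("chr1", 100, 200), ("chr1", 500, 600)]
rep2 = [("chr1", 150, 250), ("chr2", 10, 50)]
shared = intersect_replicates(rep1, rep2)
```

In practice a dedicated interval tool would likely be used for this step; the sketch is only meant to show what "reproducible within a condition" means at the coordinate level.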

Hope this helps!

reskejak / ATAC-seq
