neurogenomics / EpiCompare

Comparison, benchmarking & QC of epigenetic datasets
https://doi.org/doi:10.18129/B9.bioc.EpiCompare

Create consensus peaks #93

Closed bschilder closed 2 years ago

bschilder commented 2 years ago

Group multiple files in a peak list into consensus peak files. This will both help reduce the number of samples being plotted (especially important for single-cell data), and make peak files more comparable to "Replicated peak" files generated by ENCODE from multiple replicates. @NathanSkene

bschilder commented 2 years ago

Now implemented as compute_consensus_peaks using one of two strategies:

  1. "granges": Simple (and fast) consensus peak identification using overlap.
  2. "consensusSeekeR": More complex (slower, but more accurate) version using the additional dependency consensusSeekeR, which uses additional modelling to approximate the range of the "true" peaks. See the consensusSeekeR documentation for details. consensusSeekeR can also be sped up using parallelization.
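To make the overlap idea behind the "granges" strategy concrete, here is a minimal, language-agnostic sketch in Python. It is not the EpiCompare implementation: the function names, the `min_samples` parameter, and the half-open `(chrom, start, end)` tuples are all assumptions made for illustration. It keeps every region covered by peaks from at least `min_samples` different samples.

```python
from collections import defaultdict


def merge_intervals(intervals):
    """Merge overlapping or abutting (start, end) intervals from one sample."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(iv) for iv in merged]


def consensus_peaks(samples, min_samples=2):
    """Return regions covered by peaks from at least `min_samples` samples.

    `samples` maps a sample name to a list of (chrom, start, end) peaks
    with half-open coordinates. Hypothetical helper, illustration only.
    """
    events = defaultdict(list)
    for peaks in samples.values():
        by_chrom = defaultdict(list)
        for chrom, start, end in peaks:
            by_chrom[chrom].append((start, end))
        # Merge within each sample first so depth counts samples, not peaks.
        for chrom, intervals in by_chrom.items():
            for start, end in merge_intervals(intervals):
                events[chrom].append((start, 1))
                events[chrom].append((end, -1))

    consensus = []
    for chrom in sorted(events):
        depth = 0
        region_start = None
        # Process starts before ends at the same position so that
        # abutting regions with enough support are not split.
        for pos, delta in sorted(events[chrom], key=lambda e: (e[0], -e[1])):
            previous = depth
            depth += delta
            if previous < min_samples <= depth:
                region_start = pos
            elif depth < min_samples <= previous:
                if region_start < pos:
                    consensus.append((chrom, region_start, pos))
                region_start = None
    return consensus


example = {
    "rep1": [("chr1", 100, 200), ("chr1", 500, 600)],
    "rep2": [("chr1", 150, 250), ("chr2", 10, 50)],
    "rep3": [("chr1", 180, 220), ("chr1", 550, 650)],
}
print(consensus_peaks(example, min_samples=2))
```

With the toy input above, the chr2 peak is dropped because only one sample supports it. Raising `min_samples` makes the consensus stricter and the resulting regions narrower.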

By default, all GRanges objects in the named list are used to compute consensus peaks. However, users can also supply the groups argument, which lets them compute separate sets of consensus peaks based on element groupings (e.g. C&T, C&R, or whatever other grouping you want).
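The grouping behaviour can be pictured with a small Python sketch. This is only an illustration of the idea, assuming groups is given as one label per list element in the same order as the named list. The helper name `split_by_group` and the sample names are hypothetical and not part of EpiCompare.

```python
from collections import defaultdict


def split_by_group(named_peaks, groups=None):
    """Partition a named collection of peak sets by group label.

    `named_peaks` maps sample name -> peak list. `groups` is assumed to be
    one label per sample, in the same order. With no groups, everything
    lands in a single group. Hypothetical helper, illustration only.
    """
    if groups is None:
        return {"all": dict(named_peaks)}
    if len(groups) != len(named_peaks):
        raise ValueError("groups must have one label per peak set")
    grouped = defaultdict(dict)
    for (name, peaks), label in zip(named_peaks.items(), groups):
        grouped[label][name] = peaks
    return dict(grouped)


peak_sets = {
    "CnT_rep1": [("chr1", 100, 200)],
    "CnT_rep2": [("chr1", 150, 250)],
    "CnR_rep1": [("chr1", 120, 210)],
    "CnR_rep2": [("chr1", 130, 260)],
}
labels = ["C&T", "C&T", "C&R", "C&R"]

for label, members in split_by_group(peak_sets, labels).items():
    # Each group would then get its own consensus peak set.
    print(label, sorted(members))
```

Each group is then treated as its own input, so the result is one consensus peak set per label rather than one for the whole list.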

I think it makes the most sense to keep this as a separate step before running EpiCompare::EpiCompare. That way users can inspect the consensus peaks, and tweak the hyperparameters if needed, before proceeding.

Full documentation is available here: https://neurogenomics.github.io/EpiCompare/reference/compute_consensus_peaks.html