brentp / mosdepth

fast BAM/CRAM depth calculation for WGS, exome, or targeted sequencing
MIT License

sorting issue #177

Closed mpalatucci closed 2 months ago

mpalatucci commented 2 years ago

Hi all, I was hoping to find help here.

I am doing some ATAC-seq analysis.

I am trying to compare the peak callers Genrich (ATAC-seq specific) and MACS2 on the same dataset. From what I have read, because I'm dealing with paired-end reads, it makes the most sense to sort my final BAM files by name. In addition, Genrich requires the BAM files to be sorted by name for peak calling.

Because of the name sorting, I'm running into issues with differential peak calling with DiffBind, which requires the BAM files to be sorted by coordinate.

So I have a couple of questions I'm hoping to get some clarification on:

  1. Is it OK if my BAM files for DiffBind are sorted by coordinate but my peaks were called from name-sorted BAM files?
  2. Are there any discrepancies that this difference in sorting between the BAM and peak files may cause that I should be aware of?

Any help is greatly appreciated!

brentp commented 2 years ago

In order to use mosdepth, your data must be coordinate sorted. I don't know about peak calling; that would depend on the tool used for that application.
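
For anyone landing here with the same setup, one common approach is to keep the name-sorted BAM for the tool that needs it and make a second, coordinate-sorted and indexed copy for tools that need that order. Below is a minimal Python sketch under a few assumptions: samtools is installed, the header text comes from `samtools view -H`, and the file names (`sample.namesorted.bam`, `sample.coord.bam`) and helper names (`sort_order`, `coordinate_sort_commands`) are hypothetical. It only checks the `SO:` tag in the `@HD` header line and prints the samtools commands; it does not run them, and it says nothing about how a given peak caller treats its input.

```python
import shlex
from typing import List


def sort_order(header_text: str) -> str:
    """Return the SO value from the @HD line of a SAM/BAM header.

    Returns 'unknown' when there is no @HD line or no SO tag.
    """
    for line in header_text.splitlines():
        if line.startswith("@HD"):
            for field in line.split("\t")[1:]:
                if field.startswith("SO:"):
                    return field[3:]
    return "unknown"


def coordinate_sort_commands(in_bam: str, out_bam: str) -> List[List[str]]:
    """Build samtools commands that write a coordinate-sorted, indexed copy.

    The input file is left untouched, so a name-sorted BAM stays available.
    """
    return [
        ["samtools", "sort", "-o", out_bam, in_bam],  # default order is by coordinate
        ["samtools", "index", out_bam],               # indexing needs coordinate order
    ]


if __name__ == "__main__":
    # Hypothetical header, as `samtools view -H sample.namesorted.bam` might print it.
    header = "@HD\tVN:1.6\tSO:queryname\n@SQ\tSN:chr1\tLN:248956422\n"
    if sort_order(header) != "coordinate":
        for cmd in coordinate_sort_commands("sample.namesorted.bam", "sample.coord.bam"):
            print(shlex.join(cmd))
```

Sorting only reorders the records in the file, so the coordinate-sorted copy contains the same alignments as the name-sorted one. The coordinate-sorted file can then be passed to mosdepth as `mosdepth <prefix> sample.coord.bam`.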