raboul101 opened 6 years ago
Hello,
We prefer to use the tagAlign files, which are BED-formatted files of the reads (i.e., each line in the file is a different read). We then use bedtools to intersect the reads with the union file, using the -c option to count how many reads overlap each region. This keeps things simple for the most part.
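To make the counting step concrete: on the command line this would look something like `bedtools intersect -a union.bed -b rep1.tagAlign.gz -c` (the file names here are placeholders, not actual pipeline outputs). The short Python sketch below is only an illustration of what the -c option reports, namely one count per region of the reads that overlap it. It is not how bedtools is implemented, and the function name and toy coordinates are made up for the example.

```python
def count_overlaps(regions, reads):
    """For each region, count reads on the same chromosome that overlap it.

    Intervals are (chrom, start, end) in BED coordinates (0-based, half-open).
    Naive O(regions x reads) scan, meant for illustration only.
    """
    counts = []
    for chrom, start, end in regions:
        n = sum(
            1
            for r_chrom, r_start, r_end in reads
            if r_chrom == chrom and r_start < end and r_end > start
        )
        counts.append((chrom, start, end, n))
    return counts


# Toy data (hypothetical): two union peaks and four reads.
regions = [("chr1", 100, 200), ("chr1", 300, 400)]
reads = [
    ("chr1", 150, 186),
    ("chr1", 190, 226),
    ("chr1", 390, 426),
    ("chr2", 150, 186),
]

for row in count_overlaps(regions, reads):
    print("\t".join(str(x) for x in row))
```

Running the same count for every replicate and treatment against the same union file gives one column per sample, which can then be combined into a count matrix.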
Also, if you haven't had a chance yet, we have a pipelines Google group that may be helpful to you: https://groups.google.com/forum/#!forum/klab_genomic_pipelines_discuss
I have another question about preferred ways to get from pipeline results to diffE analysis.
You previously commented, in reference to differential peak 'expression': "You can use the union of naive overlap peaks across all conditions as your complete set of peaks. Quantify read counts in each peak in each of the replicates and treatments."
To create a union of naive overlaps, I took the naive_overlap.filt.narrowPeak.bb files from /out/peak/macs2/overlap/optimal_set. I converted these to .bed using UCSC bigBedToBed. Then I used bedops --everything to create the union.bed file. I intend to quantify reads using featureCounts, but I'm having trouble converting the merged .bed to a .gtf or .saf in order to count the reads.
Do you have a preferred way of doing this?
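For the .bed to .saf step specifically, one possible approach is a small conversion script. The sketch below assumes a tab-delimited BED file with at least three columns and writes the five SAF columns that featureCounts expects (GeneID, Chr, Start, End, Strand). SAF coordinates are 1-based and inclusive, so the BED start is shifted by one. The function name, the `peak_N` identifiers, the `+` strand placeholder, and the file names are all choices made for this example, not something the pipeline produces.

```python
def bed_to_saf(bed_lines):
    """Convert BED records (0-based, half-open) to SAF rows (1-based, inclusive)."""
    rows = ["GeneID\tChr\tStart\tEnd\tStrand"]
    peak_number = 0
    for line in bed_lines:
        line = line.rstrip("\n")
        # Skip blank lines and BED header/comment lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        peak_number += 1
        # Peaks have no strand; "+" is a placeholder for unstranded counting.
        rows.append(f"peak_{peak_number}\t{chrom}\t{start + 1}\t{end}\t+")
    return rows


def convert_file(bed_path, saf_path):
    """Read a BED file and write the corresponding SAF file."""
    with open(bed_path) as bed, open(saf_path, "w") as saf:
        saf.write("\n".join(bed_to_saf(bed)) + "\n")


# Example with two made-up peaks.
example = ["chr1\t0\t100\tpeakA", "chr2\t49\t60"]
print("\n".join(bed_to_saf(example)))
```

After `convert_file("union.bed", "union.saf")`, the result could be passed to featureCounts with something like `featureCounts -F SAF -a union.saf -o counts.txt rep1.bam rep2.bam` (BAM names are placeholders). Note that featureCounts reads BAM/SAM alignments rather than tagAlign files.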