CCBR / CCBR_tobias

Tobias implementation for ATAC seq data.
MIT License

document recommended way to merge multiple consensus peak files from ASPEN to create one file for TOBIAS #9

Closed by kelly-sovacool 7 months ago

kelly-sovacool commented 7 months ago

Are the default options from bedtools merge good enough for this use case, or are there other methods we should consider?

@kopardev

Krithika-Bhuvan commented 7 months ago

Found the answer in the TOBIAS FAQ (https://github.com/loosolab/TOBIAS/wiki/FAQ), quoted below:

What peak-file should I use as input?

You should use any .bed-file containing open chromatin regions from peak-calling, e.g. from MACS2 or similar. If you are planning to compare several conditions with each other, e.g. WT.bam with treatment.bam, you should obtain the peaks WT_peaks.bed and treatment_peaks.bed for each condition, and merge these using e.g. bedtools:

cat WT_peaks.bed treatment_peaks.bed | bedtools sort | bedtools merge > merged_peaks.bed

You should then use 'merged_peaks.bed' throughout the TOBIAS tools.
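
To see what the default options do, here is a toy sketch that mimics `bedtools merge` using only `sort` and `awk`, so it runs without bedtools installed. The coordinates and the `merge_sketch` helper are made up for illustration and are not part of TOBIAS or ASPEN. With no extra flags, `bedtools merge` joins intervals that overlap or are book-ended (one ends exactly where the next starts) and writes only the chromosome, start, and end columns, so any name or score columns in the input are dropped.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: a toy stand-in for `bedtools sort | bedtools merge`
# with default options. Reads BED lines on stdin, writes merged intervals.
merge_sketch() {
  LC_ALL=C sort -k1,1 -k2,2n | awk 'BEGIN { OFS = "\t" }
    # New chromosome, or a gap before this interval: flush the current one.
    NR > 1 && ($1 != cur_chrom || $2 + 0 > cur_end + 0) {
      print cur_chrom, cur_start, cur_end
      cur_chrom = ""
    }
    # Start a new merged interval.
    cur_chrom == "" { cur_chrom = $1; cur_start = $2; cur_end = $3; next }
    # Overlapping or book-ended: extend the current interval if needed.
    $3 + 0 > cur_end + 0 { cur_end = $3 }
    END { if (NR > 0) print cur_chrom, cur_start, cur_end }'
}

# Made-up peaks for two conditions.
wt_peaks=$(printf 'chr1\t100\t200\nchr1\t500\t600\n')
treatment_peaks=$(printf 'chr1\t150\t250\nchr1\t600\t700\nchr2\t10\t20\n')

merged=$(printf '%s\n%s\n' "$wt_peaks" "$treatment_peaks" | merge_sketch)
printf '%s\n' "$merged"
# chr1  100  250   (overlap: 100-200 and 150-250)
# chr1  500  700   (book-ended: 500-600 and 600-700)
# chr2  10   20    (untouched)
```

If book-ended or nearby peaks should be handled differently, the `-d` option of `bedtools merge` controls the maximum distance between features that get merged; the default is 0.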

Krithika-Bhuvan commented 7 months ago

Here is a simple bash script, in case anyone is interested:

#!/bin/bash

#SBATCH --partition=norm
#SBATCH --job-name=tobias_merge_bed
#SBATCH --time=48:00:00
#SBATCH --cpus-per-task=4
#SBATCH --mem=80g
#SBATCH --gres=lscratch:200

### SETTINGS TO CHANGE

# enter path to consensus bed file(s) output from CCBR ASPEN
consensus_bed_dir="/data/CCRCCDI/analysis/ccrtegs4/atac/01_aspen/output3/results/peaks/genrich/"

# output folder
out_dir="/data/CCRCCDI/analysis/ccrtegs4/atac/09_tobias/"
OUTBED="${out_dir%/}/merged_peaks.bed"

## STEPS - stop on the first error, then load the bedtools module
set -eo pipefail
module load bedtools
mkdir -p "$out_dir"
cd "$consensus_bed_dir"

echo "Found these files:"
ls *.genrich.consensus.bed

echo "Sort and merge these bed files:"
cat *.genrich.consensus.bed | bedtools sort | bedtools merge > "$OUTBED"

# Submit this script with:
# sbatch script_merge_bed_for_tobias.sh

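
As an optional follow-up to a sort-and-merge step like the one above, a quick sanity check on the merged file can catch a run that silently went wrong. The sketch below assumes a tab-separated BED file grouped by chromosome and sorted by start; `check_merged` and the toy files are hypothetical names for illustration only. It fails if any interval overlaps or touches the one before it on the same chromosome, which should never happen in the output of `bedtools merge` with default options.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: exit non-zero if consecutive intervals on the same
# chromosome overlap or are book-ended (i.e. the file is not fully merged).
check_merged() {
  awk -F'\t' '
    $1 == prev_chrom && $2 + 0 <= prev_end + 0 {
      printf "line %d overlaps or touches the previous interval\n", NR > "/dev/stderr"
      bad = 1
    }
    { prev_chrom = $1; prev_end = $3 }
    END { exit bad }' "$1"
}

# Toy demonstration with made-up coordinates.
tmp_dir=$(mktemp -d)
printf 'chr1\t100\t250\nchr1\t500\t700\nchr2\t10\t20\n' > "$tmp_dir/ok.bed"
printf 'chr1\t100\t250\nchr1\t250\t300\n' > "$tmp_dir/bad.bed"

check_merged "$tmp_dir/ok.bed" && echo "ok.bed looks merged"
check_merged "$tmp_dir/bad.bed" 2>/dev/null || echo "bad.bed still has touching intervals"
```

In the script above, the equivalent call would be `check_merged "$OUTBED"` after the merge step.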
kelly-sovacool commented 7 months ago

Perfect, thank you Krithika!