I manually compressed `ref_peak.bed` into `ref_peak.bed.gz` using `gzip`. After that, running `looper runp` successfully produced a count table (#273). However, I found that this table could be wrong!
Looking at `PEPATAC_commands.sh`, I noticed that after generating the `peaks_coverage.bed` file with `bedtools coverage`, you use `awk` to append an 8th column to the file. This column is a normalized version of the 5th column of the bed file.
I think you are trying to normalize the read counts so that peaks can be compared across samples. However, the bed file produced by `bedtools coverage` has the columns "chr", "start", "end", "read_count", "base_count", "width", "frac". In other words, the read counts are in the 4th column, but you are normalizing the 5th column, which holds base counts (the number of bases in the peak covered by at least one read).
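To illustrate the difference, here is a minimal Python sketch that parses two made-up `bedtools coverage` lines and normalizes both columns. The example values, the `total_reads` number, and the per-million scaling in `normalize` are my own assumptions for illustration only; this is not the formula used in the pipeline's `awk` command.

```python
from typing import NamedTuple


class CoverageRow(NamedTuple):
    """One line of bedtools coverage output for a 3-column peak file."""
    chrom: str
    start: int
    end: int
    read_count: int   # column 4: number of overlapping reads
    base_count: int   # column 5: number of peak bases covered by a read
    width: int        # column 6: peak length
    frac: float       # column 7: base_count / width


def parse_coverage_line(line: str) -> CoverageRow:
    """Split a tab-separated coverage line into typed fields."""
    f = line.rstrip("\n").split("\t")
    return CoverageRow(f[0], int(f[1]), int(f[2]),
                       int(f[3]), int(f[4]), int(f[5]), float(f[6]))


def normalize(value: int, total: int, scale: int = 1_000_000) -> float:
    """Hypothetical per-million scaling, used here only for illustration."""
    return value * scale / total


# Made-up example lines, not real pipeline output.
lines = [
    "chr1\t100\t600\t20\t450\t500\t0.9000000",
    "chr1\t1000\t1200\t5\t150\t200\t0.7500000",
]
rows = [parse_coverage_line(line) for line in lines]
total_reads = 1_000_000  # hypothetical library size

for row in rows:
    from_reads = normalize(row.read_count, total_reads)  # column 4
    from_bases = normalize(row.base_count, total_reads)  # column 5
    print(row.chrom, row.start, row.end, from_reads, from_bases)
```

The two normalized values differ for every peak, so a table built from column 5 reflects covered bases rather than read counts.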