Open mcsimenc opened 9 months ago
@mcsimenc The 5th column, according to the definition of narrowPeak, represents the peak score. In MACS, the score is the integer form of 10 x -log10(qvalue), where -log10(qvalue) is the value in the 9th column. In the callpeak.md file you refer to, you can find this description:
NAME_peaks.narrowPeak is BED6+4 format file which contains the peak locations together with peak summit, p-value, and q-value. You can load it to the UCSC genome browser. Definition of some specific columns are:
5th: integer score for display. It's calculated as int(-10log10pvalue) or int(-10log10qvalue) depending on whether -p (pvalue) or -q (qvalue) is used as score cutoff. Please note that currently, this value might be out of the [0-1000] range defined in UCSC ENCODE narrowPeak format. You can let the value saturated at 1000 (i.e. p/q-value = 10^-100) by using the following 1-liner awk: awk -v OFS="\t" '{$5=$5>1000?1000:$5} {print}' NAME_peaks.narrowPeak
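To make the arithmetic concrete, here is a minimal Python sketch of the relationship described above. It assumes the score is obtained by truncating 10 times the -log10 value (as in `int(-10log10qvalue)`) and mirrors the awk one-liner for capping at 1000. The function names and the sample line are made up for illustration and are not part of MACS. Under this reading, a score of 127 in column 5 would correspond to a -log10 value of at least 12.7 and below 12.8.

```python
UCSC_MAX_SCORE = 1000  # upper bound of the UCSC ENCODE narrowPeak score range


def score_from_neglog10(neglog10_value: float) -> int:
    """Truncate 10 * (-log10 p/q-value) to an integer display score.

    Assumes the input is already a -log10 value, as stored in
    column 8 (p-value) or column 9 (q-value) of a narrowPeak file.
    """
    return int(10 * neglog10_value)


def cap_score(score: int, max_score: int = UCSC_MAX_SCORE) -> int:
    """Saturate a score at max_score, like the awk one-liner."""
    return min(score, max_score)


def cap_narrowpeak_line(line: str) -> str:
    """Return a tab-separated narrowPeak line with column 5 capped at 1000."""
    fields = line.rstrip("\n").split("\t")
    fields[4] = str(cap_score(int(fields[4])))  # column 5 is index 4
    return "\t".join(fields)


if __name__ == "__main__":
    # Hypothetical values, not taken from any real MACS run.
    print(score_from_neglog10(12.75))
    sample = "chr1\t100\t500\tpeak_1\t1505\t.\t20.5\t155.2\t150.5\t148"
    print(cap_narrowpeak_line(sample))
```

Only column 5 is modified by the capping step; the p-value and q-value columns keep their original values.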
Hi, I want to understand the difference between fields 5 and 10 in the macs3 narrowPeak output. They are different, but field 10, the "relative summit position to peak start", corresponds to what is found in the summits.bed file. In IGV, field 10 is labeled as "peak" and field 5 is labeled as "score". Here is an example:
summits.bed
narrowPeak
148 = 13972 - 13824
127 = ?
Thanks so much
BED field descriptions drawn from: https://github.com/macs3-project/MACS/blob/master/docs/callpeak.md