PouletAxel / SIP

SIP: Significant Interaction Peak caller
GNU General Public License v3.0
Interpreting SIP Output #21

Closed danieljrichard closed 1 year ago

danieljrichard commented 1 year ago

Hello, I recently downloaded a processed SIP output file from a study on GEO datasets, and I was wondering how I should interpret the output. In particular, is there a metric column I should use to filter the interactions, or can I safely assume that, because the fdr parameter was set, every interaction in the output is significant? The parameters used in the initial call were:

-norm KR -g 2.0 -min 2.0 -max 2.0 -mat 2000 -d 6 -res 5000 -sat 0.01 -t 2000 -nbZero 6 -factor 1 -fdr 0.05 -del true -cpu 1 -isDroso false

I looked through the documentation/wiki and unfortunately couldn't find a clear explanation of the bedpe output - any information you might provide would be much appreciated.

Daniel

jordrow commented 1 year ago

Hi Daniel,

We use multiple measures to determine loops. If you want more or less stringent loop calls, we recommend recalling loops with adjusted parameters rather than performing post-filtering. Here's a description of the columns, which can also be found in the manual located with the releases. Each row corresponds to 15 columns:

1: Left anchor chromosome
2: Left anchor coordinate start
3: Left anchor coordinate end
4: Right anchor chromosome
5: Right anchor coordinate start
6: Right anchor coordinate end
7: color
8: APScoreAvg
9: Poisson Probability Score
10: RegAPScoreAvg
11: Avg_diffMaxNeihgboor_1
12: Avg_diffMaxNeihgboor_2
13: avg
14: std
15: value
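For anyone who wants to inspect these columns programmatically, here is a minimal Python sketch of a reader for the 15-column layout described above. It is not part of SIP. The function name `read_sip_loops` is hypothetical, the short field names are the ones given in the detailed list later in this thread, and the sketch assumes a tab-separated file with integer coordinates and an optional header row starting with `chromosome1`. The example row is invented purely for illustration.

```python
import csv
import io

# Field names as spelled in this thread ("Neihgboor" is kept on purpose).
SIP_COLUMNS = [
    "chromosome1", "x1", "x2",
    "chromosome2", "y1", "y2",
    "color",
    "APScoreAvg", "ProbabilityofEnrichment", "RegAPScoreAvg",
    "Avg_diffMaxNeihgboor_1", "Avg_diffMaxNeihgboor_2",
    "avg", "std", "value",
]
INT_COLUMNS = {"x1", "x2", "y1", "y2"}
TEXT_COLUMNS = {"chromosome1", "chromosome2", "color"}


def read_sip_loops(handle):
    """Parse tab-separated SIP loop rows into a list of dicts (hypothetical helper)."""
    loops = []
    for fields in csv.reader(handle, delimiter="\t"):
        # Assumption: blank lines, '#' comments and a header row may be present.
        if not fields or fields[0].startswith("#") or fields[0] == "chromosome1":
            continue
        if len(fields) != len(SIP_COLUMNS):
            raise ValueError(
                f"expected {len(SIP_COLUMNS)} columns, got {len(fields)}"
            )
        row = {}
        for name, raw in zip(SIP_COLUMNS, fields):
            if name in INT_COLUMNS:
                row[name] = int(raw)      # anchor coordinates
            elif name in TEXT_COLUMNS:
                row[name] = raw           # chromosome names and display color
            else:
                row[name] = float(raw)    # score columns
        loops.append(row)
    return loops


# Invented example data: one header line and one loop at 5 kb resolution.
example = "\t".join(SIP_COLUMNS) + "\n" + "\t".join([
    "chr1", "1000000", "1005000", "chr1", "1200000", "1205000", "0,0,0",
    "2.5", "0.98", "1.8", "1.2", "1.6", "3.1", "0.7", "12.0",
]) + "\n"

loops = read_sip_loops(io.StringIO(example))
```

With a real file, the same function could be called as `read_sip_loops(open(path, newline=""))`, where `path` points to the downloaded loop file.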

PouletAxel commented 1 year ago

Hi Daniel, sorry for the late answer. Yes, you can use them as they are; that should be fine. If the detected loops don't match the Hi-C signal, you can then adjust the parameters. Here is a more detailed description of the columns in the bedpe file:

chromosome1: chrName1
x1: start chr1
x2: end chr1
chromosome2: chrName2
y1: start chr2
y2: end chr2
color: color used to display the loop in Juicer (black by default)
APScoreAvg: peak analysis score of the loop
ProbabilityofEnrichment: Poisson Probability Score
RegAPScoreAvg: regional peak analysis score (the average over the neighborhood of 9 around the loop)
Avg_diffMaxNeihgboor_1: average differential between the loop and the 8 neighbors forming the first neighborhood
Avg_diffMaxNeihgboor_2: average differential between the loop and the 24 neighbors forming the first and second neighborhoods
avg: average value of the neighborhood of 9 around the loop
std: standard deviation of the neighborhood of 9 around the loop
value: value of the loop
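To make the neighborhood sizes concrete: a 3x3 window around a loop contains 9 values, 8 of them neighbors, and a 5x5 window contains 24 neighbors. The Python sketch below illustrates one plausible reading of an "average differential" on a toy matrix. It is not taken from SIP's source code; the helper names, the square-window shape, and the sign convention (loop minus neighbor) are all assumptions, and the toy values are invented.

```python
def neighbour_values(matrix, row, col, radius):
    """Values in the square window of the given radius around (row, col),
    excluding the centre itself. Square windows are an assumption."""
    values = []
    for r in range(row - radius, row + radius + 1):
        for c in range(col - radius, col + radius + 1):
            if (r, c) != (row, col):
                values.append(matrix[r][c])
    return values


def avg_diff(matrix, row, col, radius):
    """Mean of (centre - neighbour) over the window; the sign is an assumption."""
    centre = matrix[row][col]
    neighbours = neighbour_values(matrix, row, col, radius)
    return sum(centre - v for v in neighbours) / len(neighbours)


# Invented toy contact matrix with a bright pixel in the middle.
toy = [
    [1, 1, 1, 1, 1],
    [1, 2, 2, 2, 1],
    [1, 2, 10, 2, 1],
    [1, 2, 2, 2, 1],
    [1, 1, 1, 1, 1],
]

first = neighbour_values(toy, 2, 2, 1)    # 3x3 window minus the centre
second = neighbour_values(toy, 2, 2, 2)   # 5x5 window minus the centre
diff_1 = avg_diff(toy, 2, 2, 1)
diff_2 = avg_diff(toy, 2, 2, 2)

print(len(first), len(second))  # 8 24
```

In this reading, a real loop should stand out from its surroundings, so both differentials would be clearly positive; how SIP actually weighs these columns is not described in this thread.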

danieljrichard commented 1 year ago

Fantastic - thank you both for the detailed replies! I believe I'll use the loop calls as-is then, and if need be seek to re-run the pipeline rather than post-hoc filtering, as recommended.

Best regards,

Daniel

PouletAxel commented 1 year ago

I added this information here: https://github.com/PouletAxel/SIP/wiki/SIP-Quick-Start