Open luzporras opened 2 months ago
Hey, apologies for the late response.
Here is some pseudocode for fanc compare:
if input are matrices:
    if comparison == 'fold-change':
        use FoldChangeMatrix
    elif comparison == 'difference':
        use DifferenceMatrix
elif input are scores:
    if comparison == 'fold-change':
        use FoldChangeScores
    elif comparison == 'difference':
        use DifferenceScores
elif input are regions:
    if comparison == 'fold-change':
        use FoldChangeRegions
    elif comparison == 'difference':
        use DifferenceRegions
As you can see, fanc compare uses DifferenceRegions under the hood if you provide BED files and the --comparison difference argument. The default, however, is to use fold-change - maybe that is where the difference stems from?
You can see the actual code here: https://github.com/vaquerizaslab/fanc/blob/d5d86085c920a4dca6e5f6be4857129d718243cc/fanc/commands/fanc_commands.py#L3533-L3548
So, the call would be:
DifferenceRegions.from_regions(
    matrix1, matrix2,
    file_name=comparison_output,
    tmpdir=tmp,
    mode='w',
    log=log
)
All fanc compare does is calculate either the difference or the fold-change of the values in the BED files for each region. There are no statistics involved.
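To make the "no statistics" point concrete, here is a minimal sketch of what a per-region difference or fold-change amounts to. It is plain Python and not FAN-C code: the function name compare_scores, the subtraction and division order (first sample relative to second), and the NaN handling are assumptions for illustration, and the real classes may differ, for example by log-transforming the fold-change.

```python
import math


def compare_scores(scores_a, scores_b, comparison="fold-change"):
    """Compare two equal-length lists of per-region scores element-wise.

    Hypothetical helper for illustration only; not part of FAN-C.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("both samples must cover the same regions")
    if comparison not in ("fold-change", "difference"):
        raise ValueError("unknown comparison: " + comparison)

    result = []
    for a, b in zip(scores_a, scores_b):
        if math.isnan(a) or math.isnan(b):
            # A region without a score in either sample stays undefined.
            result.append(float("nan"))
        elif comparison == "difference":
            result.append(a - b)
        else:
            # Plain ratio; assumed here, a zero denominator is left undefined.
            result.append(a / b if b != 0 else float("nan"))
    return result


# Made-up insulation scores for three regions in two samples.
sample_a = [0.5, -0.25, 1.0]
sample_b = [0.25, 0.25, 0.5]

print(compare_scores(sample_a, sample_b, "difference"))
print(compare_scores(sample_a, sample_b, "fold-change"))
```

The two modes give different numbers for the same input, so a file produced with the default fold-change will not match one produced with DifferenceRegions. Neither output contains a p-value or significance threshold.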
I'm comparing insulation scores of two samples using FANC. I've created two .bed files, one with fanc compare and the other with fanc.DifferenceRegions.from_regions, both using a 50kb bin and a 150kb window. However, the two outputs differ, and I'm unsure which one to use.
My main questions concern the statistical methods behind these outputs. Specifically, what statistical analyses underlie the calculations, and what are the criteria for deciding whether a difference between the insulation scores of the two samples is significant?
Thanks, Luz P