XiaoTaoWang / NeoLoopFinder

A computation framework for genome-wide detection of enhancer-hijacking events from chromatin interaction data in re-arranged genomes

neo-loops/neotad result with the last column as 0 and 1 #14

Closed wzhang42 closed 2 years ago

wzhang42 commented 2 years ago

Xiaotao, I am now using NeoLoopFinder for my data analysis and have the following questions. The .neo-loops file (and also the .neo-tad file) has many rows with 0 in the last column. I am wondering whether these rows are meaningful (in your definition, a 1 in the last column indicates a neo-loop or neo-tad, meaning the detected interaction loop is affected by an SV). If they are not meaningful, why keep them? If they are, what is the difference between the rows with 0 and the rows with 1? What is the rule for assigning the binary value?

Additionally, neoloop-caller and neotad-caller are run independently. Some loops (rows) in the .neo-loops file are not classified as neo-loops (last column 0), but the corresponding rows in the .neo-tad file can be classified as neo-tads (last column 1). Is this normal?
Thank you so much in advance.

XiaoTaoWang commented 2 years ago

Hi, your understanding is correct: 1s represent neo-loops that span the breakpoint, and 0s represent "regular loops" that are located within an undisrupted region. I keep the 0s just in case they are helpful; you can simply ignore them if you only care about neo-loops.
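Keeping only the rows flagged 1 can be sketched in Python. This is a minimal sketch, not part of NeoLoopFinder: it assumes each row has six whitespace-separated coordinate fields followed by one annotation field made of comma-separated triples (assembly ID, a numeric value, a 0/1 flag), as in the rows quoted in this thread. The names `Annotation`, `Loop`, `parse_row`, and `is_neo` are hypothetical, and treating a row as "neo" when any triple carries a 1 is an assumption.

```python
from typing import List, NamedTuple


class Annotation(NamedTuple):
    assembly: str  # assembly/SV label, e.g. "C4"
    value: int     # second field of the triple, e.g. 150000
    flag: int      # 1 = neo-loop, 0 = regular loop


class Loop(NamedTuple):
    chrom1: str
    start1: int
    end1: int
    chrom2: str
    start2: int
    end2: int
    annotations: List[Annotation]


def parse_row(line: str) -> Loop:
    """Parse one row; the 7-field layout is assumed from the thread's examples."""
    fields = line.split()
    if len(fields) != 7:
        raise ValueError(f"expected 7 fields, got {len(fields)}: {line!r}")
    parts = fields[6].split(",")
    if len(parts) % 3 != 0:
        raise ValueError(f"annotation is not a list of triples: {fields[6]!r}")
    annotations = [
        Annotation(parts[i], int(parts[i + 1]), int(parts[i + 2]))
        for i in range(0, len(parts), 3)
    ]
    return Loop(fields[0], int(fields[1]), int(fields[2]),
                fields[3], int(fields[4]), int(fields[5]), annotations)


def is_neo(loop: Loop) -> bool:
    """Assumption: a row counts as 'neo' if any of its triples has flag 1."""
    return any(a.flag == 1 for a in loop.annotations)


# Two rows taken from this thread (25K resolution).
rows = [
    "chr7 131150000 131175000 chr7 131300000 131325000 C4,150000,0",
    "chr7 133075000 133100000 chr8 128750000 128775000 C4,325000,1,C6,325000,1",
]
loops = [parse_row(r) for r in rows]
neo = [loop for loop in loops if is_neo(loop)]
print(len(neo))
```

Parsing the annotation into triples rather than checking only the final character keeps rows that list several assemblies (such as `C4,...,C6,...`) inspectable per assembly.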

Xiaotao

XiaoTaoWang commented 2 years ago

For the second question, what do you mean by "corresponding row in Neo-tad files"? If the loop coordinates and tad coordinates are exactly the same, it's definitely not normal.

wzhang42 commented 2 years ago

Xiaotao, thanks for your reply. On the first question: since 0s mean "regular loops", why are one or more SVs (C#) appended at the end of each row, as in the following?

...
chr7 131150000 131175000 chr7 131300000 131325000 C4,150000,0
chr7 131150000 131175000 chr7 131325000 131350000 C6,175000,0,C4,175000,0

My guess is that the first loop (a regular loop) also has some relation to C4, but not as strong as it would be if the last column were 1.

On the second question, I have the following .neo-loops and .neotad results (the bold parts (**) are the corresponding parts that I mentioned).

.neo-loops
...
chr7 6650000 6660000 chr7 6760000 6770000 C7,110000,0
chr7 6660000 6670000 chr7 6770000 6780000 C7,110000,0
chr7 132720000 132730000 chr7 132830000 132840000 C10,110000,0
chr7 132720000 132730000 chr7 133020000 133030000 C10,300000,0
chr7 132740000 132750000 chr7 133010000 133020000 C8,270000,0
chr7 132750000 132760000 chr7 132860000 132870000 C10,110000,0
chr7 132750000 132760000 chr7 132870000 132880000 C8,120000,0,C10,120000,0
chr7 132750000 132760000 chr7 132900000 132910000 C8,150000,0,C10,150000,0
...

.neotad
chr7 6530000 6540000 chr7 6630000 6640000 C7,100000,0
chr7 132670000 132680000 chr7 132760000 132770000 C10,90000,0
chr7 132680000 132690000 chr7 132760000 132770000 C8,80000,0
chr7 132760000 132770000 chr7 133070000 133080000 C8,310000,0
chr7 132760000 132770000 chr7 133080000 133090000 C10,320000,0
chr7 133070000 133080000 chr8 128930000 128940000 C8,150000,1
chr7 133080000 133090000 chr8 128920000 128930000 C10,150000,1
chr7 149980000 149990000 chr7 150130000 150140000 C9,150000,0

XiaoTaoWang commented 2 years ago
  1. Both loops you pasted above are actually regular loops, not neo-loops. Although they are located on a local assembly formed by an SV, their anchors are not disrupted by the SV breakpoints (neo-loops specifically refer to loops with one anchor on one side of the breakpoint and the other anchor on the other side).
  2. The definition of neo-tads is similar; therefore, it is not contradictory for a regular loop and a neo-tad to be located on the same local assembly.
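The distinction between a regular loop and a neo-loop can be illustrated with a toy model. The sketch below is a deliberate simplification and not NeoLoopFinder's implementation: it assumes a local assembly can be treated as one linear coordinate axis with a single breakpoint position, and it calls an interaction "neo" only when its two anchors fall on opposite sides of that breakpoint. The function `spans_breakpoint` and all coordinates are hypothetical.

```python
def spans_breakpoint(anchor1: int, anchor2: int, breakpoint: int) -> bool:
    """Return True if the two anchors lie on opposite sides of the breakpoint.

    All positions are on a hypothetical linear assembly axis; a position
    equal to the breakpoint is counted as belonging to the right-hand side.
    """
    left, right = sorted((anchor1, anchor2))
    return left < breakpoint <= right


# Toy assembly with one breakpoint at position 500,000.
bp = 500_000

# Both anchors upstream of the breakpoint: a regular loop (flag 0).
print(spans_breakpoint(200_000, 350_000, bp))  # False

# Anchors on opposite sides of the breakpoint: a neo-loop (flag 1).
print(spans_breakpoint(450_000, 600_000, bp))  # True
```

Under this toy model, an assembly can contain both kinds of interaction at once, which is why a row flagged 0 can still carry an assembly label.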

Hope my explanation makes it clearer.

Best, Xiaotao

wzhang42 commented 2 years ago

Xiaotao, many thanks for your confirmation. The above .neo-loops file and .neo-tads file are based on the 10K resolution of our Hi-C data. In the neo-loops results, we cannot find the corresponding inter-chromosomal neo-loop chr7 (133070000 133080000) -- chr8 (128930000 128940000), but we did find the inter-chromosomal neo-tad

chr7 133070000 133080000 chr8 128930000 128940000 C8,150000,1

I previously thought that neo-tads were derived from neo-loops, so that a neo-tad could be detected only if the corresponding neo-loop was detected. Now it seems that is not the case. Additionally, I also tried my 25K resolution Hi-C data and found both the corresponding neo-loops and neo-tads.

.neo-loops (25K)
chr7 132900000 132925000 chr7 133050000 133075000 C4,150000,0
chr7 133075000 133100000 chr8 128725000 128750000 C4,350000,1
chr7 133075000 133100000 chr8 128750000 128775000 C4,325000,1,C6,325000,1
....

.neo-tads (25K)
...
chr7 133050000 133075000 chr8 128700000 128725000 C6,400000,1
chr7 133050000 133075000 chr8 128725000 128750000 C4,375000,1
....
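The manual comparison in this comment (looking for flagged neo-tad rows with no neo-loop row at the same coordinates) can be sketched as a set difference. This is only an illustration under assumptions: the helper `flagged_pairs` is hypothetical, rows are assumed to have seven whitespace-separated fields with 0/1 flags in every third comma-separated annotation entry, and matching on exact coordinates is the interpretation used in this comment rather than a rule of the tool. The input rows are a subset of the 10K excerpts quoted earlier, which were themselves elided with "...", so an absence here does not prove an absence in the full files.

```python
def flagged_pairs(rows):
    """Collect the six coordinate fields of every row that carries a flag of 1."""
    keys = set()
    for line in rows:
        fields = line.split()
        if len(fields) != 7:
            continue  # skip blank lines and elision markers such as "..."
        flags = fields[6].split(",")[2::3]  # third entry of each triple
        if "1" in flags:
            keys.add(tuple(fields[:6]))
    return keys


# Subset of the 10K rows quoted earlier in this thread.
neo_loops_10k = [
    "chr7 132720000 132730000 chr7 133020000 133030000 C10,300000,0",
    "chr7 132740000 132750000 chr7 133010000 133020000 C8,270000,0",
]
neo_tads_10k = [
    "chr7 132760000 132770000 chr7 133070000 133080000 C8,310000,0",
    "chr7 133070000 133080000 chr8 128930000 128940000 C8,150000,1",
    "chr7 133080000 133090000 chr8 128920000 128930000 C10,150000,1",
]

# Flagged neo-tad coordinates that have no flagged neo-loop at the same position.
only_tads = flagged_pairs(neo_tads_10k) - flagged_pairs(neo_loops_10k)
for key in sorted(only_tads):
    print(" ".join(key))
```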

XiaoTaoWang commented 2 years ago

Looks good.