Closed by abhisheksinghnl 5 years ago
You can use hicConvertFormat to export an h5 or cool file to GInteractions format. This will give you ALL interactions as a bed-like file.
Loop detection gives you the meaningful interactions in a validated manner. If you want to see long-range contacts between specific regions, it is better to use hicAggregateContacts than to play around with a GInteractions file.
Hi,
Thank you for your reply. I got a file in this format:
chr1B 487650000 487675000 chr1B 502625000 502650000 2
chr1B 487650000 487675000 chr1B 502775000 502800000 1
chr1B 487650000 487675000 chr1B 502875000 502900000 1
Just to confirm: the first three columns are the coordinates of the first interaction site and columns 4-6 are the coordinates of the second interaction site. What does the last column represent?
I am guessing column 7 is the interaction count. If that is the case, what is normally a good number in that column?
The first three columns = bin1. The next three columns = bin2. The last column = number of interactions, i.e. the number of read pairs corresponding to bin1 and bin2.
A "good number" is very hard to define... It will highly depend on the distance between bin1 and bin2 of course. This is not a trivial question.
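To make the distance dependence concrete, here is a minimal base-R sketch that takes the three rows posted above, computes how many bins apart each pair is, and averages the counts per separation. The column names (chr1, start1, ..., count) and the assumption of fixed-size bins are mine, not something the GInteractions export defines, so treat this as an illustration rather than a recommended workflow.

```r
# The three example rows from the export (the real file is read with read.table() the same way).
df <- read.table(text = "
chr1B 487650000 487675000 chr1B 502625000 502650000 2
chr1B 487650000 487675000 chr1B 502775000 502800000 1
chr1B 487650000 487675000 chr1B 502875000 502900000 1
", col.names = c("chr1", "start1", "end1", "chr2", "start2", "end2", "count"),
   stringsAsFactors = FALSE)

# Distance is only meaningful for pairs on the same chromosome.
cis <- df[df$chr1 == df$chr2, ]
cis$distance <- abs(cis$start2 - cis$start1)

# Assumption: fixed-size bins, so the bin size can be read off the first row.
bin_size <- cis$end1[1] - cis$start1[1]
cis$bin_sep <- cis$distance / bin_size

# Mean count per bin separation: a crude distance-matched background.
background <- aggregate(count ~ bin_sep, data = cis, FUN = mean)
print(background)
```

On a full export each separation will contain many bin pairs, so a count can be judged against the mean at the same separation rather than against a single global threshold.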
If you want to see if some interactions are enriched against background, you need a proper way to do it. HiCExplorer is beginning to support such approaches and it boils down to loop-calling, as you said in your first message. The problem is that it depends mostly on your genome... For mammalian genomes I think it is not an issue; for anything else it can be problematic. If you want to see if some loops were missed somehow, I think a good way would be to make an obs/exp matrix with hicPCA, or just a normal (corrected) matrix, and plot it alongside the loops with hicPlotMatrix, which now supports loop plotting. If you see "loop-like" regions on the matrix that are not called, then maybe you need to tweak your parameters.
Alternatively, you can subset your GInteractions file for the loop positions you identified, to give you an idea about "what is a good number". You can do it this way in R:
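library(GenomicRanges)
library(InteractionSet)

# Build a GInteractions object from a 7-column table (bin1, bin2, value).
convertToGI <- function(df){
  # Assumes BED-like coordinates (0-based start, exclusive end). GRanges is
  # 1-based and inclusive, so start + 1 keeps adjacent bins from overlapping by 1 bp.
  row.regions <- GRanges(df$V1, IRanges(df$V2 + 1, df$V3)) # bin1
  col.regions <- GRanges(df$V4, IRanges(df$V5 + 1, df$V6)) # bin2
  gi <- GInteractions(row.regions, col.regions)
  gi$norm.freq <- df$V7 # column 7 (interaction frequency in the GInteractions file)
  return(gi)
}

df <- read.table("file.GInteractions", sep="\t")
loops <- read.table("loops.txt", sep="\t")
df.gi <- convertToGI(df)
loops.gi <- convertToGI(loops)
# Keep the interactions whose two anchors both overlap the anchors of a called loop.
df.loops.gi <- subsetByOverlaps(df.gi, loops.gi)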
But I guess that your loop file already states these numbers anyway.
Hi,
I have built my interaction matrix and now I want to get the interactions out of this matrix.
How can I do this?
I have run loop detection and I got some loops. Is this all I can do, or is there something more I can do to get the interactions of regions that might have gone undetected during the loop detection step?
Thank you