XiaoTaoWang / NeoLoopFinder

A computation framework for genome-wide detection of enhancer-hijacking events from chromatin interaction data in re-arranged genomes

Neoloopfinder doesn't find loops and assemblies that I think it should #31

Open auberginekenobi opened 2 years ago

auberginekenobi commented 2 years ago

Hi Xiaotao, great tool filling an urgent unmet need.

I am running neoloopfinder on the following SVs:

<MB268.simpleSVs.txt>
chr1    chr1    ++      198179654       205207358       translocation
chr1    chr1    --      198993881       198984836       translocation
chr1    chr1    +-      198993807       194206139       translocation

As you may be able to tell, these breakpoints form a cycle, and indeed this is the location of an ecDNA in this sample. Therefore I would expect a connected assembly to be called by assemble-complexSVs:

assemble-complexSVs -O ${SAMPLE}_${NORM}.$RES -B $SIMPLESVS -H $COOL \
        --minimum-size 0 --nproc $SLURM_CPUS_PER_TASK --region-size 12000000 \
        --balance-type $NORM
<Expected.assemblies.txt>
A0      translocation,1,198993881,-,1,198984836,-       translocation,1,198993807,+,1,194206139,-   1,205210000     1,198180000
C0      translocation,1,198993881,-,1,198984836,-       1,205210000     1,198990000
C1      translocation,1,198993807,+,1,194206139,-       1,198980000     1,198180000
C2      translocation,1,198179654,+,1,205207358,+       1,194210000     1,198990000
<Actual.assemblies.txt>
C0      translocation,1,194206139,-,1,198993807,+       1,195310000     1,198170000
C1      translocation,1,198179654,+,1,205207358,+       1,198120000     1,204190000

It appears that assemble-complexSVs filters the chr1 chr1 -- 198993881 198984836 translocation breakpoint and then fails to form any connected assembly.
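One way to make that comparison explicit is to diff the breakpoints listed in the two assembly files. The following is only a minimal sketch: the helper name `breakpoints` is hypothetical and not part of NeoLoopFinder, and it assumes that each SV entry in an assembly line has seven comma-separated fields (type, chrom, position, strand, chrom, position, strand), as in the lines pasted above. Strands are ignored, and the two ends of a breakpoint are treated as unordered, because the same breakpoint can be written in either direction.

```python
def breakpoints(assembly_text):
    """Collect the breakpoints that appear in assembly lines.

    Each breakpoint is stored as a frozenset of its two (chrom, position)
    ends, so the order in which the ends are written does not matter.
    Strand information is deliberately ignored in this sketch.
    """
    found = set()
    for line in assembly_text.strip().splitlines():
        for token in line.split()[1:]:  # first token is the assembly ID
            fields = token.split(",")
            if len(fields) != 7:  # boundary entries have only 2 fields
                continue
            _sv_type, c1, p1, _s1, c2, p2, _s2 = fields
            found.add(frozenset([(c1, int(p1)), (c2, int(p2))]))
    return found


# Contents of Expected.assemblies.txt and Actual.assemblies.txt from this issue.
expected = """
A0 translocation,1,198993881,-,1,198984836,- translocation,1,198993807,+,1,194206139,- 1,205210000 1,198180000
C0 translocation,1,198993881,-,1,198984836,- 1,205210000 1,198990000
C1 translocation,1,198993807,+,1,194206139,- 1,198980000 1,198180000
C2 translocation,1,198179654,+,1,205207358,+ 1,194210000 1,198990000
"""
actual = """
C0 translocation,1,194206139,-,1,198993807,+ 1,195310000 1,198170000
C1 translocation,1,198179654,+,1,205207358,+ 1,198120000 1,204190000
"""

# Breakpoints present in the expected assemblies but absent from the actual ones.
missing = breakpoints(expected) - breakpoints(actual)
for bp in missing:
    print(sorted(bp))
```

On the files above, the only breakpoint this reports as missing is the 198993881 / 198984836 one, which matches the observation that it is dropped before assembly.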

Furthermore, if I run neoloop-caller on the Expected.assemblies.txt, the algorithm calls cis-loops within individual segments, but does not call any trans-loops or attribute any loops to the A0 assembly:

[Image: MB268_10k_cnv]

How would I go about applying NeoLoopFinder to my ecDNA+ Hi-C data?

XiaoTaoWang commented 2 years ago

Hi, thanks for your interest! The current complex SV assembly pipeline applies a set of relatively stringent criteria to make sure every local assembly has a smooth distance-decay pattern of Hi-C contact frequencies across the breakpoint. Currently, the pipeline does not support the detection of circular DNA. I do plan to add this function in the next major version, which might come out in the next few months. I will keep you posted!