XiaoTaoWang / NeoLoopFinder

A computation framework for genome-wide detection of enhancer-hijacking events from chromatin interaction data in re-arranged genomes

Missed assembled translocations in smaller bins #49

Closed: xunchen85 closed this issue 1 year ago

xunchen85 commented 1 year ago

Hi Xiaotao,

I really appreciate your advice on running NeoLoopFinder; it has been very helpful.

I can now use it to reassemble SVs and call loops for my SV list. I am particularly interested in a few target SVs. We have validated them experimentally, and in the Hi-C matrix the inter-chromosomal translocation shows a very obvious enrichment of contacts between the two regions. When I analyzed it with NeoLoopFinder, it was assembled successfully at a 25 kb bin size, but when I ran it at 10 kb, it disappeared from my calls.

Do you have any suggestions for recovering it? So far I have only been able to specify the minimum size and the region size; are there other parameters I can tune?

If I already know that a translocation is real, can I examine it specifically with NeoLoopFinder? My last attempt to customize the assembly output for loop calling failed. Do you know how I can do this?

Thanks, Xun

XiaoTaoWang commented 1 year ago

Hi Xun,

The "assemble-complexSVs" command accepts a list of cool URIs at different resolutions. If you pass more than one cool URI through the "-H" parameter, it first generates SV assemblies at each individual resolution and then combines the results in a non-redundant way.

Therefore, to recover the SV assembly you mentioned, you can run the following command (suppose your mcool file name is "test.mcool" and your SV input file name is "test-SV.txt"):

$ assemble-complexSVs -O test-SV.txt --balance-type CNV --protocol insitu --nproc 6 -H test.mcool::resolutions/25000 test.mcool::resolutions/10000 
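
If you want to try other combinations of resolutions, the "-H" arguments can be built programmatically. The following is only a minimal sketch: the helper names build_cool_uris and build_assemble_command are hypothetical (they are not part of NeoLoopFinder), the flags and values are copied unchanged from the command above, and each resolution is assumed to exist in your mcool file.

```python
import shlex


def build_cool_uris(mcool_path, resolutions):
    """Return one cool URI per resolution, in the same form as the command above."""
    return [f"{mcool_path}::resolutions/{res}" for res in resolutions]


def build_assemble_command(sv_file, mcool_path, resolutions, nproc=6):
    """Assemble the argument list for assemble-complexSVs (hypothetical helper).

    Only flags that appear in the command above are used; nothing new is added.
    """
    return [
        "assemble-complexSVs",
        "-O", sv_file,
        "--balance-type", "CNV",
        "--protocol", "insitu",
        "--nproc", str(nproc),
        "-H", *build_cool_uris(mcool_path, resolutions),
    ]


# Example file names taken from the reply above.
cmd = build_assemble_command("test-SV.txt", "test.mcool", [25000, 10000])
print(shlex.join(cmd))
```

The printed string can be pasted into a shell, or the list can be passed directly to subprocess.run(cmd, check=True).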

Let me know if you have any further questions.

Xiaotao

xunchen85 commented 1 year ago

Thanks Xiaotao,

It works well, and I was able to assemble these complex SVs. I do have two more questions, though.

  1. I used a list of more than 100 translocations as input, but only 25% of them were assembled. I am really interested in some of the missed ones. Do you know why they might have been missed and how I could recover the other 75%? Could inaccurate breakpoints or directions be the cause?

  2. The original SV calls are in VCF or BED format, which also includes the direction information. If my converted format is incorrect, can you suggest any tools for accurate conversion?

Best, Xun

xunchen85 commented 1 year ago

The proportion increased significantly when I used translocations detected directly from the Hi-C data.

Thanks.