XiaoTaoWang / NeoLoopFinder

A computation framework for genome-wide detection of enhancer-hijacking events from chromatin interaction data in re-arranged genomes

Number of Redundant Candidates in assemble-complexSVs #41

Closed alobo4 closed 1 year ago

alobo4 commented 1 year ago

Hello,

I am running assemble-complexSVs on HCC1954 Hi-C data (GSM3258551, SRR7475914), the same cell line used in the NeoLoopFinder paper. The command is taking an extremely long time (>4 days), even with 256MB requested for memory, as the number of redundant candidates is 84996, 141704, and 114354 for the 5kb, 10kb, and 25kb resolutions, respectively. You mentioned in your tutorial that assemble-complexSVs should take only ~6 mins to complete. I understand that was a test example, but I am curious whether my numbers are expected for a bigger sample or whether something went wrong in previous steps. I inferred my SVs using EagleC with the NeoLoopFinder output. Here is the command I am running and what the log file shows:

assemble-complexSVs -O HCC1954 -B HCC1954.CNN_SVs.NeoLoopFinder.txt --balance-type CNV --protocol insitu --nproc 6 \
    -H HCC1954-MboI-R1-filtered.mcool::resolutions/25000 \
       HCC1954-MboI-R1-filtered.mcool::resolutions/10000 \
       HCC1954-MboI-R1-filtered.mcool::resolutions/5000

root INFO @ 12/02/22 12:13:04:
# ARGUMENT LIST:
# Output Prefix = HCC1954
# Break Points = HCC1954.CNN_SVs.NeoLoopFinder.txt
# Minimum fragment size = 500000bp
# Cooler URI = ['HCC1954-MboI-R1-filtered.mcool::resolutions/25000', 'HCC1954-MboI-R1-filtered.mcool::resolutions/10000', 'HCC1954-MboI-R1-filtered.mcool::resolutions/5000']
# Extended Genomic Span = 5000000bp
# Balance Type = CNV
# Experimental protocol = insitu
# Number of Processes = 6
# Log file name = assembleSVs.log
root INFO @ 12/02/22 12:13:24: Current resolution: 25000
root INFO @ 12/02/22 12:13:24: Calculate the global average contact frequencies at each genomic distance ...
root INFO @ 12/02/22 12:14:04: Done
root INFO @ 12/02/22 12:14:04: Filtering SVs by checking distance decay of chromatin contacts across SV breakpoints ...
root INFO @ 12/02/22 12:17:52: 296 SVs left
root INFO @ 12/02/22 12:17:52: Building SV connecting graph ...
root INFO @ 12/02/22 12:17:52: Discovering and re-ordering complex SVs ...
neoloop.assembly INFO @ 12/02/22 12:20:02: Filtering 114354 redundant candidates ...
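
For anyone comparing runs, the per-resolution candidate counts can be pulled out of a log like the one above with a short script. This is only a minimal sketch and not part of NeoLoopFinder: the function name `candidate_counts` is hypothetical, and the regular expressions assume the log wording shown in this thread ("Current resolution: N" and "Filtering N redundant candidates"), which may differ in other versions.

```python
import re

# Patterns assume the log wording shown in this thread; other versions may differ.
RESOLUTION_RE = re.compile(r"Current resolution: (\d+)")
CANDIDATES_RE = re.compile(r"Filtering (\d+) redundant candidates")


def candidate_counts(log_text):
    """Map each resolution (bp) to the redundant-candidate count logged after it."""
    counts = {}
    current = None
    for line in log_text.splitlines():
        res = RESOLUTION_RE.search(line)
        if res:
            # Remember which resolution the following messages belong to.
            current = int(res.group(1))
        cand = CANDIDATES_RE.search(line)
        if cand and current is not None:
            counts[current] = int(cand.group(1))
    return counts


# Two lines copied from the log in this thread.
sample = (
    "root INFO @ 12/02/22 12:13:24: Current resolution: 25000\n"
    "neoloop.assembly INFO @ 12/02/22 12:20:02: "
    "Filtering 114354 redundant candidates ...\n"
)
print(candidate_counts(sample))  # {25000: 114354}
```

To use it on a real run, read the log file (assembleSVs.log in the argument list above) and pass its contents to `candidate_counts`.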

XiaoTaoWang commented 1 year ago

Hi, thanks for reporting this. I recently updated NeoLoopFinder so that it can deal with smaller SVs than what we analyzed in the original paper, but didn't notice the running time complexity issue. I will take a look at this and get back to you later this week or next week.

Best, Xiaotao

alobo4 commented 1 year ago

Great! Thank you!!


XiaoTaoWang commented 1 year ago

Hi, can you upgrade your NeoLoopFinder to the latest version (v0.4.3) with "pip install -U neoloop" and try again? In my test, the job finished within 1 hr with this version.
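
Before re-running, it may help to confirm which version is actually installed in the active environment. The following is a minimal sketch using only the Python standard library. The helper names `parse_version`, `needs_upgrade`, and `installed_neoloop_version` are hypothetical, and the comparison assumes plain dotted numeric versions such as 0.4.3.

```python
from importlib import metadata


def parse_version(text):
    """Turn a plain dotted version such as '0.4.3' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def needs_upgrade(installed, minimum="0.4.3"):
    """Return True when the installed version is older than the minimum."""
    return parse_version(installed) < parse_version(minimum)


def installed_neoloop_version():
    """Return the installed neoloop version string, or None if it is missing."""
    try:
        return metadata.version("neoloop")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_neoloop_version()
    if version is None:
        print("neoloop is not installed in this environment")
    elif needs_upgrade(version):
        print(f"neoloop {version} found; consider: pip install -U neoloop")
    else:
        print(f"neoloop {version} is at least 0.4.3")
```

Run the script with the same Python interpreter that runs assemble-complexSVs, since a different environment may hold a different version.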

alobo4 commented 1 year ago

Yep! That worked and completed in about 1 hour 30 minutes. Thank you so much!
