c-zhou / yahs

Yet another Hi-C scaffolding tool
MIT License

Running yahs by Hi-C library #44

Open minjeongjj opened 1 year ago

minjeongjj commented 1 year ago

Hi

I tried to scaffold my contig-level genome assembly with my Hi-C data, but the run failed with a segmentation fault. I suspect the error is caused by the size of the Hi-C alignment, which is about 700 GB in BAM format. Is there a way to run yahs on each Hi-C library separately and merge the results at some step, to avoid the segmentation fault (core dumped)?

Here are the command I used and the error messages.

$ yahs Combined_pseudohap.phased.filtered.0.arcs.fasta Pinetree_HiC.bwa_aln.bam >yahs.log 2>yahs.log2

1756222 Segmentation fault (core dumped) yahs Combined_pseudohap.phased.filtered.0.arcs.fasta Pinetree_HiC.bwa_aln.bam > yahs.log 2> yahs.log2

[I::dump_links_from_bam_file] dumped 1240693787 read pairs from 8033981476 records: 710480207 intra links + 530213580 inter links
[I::run_yahs] RAM total: 1133.532GB
[I::run_yahs] RAM limit: 3.019GB
[I::contig_error_break] dist threshold for contig error break: 1000000

Thank you!

Sincerely, MJ

c-zhou commented 1 year ago

Hello @minjeongjj,

From the log, only about 3 GB of memory was available at runtime (although your machine has more than 1 TB of memory). Is your machine on a shared cluster? Could you check the actual memory consumption of yahs?
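A minimal sketch of how the limit could be checked from the shell that launches yahs. It prints the shell's virtual memory cap and pulls the two RAM lines out of the stderr log (yahs.log2 in the command above). The helper name yahs_ram_report is made up for illustration, and this thread does not say how yahs derives its "RAM limit", so a cap shown by ulimit is only one possible cause.

```shell
# Virtual memory cap of the current shell in KiB; "unlimited" means no cap.
# On a shared cluster a job scheduler may impose a separate limit that
# does not show up here.
echo "virtual memory limit (ulimit -v): $(ulimit -v)"

# Print the "RAM total" and "RAM limit" values found in a yahs stderr log.
# yahs_ram_report is a hypothetical helper, not part of yahs.
yahs_ram_report() {
    grep -o 'RAM [a-z]*: [0-9.]*GB' "$1"
}
```

Calling yahs_ram_report yahs.log2 on the log above would list the total and the limit on separate lines. To measure the actual peak memory of a run, GNU time can wrap the command (/usr/bin/time -v yahs ...); its report includes a "Maximum resident set size" line.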

Memory consumption mainly depends on the size of your genome assembly. The size of the BAM file, or more precisely the Hi-C data coverage, does not really matter.
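Since assembly size is the main factor, a quick way to see it is to sum the sequence lengths in the FASTA index. The sketch below assumes a .fai index (for example from samtools faidx) exists for the contig FASTA; column 2 of that file is the sequence length in bp. The helper name assembly_size is made up for illustration.

```shell
# Report the number of sequences and the total length from a FASTA index.
# Column 2 of a .fai file holds each sequence's length in bp.
# assembly_size is a hypothetical helper, not part of yahs or samtools.
assembly_size() {
    # %.0f avoids integer overflow in some awk builds for multi-gigabase totals.
    awk -F '\t' '{ n++; total += $2 }
         END { printf "%d sequences, %.0f bp\n", n, total }' "$1"
}
```

For the run above, the call would be assembly_size Combined_pseudohap.phased.filtered.0.arcs.fasta.fai.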

Best, Chenxi