Hi @bio-xy. I was able to reproduce your observation. Your process was probably killed due to running out of memory, as the peak memory usage is around 9.3GB when using BLEND as you described.
I have two suggestions to reduce the peak memory usage. First, you can further decrease -I (e.g., -I 30M). Second, you can also reduce the mini-batch size (the number of bases loaded into memory as a batch for mapping) with -K (e.g., -K 100M). The following uses around 6GB:
blend -ax map-ont -t 6 --secondary=no -I 30M -K 100M -a --split-prefix hg002 hg38.fa ont_small_chunk.fq.gz
One unrelated suggestion: if you are cutting the entire fastq.gz into smaller chunks to reduce memory usage, you do not have to do that. You can simply set -K to some value, which ensures that BLEND processes a limited amount of sequence at a time (similar to minimap2).
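For example, something along these lines should work directly on the full file (a sketch; ont_full.fq.gz and ont_full.sam are placeholder names):
# run BLEND on the uncut fastq.gz; -K limits how many read bases are held in memory per batch
blend -ax map-ont -t 6 --secondary=no -I 30M -K 100M --split-prefix hg002 hg38.fa ont_full.fq.gz > ont_full.sam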
The job (with the above settings) still got killed when it starts on chr2 ...
I am not sure if there is any other way to make it more robust than cutting the input into smaller chunks...
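For reference, the chunking itself can be done along these lines (a sketch assuming standard 4-line FASTQ records and GNU coreutils; file names are placeholders):
# 20000 reads per chunk = 80000 lines for a 4-line-per-read FASTQ
zcat ont_full.fq.gz | split -l 80000 -d --additional-suffix=.fq - ont_chunk_
gzip ont_chunk_*.fq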
Unfortunately, I could not reproduce this. Is there any other process that may be taking a large amount of memory, such that the available memory becomes much smaller than 9GB when running BLEND? Here is the /usr/bin/time -vpo output I get when using your inputs and settings on a server with an AMD EPYC 7742 processor and 1TB of main memory (Maximum resident set size shows the peak memory in KB):
Command being timed: "blend -ax map-ont -t 6 --secondary=no -I 50M -a --split-prefix hg002 hg38.fa ont_small_chunk.fq.gz"
User time (seconds): 135116.26
System time (seconds): 808.26
Percent of CPU this job got: 590%
Elapsed (wall clock) time (h:mm:ss or m:ss): 6:23:37
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 10037040
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 1
Minor (reclaiming a frame) page faults: 1350500587
Voluntary context switches: 85887
Involuntary context switches: 165369
Swaps: 0
File system inputs: 9008088
File system outputs: 5271464
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
The following is the time output when using the settings I suggested above:
Command being timed: "blend -ax map-ont -t 6 --secondary=no -I 30M -K 100M -a --split-prefix hg2002 hg38.fa ont_small_chunk.fq.gz"
User time (seconds): 135620.18
System time (seconds): 1035.77
Percent of CPU this job got: 587%
Elapsed (wall clock) time (h:mm:ss or m:ss): 6:27:30
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 7627396
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 1548545951
Voluntary context switches: 88899
Involuntary context switches: 165942
Swaps: 0
File system inputs: 2503752
File system outputs: 5347544
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
The first run uses around 10GB and the second run uses around 7.6GB. If you need to run BLEND with even more constrained resources, I would suggest decreasing -I and/or -K accordingly.
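For example, something like the following could push the peak lower while also logging it (a sketch; the 20M/50M values and the blend_time.log and ont_small_chunk.sam names are illustrative, not tested):
# smaller index chunks (-I) and mini-batches (-K), with peak memory recorded by GNU time
/usr/bin/time -vpo blend_time.log blend -ax map-ont -t 6 --secondary=no -I 20M -K 50M --split-prefix hg002 hg38.fa ont_small_chunk.fq.gz > ont_small_chunk.sam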
I ran BLEND in WSL, but I am not sure whether any background process may be interfering with this. Anyway, thanks for the info. I think I'd better get a bigger machine for this...
Sure. By the way, I am using the "alignment ready" version of hg38 from the hg38 analysis set (https://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/analysisSet/) without the non-canonical contigs. Although the non-canonical contigs are relatively small compared to the canonical ones, if you are including them in your analysis, your memory usage may also be slightly larger. I hope you can run your analysis. Closing this issue now, but feel free to re-open it in case you run into a similar problem.
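If it helps, one way to drop the non-canonical contigs is to extract only the canonical chromosomes before mapping (a sketch assuming samtools is installed and the analysis-set FASTA has been decompressed; file names are placeholders):
# index the reference, then pull out chr1-chr22, chrX, chrY, chrM into a smaller FASTA
samtools faidx hg38.analysisSet.fa
samtools faidx hg38.analysisSet.fa chr{1..22} chrX chrY chrM > hg38.canonical.fa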
Hi BLEND team, I am trying to run BLEND on my laptop to map the recently released ONT duplex reads. I cut a single fastq.gz into small chunks (each containing 20000 reads) and ran the command below. But after generating some .tmp files, the process was killed (I believe it exceeded the max memory of ~9GB here).
blend -ax map-ont -t 6 --secondary=no -I 50M -a --split-prefix hg002 hg38.fa ont_small_chunk.fq.gz
I am not quite sure about "-I 50M"; I am just assuming BLEND will map reads against part of the whole index at a time to save memory. Am I right? Any advice on running BLEND on a platform with constrained resources? Or maybe it should not be run this way. Thanks a lot!
Original fastq is here: https://human-pangenomics.s3.amazonaws.com/submissions/0CB931D5-AE0C-4187-8BD8-B3A9C9BFDADE--UCSC_HG002_R1041_Duplex_Dorado/Dorado_v0.1.1/stereo_duplex/11_15_22_R1041_Duplex_HG002_1_Dorado_v0.1.1_400bps_sup_stereo_duplex_pass.fastq.gz