hzi-bifo / RiboDetector

Accurate and rapid RiboRNA sequences Detector based on deep learning
GNU General Public License v3.0

Memory Calculation Doubts #12

Closed magcurly closed 2 years ago

magcurly commented 2 years ago

To whom it may concern,

Hi, I am testing your software on my RNA metavirome data in CPU mode, and I am not clear on how memory usage is calculated.

According to the example in your README.md, a run with a chunk size of 1024 and 20 threads should take about 20 GB of memory in total. Since I don't have that many threads, I set the thread count to 8, and I found the maximum memory used was around 200 GB. I am really confused about how the memory use is calculated, and I hope I can reduce it, because I am going to run these tasks on an SGE cluster, where jobs that demand a lot of memory take ages in the queue.
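
One way to pin down what a run actually peaks at is to wrap it in GNU time. A minimal sketch, assuming `/usr/bin/time` is GNU time and using placeholder file names:

```sh
# Sketch: record the peak resident memory of a run with GNU time.
# Assumes /usr/bin/time is GNU time; input/output names are placeholders.
# Note: like qstat, this reports per-process RSS and does not account
# for memory shared between worker processes.
/usr/bin/time -v ribodetector_cpu -t 8 -l 150 \
    -i reads_{1,2}.fq.gz -e norrna -o nonrrna_{1,2}.fq.gz 2> time.log
grep "Maximum resident set size" time.log   # peak RSS in kbytes
```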

Looking forward to your response.

Sincerely, magcurly.

dawnmy commented 2 years ago

Hi, thank you for your interest in RiboDetector. Which version did you use, and how large is your input fastq file? The 8 processes use shared memory, which means that if each process shows 25 GB in htop, the total memory use is 25 GB, not 200 GB. It would be great if you could post a screenshot of htop while running RiboDetector.
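
To check the combined footprint without double-counting shared pages, you can sum the proportional set size (PSS), which apportions each shared page among the processes that map it. A sketch, assuming a Linux kernel that exposes `/proc/<pid>/smaps_rollup` (4.14+) and that the workers keep `ribodetector_cpu` in their command line:

```sh
# Sum PSS over all RiboDetector worker processes. PSS splits shared pages
# among the processes mapping them, so the sum is the real combined
# footprint; summing the per-process RSS shown by htop/qstat double-counts
# memory shared between the workers.
total_kb=0
for pid in $(pgrep -f ribodetector_cpu); do
    pss_kb=$(awk '/^Pss:/ {print $2}' "/proc/$pid/smaps_rollup")
    total_kb=$((total_kb + pss_kb))
done
echo "combined PSS: $((total_kb / 1024)) MB"
```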

magcurly commented 2 years ago

It's great to hear from you, Dr. Deng! The command is `ribodetector_cpu -t 8 -l 150 -i sample_rmhost_{1,2}.fq.gz -e norrna -o sample_rdnonrrna_{1,2}.fq.gz`. The input fastq is about 10 Gbases in total.

[screenshot]

I ran the test with nohup, and the error output looks like this:

[screenshot: error output]

It did not seem to be sharing memory when I ran the task.

[screenshot: memory usage]

dawnmy commented 2 years ago

The htop screenshot shows that RiboDetector used 35 GB of memory in total. You can see the two processes report the same 35 GB; this is the shared memory. There must be some other programs running that occupied ~70% of the total memory. Since you don't have enough memory, it would be better to set a smaller chunk_size, e.g. 128. BTW, which version of RiboDetector did you use? I released a new version recently that solved the CPU blocking and memory issues for some users.
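
Assuming the same invocation as above, a run with the reduced chunk size might look like the sketch below; check `ribodetector_cpu -h` on the installed version for the exact flag name.

```sh
# Same run with a smaller chunk (fewer reads loaded per batch, so less
# data resident in memory at once). Flag name per recent releases; verify
# with ribodetector_cpu -h on your installed version.
ribodetector_cpu -t 8 -l 150 \
    -i sample_rmhost_{1,2}.fq.gz \
    -e norrna --chunk_size 128 \
    -o sample_rdnonrrna_{1,2}.fq.gz
```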

magcurly commented 2 years ago

> The htop screenshot shows that RiboDetector used 35 GB of memory in total. You can see the two processes report the same 35 GB; this is the shared memory. There must be some other programs running that occupied ~70% of the total memory. Since you don't have enough memory, it would be better to set a smaller chunk_size, e.g. 128. BTW, which version of RiboDetector did you use? I released a new version recently that solved the CPU blocking and memory issues for some users.

I am using 0.2.4. I submitted the RiboDetector task to a big-memory compute node, and the qstat summary reported large memory use, around 500-700 GB. I checked the node with htop and found that the memory actually in use on the entire node was far less than what qstat showed.

dawnmy commented 2 years ago

> The htop screenshot shows that RiboDetector used 35 GB of memory in total. You can see the two processes report the same 35 GB; this is the shared memory. There must be some other programs running that occupied ~70% of the total memory. Since you don't have enough memory, it would be better to set a smaller chunk_size, e.g. 128. BTW, which version of RiboDetector did you use? I released a new version recently that solved the CPU blocking and memory issues for some users.

> I am using 0.2.4. I submitted the RiboDetector task to a big-memory compute node, and the qstat summary reported large memory use, around 500-700 GB. I checked the node with htop and found that the memory actually in use on the entire node was far less than what qstat showed.

Good to know. I was told the same by others. I think qstat estimates memory use simply by adding up the memory used by each process, without accounting for the shared-memory mechanism, and this can lead to the task being terminated by SGE for exceeding its memory limit. I am looking for a solution/workaround, although I think this is a bug on the SGE side.
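
One common workaround on SGE is to size the per-slot memory request so that slots × h_vmem covers the inflated per-process sum the scheduler accounts for. A sketch; the resource name (`h_vmem`), parallel environment (`smp`), and the `run_ribodetector.sh` wrapper script are all assumptions that depend on the local cluster configuration:

```sh
# Hypothetical SGE submission: 8 slots at 6G each gives a 48G aggregate
# limit, covering the summed per-process RSS that SGE sees even though
# the true shared footprint is smaller. Resource and PE names vary
# between cluster configurations; run_ribodetector.sh is a placeholder.
qsub -pe smp 8 -l h_vmem=6G run_ribodetector.sh
```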

dawnmy commented 2 years ago

@magcurly The new release v0.2.6 can substantially reduce the memory use via the chunk_size parameter. You can update it with pip.
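
For reference, the update is a one-liner, assuming RiboDetector was installed from PyPI:

```sh
# Upgrade to the latest release from PyPI
pip install --upgrade ribodetector
```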

magcurly commented 2 years ago

> @magcurly The new release v0.2.6 can substantially reduce the memory use via the chunk_size parameter. You can update it with pip.

Sorry for the late response. It's a great improvement: I saw about 25 GB used according to qstat with chunk_size=1024 and 8 threads. This is a great help for me and my colleagues as we set up RiboDetector in our Snakemake pipeline. Many thanks, @dawnmy.