Closed magcurly closed 2 years ago
Hi, thank you for your interest in RiboDetector. Which version did you use, and how large is your input FASTQ file? The 8 processes use shared memory, which means that if each process shows 25G in `htop`, the total memory use is 25G, not 200G. It would be great if you could post a screenshot of `htop` while running RiboDetector.
It's great to hear from you, Dr. Deng!
The command is: `ribodetector_cpu -t 8 -l 150 -i sample_rmhost_{1,2}.fq.gz -e norrna -o sample_rdnonrrna_{1,2}.fq.gz`
The input FASTQ is about 10 Gbases in total.
I ran it with nohup to test it, and the error output looked like the following.
It did not seem to be sharing memory when I ran the task.
[htop screenshot]
The `htop` screenshot shows RiboDetector used 35G of memory in total. You can see the two processes used the same 35G; this is the shared memory. There must be some other programs running which occupied ~70% of the total memory. Since you don't have enough memory, it would be better to set a smaller `chunk_size`, e.g. 128. BTW, which version of RiboDetector did you use? I released a new version recently which solved CPU blocking and memory issues for some users.
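Concretely, the suggestion above could look like the following. This is only a sketch: the `--chunk_size` flag name is an assumption here, so check `ribodetector_cpu --help` for the exact option in your installed version.

```shell
# Same command as before, with a smaller chunk size to reduce memory use.
# NOTE: --chunk_size is assumed; verify the flag name with ribodetector_cpu --help.
ribodetector_cpu -t 8 -l 150 \
  -i sample_rmhost_{1,2}.fq.gz \
  -e norrna \
  --chunk_size 128 \
  -o sample_rdnonrrna_{1,2}.fq.gz
```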
I am using 0.2.4, if you want to know. I submitted the RiboDetector task to the big-memory computing node, and the `qstat` summary reported large memory use, around 500-700 GB. I ran `htop` on the node and found that the memory used by the entire node is far less than what the `qstat` summary showed.
Good to know this. I was told the same. I think `qstat` estimates the memory use simply by adding up the memory used by each process, without considering the shared memory mechanism. This may lead to the task being terminated by SGE due to the memory limit. I am looking for a solution/workaround, although I think this is a bug on the SGE side.
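The over-counting described above can be illustrated with toy numbers (all values hypothetical): if two worker processes each report 35 GB of resident memory, but 30 GB of that is the same shared block, summing the per-process figures counts the shared part twice:

```shell
rss_per_proc=35   # GB each process appears to use in htop (hypothetical)
nprocs=2
shared=30         # GB of that which is shared between the processes (hypothetical)

# Scheduler-style accounting: add up every process's resident memory.
naive_sum=$(( rss_per_proc * nprocs ))

# Actual footprint: count the shared block once, plus each process's private part.
actual=$(( shared + (rss_per_proc - shared) * nprocs ))

echo "naive=${naive_sum}GB actual=${actual}GB"
```

With these numbers, the naive sum is 70 GB while the real footprint is 40 GB, which matches the pattern of inflated `qstat` figures reported in this thread.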
@magcurly The new release v0.2.6 can substantially reduce the memory use with the `chunk_size` parameter. You can update it with pip.
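The pip update mentioned above would typically be:

```shell
pip install --upgrade ribodetector
```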
Sorry for the late response. It's a great improvement: I found only about 25 GB used according to `qstat` when I set `chunk_size=1024` with 8 threads. This is a great help for me and my colleagues in setting up RiboDetector in our Snakemake pipeline.
Great thanks to you, @dawnmy.
To whom it may concern,
Hi, I am testing your software on my RNA metavirome data in CPU mode. I am not clear on how the memory use is calculated.
According to the example in your README.md, with a chunk size of 1024 and 20 threads it should take about 20 GB of memory in total. Since I don't have that many threads, I set the threads to 8, and I found the maximum memory use was around 200 GB. I am confused about how the memory use is calculated, and I hope I can reduce it because I am going to run these tasks on an SGE cluster, where jobs demanding large memory take ages in the queue.
Looking forward to your response.
Sincerely, magcurly.