tba123 / rna-star

Automatically exported from code.google.com/p/rna-star

STAR not scaling to 60 cores... or even 10. #17

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Start STAR on an RNA dataset
2. Specify --runThreadN 60 (on a 160-core machine)
3. Watch STAR use only 6-7 cores (top shows <800% CPU)

What is the expected output? What do you see instead?

In your publication you write that STAR scales well.
I was hoping for that, but I can't get it to scale.

What version of the product are you using? On what operating system?
Hard to say, since STAR does not print its version, but I can try to update.
Fedora 20, 64-bit

Please provide any additional information below.
I also played with the 'genomeLoad' parameter: a colleague told me he had seen
some strange things and advised me to disable the genomeLoad option.
However, I cannot see a difference in speed (or scaling) between "genomeLoad 
NoSharedMemory" and "genomeLoad LoadAndKeep" (followed by "genomeLoad 
LoadAndKeep").

I'll try updating STAR; let's see.

Original issue reported on code.google.com by goo...@schwarzelan.de on 24 Feb 2014 at 1:25

GoogleCodeExporter commented 9 years ago
Updating to 2.3.0e did the trick... or so I thought.

For a couple of seconds I saw STAR using 5900% CPU (= 59 cores);
now (for the majority of the mapping time) it is down to 500% (= 5 cores).

...

Original comment by goo...@schwarzelan.de on 24 Feb 2014 at 4:11

GoogleCodeExporter commented 9 years ago
To be more precise:
With an input of 2x12.5 million sequences, STAR runs on 60 cores for about a
minute, summing up to 60-70 minutes of CPU time.
Afterwards it is slow going with 5 threads until it hits 120 minutes of
CPU time, followed by several more minutes in single-threaded mode.

Original comment by goo...@schwarzelan.de on 24 Feb 2014 at 4:42
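
The figures in the comment above can be turned into a rough estimate of how many cores the run keeps busy on average. This is only a back-of-the-envelope sketch: the first two phases use the reported numbers (taking 60 CPU-minutes from the 60-70 range), while the length of the final single-threaded phase is not given in the report, so the 5 minutes used here is purely an assumption.

```python
def wall_minutes(phases):
    """Wall-clock minutes for a list of (active_cores, cpu_minutes) phases."""
    return sum(cpu / cores for cores, cpu in phases)


def average_cores(phases):
    """Average number of busy cores: total CPU time divided by wall time."""
    total_cpu = sum(cpu for _, cpu in phases)
    return total_cpu / wall_minutes(phases)


phases = [
    (60, 60.0),  # about 1 minute on 60 cores, roughly 60 CPU-minutes
    (5, 60.0),   # 5 threads until CPU time reaches about 120 minutes
    (1, 5.0),    # ASSUMPTION: single-threaded tail of 5 minutes (not reported)
]

print(f"wall time: {wall_minutes(phases):.1f} min")
print(f"average busy cores: {average_cores(phases):.1f}")
```

Under these assumptions most of the wall time is spent in the 5-thread phase, which matches the observation that top shows only a few hundred percent CPU for most of the run.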

GoogleCodeExporter commented 9 years ago
Please post your questions in the STAR forum 
https://groups.google.com/d/forum/rna-star for a faster reply.
As you increase the number of threads, the main limiting factor is the disk I/O 
bandwidth, especially the writing of the SAM files. If you have enough RAM, 
please try running STAR from the /dev/shm/ virtual disk (both input and output 
files). This would give us an idea of what portion of the slowdown is caused by 
the disk bandwidth.

Original comment by adobin@gmail.com on 25 Feb 2014 at 10:23
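
One possible way to set up the /dev/shm test suggested above is sketched below. It only builds the command line, so nothing is executed. The directory /dev/shm/star_test, the read file names, and the genome index path are hypothetical placeholders, and the helper build_star_command is not part of STAR; the read files are assumed to have been copied into the working directory beforehand.

```python
import shlex
from pathlib import Path


def build_star_command(work_dir, genome_dir, threads, read_files):
    """Build a STAR command whose reads and output live under work_dir."""
    work = Path(work_dir)
    # Reads are expected to sit directly inside work_dir (copied there first).
    reads = [str(work / Path(name).name) for name in read_files]
    return [
        "STAR",
        "--runThreadN", str(threads),
        "--genomeDir", str(genome_dir),
        "--readFilesIn", *reads,
        # Trailing slash so STAR writes its output files into this directory.
        "--outFileNamePrefix", str(work / "out") + "/",
    ]


# Hypothetical paths, for illustration only.
cmd = build_star_command(
    work_dir="/dev/shm/star_test",
    genome_dir="/path/to/genome_index",
    threads=60,
    read_files=["reads_1.fastq", "reads_2.fastq"],
)
print(shlex.join(cmd))
```

Running the same command once with work_dir on the regular disk and once under /dev/shm, and comparing the wall-clock times, would indicate how much of the slowdown is due to disk bandwidth.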