snayfach / MicrobeCensus

MicrobeCensus estimates the average genome size of microbial communities from metagenomic data
http://genomebiology.com/2015/16/1/51
GNU General Public License v3.0
41 stars, 16 forks

Doc request: TMPDIR #13

Closed by taltman 7 years ago

taltman commented 7 years ago

Hey Stephen,

Took your software for a nice spin again! This time, I hit performance issues when running it in parallel on a 64-core EC2 instance with almost half a terabyte of RAM. I traced the issue to my flock of MicrobeCensus instances putting all of their temporary files on the slow EBS disk, rather than on the nice RAM disk I had provisioned for the running directory.

From poking around the Python code and RTFM, it seems that the choice of the mkstemp() location can be controlled using environment variables like TMPDIR: https://docs.python.org/2/library/tempfile.html
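For anyone wanting to check this behavior, here is a minimal sketch using only the standard `tempfile` module. The helper name `mkstemp_under` is hypothetical and is not part of MicrobeCensus. It assumes the target directory already exists and is writable. If it does not, `tempfile` silently falls back to its other candidate directories.

```python
import os
import tempfile


def mkstemp_under(tmpdir):
    """Create a temp file the way mkstemp() would if TMPDIR pointed at `tmpdir`.

    Hypothetical helper for illustration only; returns the new file's path.
    """
    saved_env = os.environ.get("TMPDIR")
    saved_cache = tempfile.tempdir
    os.environ["TMPDIR"] = tmpdir
    # tempfile caches its chosen directory after the first lookup,
    # so clear the cache to force it to re-read the environment.
    tempfile.tempdir = None
    try:
        fd, path = tempfile.mkstemp()
        os.close(fd)
        return path
    finally:
        # Restore the previous environment and cached directory.
        if saved_env is None:
            os.environ.pop("TMPDIR", None)
        else:
            os.environ["TMPDIR"] = saved_env
        tempfile.tempdir = saved_cache
```

In practice the variable would be set in the shell before launching each instance, for example `export TMPDIR=/mnt/ramdisk`, where `/mnt/ramdisk` is a hypothetical mount point. Because the directory is cached on first use, changing `TMPDIR` inside an already running Python process has no effect unless the cache is cleared as above.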

I would ask if you could add some mention of this in the MicrobeCensus docs, in the hope that it might help the next person wondering how to optimize the disk I/O.

Thanks again for your quality software!

snayfach commented 7 years ago

Thanks for the useful feedback, Tomer! I've added this information to the README.