Hey Stephen,
Took your software for a nice spin again! This time, I hit performance issues when running it in parallel on a 64-core EC2 instance with almost half a terabyte of RAM. I traced the issue to my flock of MicrobeCensus instances putting all of their temporary files on the slow EBS disk, rather than on the nice RAM disk I had provisioned for the running directory.
From poking around the Python code and reading the docs, it seems that the directory mkstemp() writes to can be controlled with environment variables such as TMPDIR: https://docs.python.org/2/library/tempfile.html
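In case it helps with the docs, here is a minimal sketch of the behaviour I mean, assuming the temporary files are created by tempfile.mkstemp() with no explicit dir argument (I have not checked every call site). The helper name make_temp_in and the path /mnt/ramdisk are made up for illustration.

```python
import os
import tempfile


def make_temp_in(directory):
    """Point TMPDIR at `directory`, create one temp file, and return its path.

    `make_temp_in` is a hypothetical helper for illustration only.
    """
    os.environ["TMPDIR"] = directory
    # tempfile caches its chosen directory in tempfile.tempdir after first use,
    # so clear the cache to make it read the environment again.
    tempfile.tempdir = None
    fd, path = tempfile.mkstemp()
    os.close(fd)
    return path


if __name__ == "__main__":
    # /mnt/ramdisk is an assumed mount point; substitute your own RAM disk.
    target = "/mnt/ramdisk" if os.path.isdir("/mnt/ramdisk") else tempfile.mkdtemp()
    print(make_temp_in(target))
```

In practice nothing needs to change in the code: setting TMPDIR in the shell before launching each instance should be enough, since Python reads it the first time a temporary file is requested. If the directory named by TMPDIR does not exist or is not writable, tempfile falls back to its other candidate locations.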
Could you add a mention of this to the MicrobeCensus docs? It might help the next person wondering how to optimize disk I/O.
Thanks again for your quality software!