I've been going back and forth with Lisa Gerhardt at NERSC, and I think it should now be possible to do everything here in the Shifter environment, which runs Docker containers on the compute nodes. This moves metadata operations to the compute nodes, which drastically decreases Python import times, so there's some chance for performance improvement here. (I don't know if it would help with the sextractor time; I'll have to try.)
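To check whether import times really drop, something like the following could be run once in the current environment and once inside the container. This is only a rough sketch: the helper name `time_cold_import` is made up for illustration, and the standard-library modules listed are placeholders for whatever packages the pipeline actually imports.

```python
import subprocess
import sys
import time


def time_cold_import(module_name, python=sys.executable):
    """Return wall-clock seconds to start a fresh interpreter and import one module.

    A fresh subprocess is used so the module is not already cached in
    sys.modules; the figure therefore includes interpreter startup.
    """
    start = time.perf_counter()
    subprocess.run([python, "-c", f"import {module_name}"], check=True)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Placeholder modules; swap in the packages the pipeline really uses.
    for name in ("json", "sqlite3"):
        print(f"{name}: {time_cold_import(name):.3f} s")
```

The first run on a node may be slower than later ones because of filesystem caching, so it is probably worth repeating the measurement a few times before comparing numbers.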
I've had the Docker image mostly set up for a while, so hopefully there will be a branch for this coming soon. On top of the performance improvements, this completely removes installation from the equation, which is awesome. (Unfortunately, not all supercomputing centers have an HPC containerization setup; it's a burgeoning field.)
If you think like me, this supercomputing + containerization combination is just super awesomely cool, so I hope to have time for this!