radical-collaboration / extasy-bpti


Containers and VMs on BlueWaters for Software Robustness #6

Open FranklinBetten opened 6 years ago

FranklinBetten commented 6 years ago

I have very limited time to rebuild and test the parts of my workflow in the latest EnTK version, as I can only work on this project after work. I was thinking there may be a solution that keeps me working on the molecular science side of things and minimizes the need for software patching and support in the future.

My current software stack, which works on my laptop, is radical.pilot==0.45.3, saga-python==0.45.1, radical.utils==0.45, and radical.ensemblemd.

This stack worked on BW through January 2018 but no longer does. I have tried rebuilding the VEs and similar fixes, which worked in the past, but that does not seem to be a working solution this time.
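
To narrow down where the BW environment diverges from the laptop one, a small check like the following could be run inside the VE in both places. This is only a sketch: the pinned versions are the ones listed above, `radical.ensemblemd` has no version given, and the helper names are made up for illustration. It assumes Python 3.8+ for `importlib.metadata`; if the stack runs on Python 2.7, `pkg_resources.get_distribution(name).version` would be the analogue.

```python
from importlib import metadata

# Versions taken from the stack listed in this issue.
# None means the issue does not state a version for that package.
PINNED = {
    "radical.pilot": "0.45.3",
    "saga-python": "0.45.1",
    "radical.utils": "0.45",
    "radical.ensemblemd": None,
}


def installed_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def check_stack(pinned, lookup=installed_version):
    """Compare pinned versions against what `lookup` reports as installed."""
    report = {}
    for name, wanted in pinned.items():
        found = lookup(name)
        if found is None:
            report[name] = "missing"
        elif wanted is None or found == wanted:
            report[name] = "ok"
        else:
            report[name] = "mismatch (found %s, wanted %s)" % (found, wanted)
    return report


if __name__ == "__main__":
    for name, status in check_stack(PINNED).items():
        print(name, status)
```

Running it on the laptop and on BW and comparing the two reports would show whether the rebuilt VE actually contains the same releases.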

I had a couple of ideas on this matter.

  1. Attempt to use VMs to run a job as a "local" RP job on BW, with the specific software stack I need (which I know runs) installed in the VM. The job would then write its output files directly to the BW file system for storage and later work. This has the added bonus of enabling me to run multiple jobs simultaneously, which I currently cannot do because of issue #3.
  2. Attempt to use containers to do roughly the same thing as in "1". I am not well read on containers, but from what I understand we could assemble the libraries we need in a container and use those libraries to run our jobs. This would insulate my jobs from changes on BW, and since the radical stack I am using consists entirely of older releases, it shouldn't change, correct? That would mean that, with containers, these jobs should keep working once I get them working. Or is this a false assumption?

I have discussed this with some of the BW admins, and they support two container programs, Singularity and Shifter.

https://bluewaters.ncsa.illinois.edu/singularity

https://bluewaters.ncsa.illinois.edu/shifter
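
For reference, running a command inside a Singularity image generally takes the form `singularity exec [--bind host:container] <image> <command>`. The sketch below only assembles such a command line. The image name, script name, and paths are hypothetical placeholders, and whether the RP stack above behaves correctly inside a container on BW is untested.

```python
import shlex


def singularity_exec_cmd(image, command, binds=None):
    """Build an argument list for `singularity exec`.

    image:   path to the container image
    command: list of strings to run inside the container
    binds:   optional mapping of host path -> path inside the container
    """
    cmd = ["singularity", "exec"]
    for host_path, container_path in (binds or {}).items():
        cmd += ["--bind", "%s:%s" % (host_path, container_path)]
    cmd.append(image)
    cmd += list(command)
    return cmd


if __name__ == "__main__":
    cmd = singularity_exec_cmd(
        image="extasy-stack.simg",              # hypothetical image name
        command=["python", "run_workflow.py"],  # hypothetical entry script
        binds={"/path/to/scratch": "/data"},    # hypothetical paths
    )
    # Print the command instead of running it, so it can be inspected first.
    print(" ".join(shlex.quote(part) for part in cmd))
```

The bind mount is what would let a job inside the container write its output onto the BW file system, as described in idea 1.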

Vivek and Andrea

Question 1 - What are your thoughts on this, and could you help me set up a container for my work so we can try this on BW?

Question 2 - Do you think we could use containers to "trick" RP jobs on BW into running as "local" machines so that I can run many jobs simultaneously?

vivek-bala commented 6 years ago

I don't have much experience with Singularity or Shifter, so I won't be able to help much on that front. Also, I am not aware of a case where we have had any components of RP or EnTK running in a container, so what you are considering might be a completely new attempt (with its own difficulties and bugs).

I do, however, recommend setting up the VE on BW as Andre mentioned in the email and using EnTK 0.7 (or even 0.6.3 if it works out of the box for you). Several people in the lab have been using the latest stack on BW successfully, and because it is so widely used you can expect any issue in RP or EnTK to be addressed as soon as possible. It would be useful if you shared the specific errors you encounter on BW.

andre-merzky commented 6 years ago

I echo Vivek's remarks. As useful as containers sound in this case, there is insufficient expertise in the group to support this on a short time scale.