grantbuster closed this 2 years ago
And I should note that we started scoping this out and then realized "hey, we can already do all of this thanks to Michael and John Readey!"
Well done, glad it wasn't too "scary"! So by local HSDS cluster, are you just running a single docker container on each "node", or are you running an HSDS cluster in parallel to the "nodes"?
A little scary but I ended up getting hooked and needed to figure it out haha.
Yeah, so we got Lambda working, but the problem with that is reV makes a lot of really small requests at regular intervals, and each Lambda call needs to spin up a service, which has a lot of overhead.
The current solution is to submit reV jobs on the AWS parallel cluster via slurm (just like the NREL HPC). When a reV job gets an EC2 node it runs a shell script first that checks to see if there is an HSDS server running on that node. If the node just spun up, the shell script starts the local HSDS server on that node with N parallel docker instances. If the node already has an HSDS server running, it just moves on. It works pretty well!!
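For anyone who wants to try the same "start it only if it's not already there" idea, here is a rough Python sketch of the check described above. It is not the actual shell script from the repo. The endpoint `http://localhost:5101`, the `/about` health check, and the function names `hsds_is_running` and `ensure_local_hsds` are all assumptions for illustration, and the start command is whatever you use to launch HSDS on the node.

```python
import subprocess
import urllib.error
import urllib.request


def hsds_is_running(endpoint="http://localhost:5101", timeout=2.0):
    """Return True if something answers HTTP 200 at <endpoint>/about.

    The endpoint and the /about path are assumptions; adjust for your setup.
    """
    try:
        with urllib.request.urlopen(endpoint + "/about", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, HTTP error, etc.: treat as "not running".
        return False


def ensure_local_hsds(start_cmd, is_running=hsds_is_running, run=subprocess.run):
    """Start a local HSDS server only if none is answering on this node.

    start_cmd is the (hypothetical) command list that launches HSDS with
    N parallel docker instances. Returns True if this call started the
    server, False if one was already running and we just moved on.
    """
    if is_running():
        return False
    run(start_cmd, check=True)
    return True
```

A job wrapper would call something like `ensure_local_hsds(["./start_hsds.sh", "4"])` before launching reV, where `start_hsds.sh` is a placeholder name for your own startup script. Passing `is_running` and `run` as parameters just makes it easy to test the logic without a real server.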
Kubernetes for the HSDS service works too, but it's more setup and requires more EC2 instances, so I think we'll just put this aside for now.
That's awesome! Well done and yay for HSDS!
yay indeed! you and John kicked ass.
Haha hey there! Yes we did it! Did you check out the aws_pcluster readme? We tried a few things and I figured out how to get HSDS local servers set up with minimal effort. It runs!