NREL / buildstockbatch


Incorrect AMI #448

Closed: phgupta closed this issue 2 months ago

phgupta commented 2 months ago

I've been having issues running ResStock simulations on AWS since yesterday. The issue popped up out of the blue, which makes it very peculiar.

I ran a simulation and no EC2 instances were spinning up. AWS Batch gave the error “UNDETERMINED – Batch job is blocked, root cause is undetermined.” I googled the potential causes of that error, and none of them seem to apply to my use case, because the simulations were running just fine until a month ago.

I did a little digging in the codebase and saw that the AMI ID is hard-coded in awsbase.py: `self.batch_compute_environment_ami = "ami-0184013939261b626"`. I searched for that AMI ID in AWS EC2, and I couldn't find the image! Maybe AWS deleted it? I'm not sure.
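To double-check, a query like the following should confirm whether an AMI ID still resolves. This is just a minimal boto3 sketch; the region is an assumption:

```python
import boto3
from botocore.exceptions import ClientError

ec2 = boto3.client("ec2", region_name="us-west-2")  # assumed region

try:
    resp = ec2.describe_images(ImageIds=["ami-0184013939261b626"])
    image = resp["Images"][0]
    print(image["Name"], image["State"])
except ClientError as err:
    # A deregistered or deleted AMI typically surfaces as an
    # InvalidAMIID.* error code rather than an empty result.
    print("AMI lookup failed:", err.response["Error"]["Code"])
```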

I instead tried setting `self.batch_compute_environment_ami` to the following: Ubuntu Server 22.04 LTS (HVM), SSD Volume Type (`ami-08116b9957a259459`) and Amazon Linux 2023 AMI (`ami-0395649fbe870727e`). I chose the 64-bit x86 images because the Dockerfile builds on the `nrel/openstudio:3.7.0` image, which in turn uses linux/amd64. With those two AMI IDs the EC2 instances at least spun up, but AWS Batch still gave the same error.
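When Batch reports a job as blocked, the compute environment's `statusReason` often says why. A minimal boto3 sketch for inspecting it; the compute environment name below is a placeholder, not necessarily the one buildstockbatch creates:

```python
import boto3

batch = boto3.client("batch", region_name="us-west-2")  # assumed region

# "buildstock-compute-env" is a placeholder; substitute the compute
# environment that buildstockbatch created for your project.
resp = batch.describe_compute_environments(
    computeEnvironments=["buildstock-compute-env"]
)
for ce in resp["computeEnvironments"]:
    # An INVALID status whose statusReason mentions the image is a strong
    # hint that the configured AMI can no longer be resolved.
    print(ce["computeEnvironmentName"], ce["status"], ce.get("statusReason"))
```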

I have tried running a sample simulation on both buildstockbatch-2023.06.0/resstock-3.1.0 as well as buildstockbatch-2023.11.0/resstock-3.2.0.

Is this an issue with the AMI, or could it be something else? If it is an AMI issue, what kind of image was `ami-0184013939261b626`? Can you point me to any alternatives?

nmerket commented 2 months ago

You should use the latest from the develop branch, or at this point actually #447. Buildstockbatch no longer uses a hard-coded AMI; it now uses a launch template to configure the Batch instances more flexibly.
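At the Batch API level, the launch-template approach looks roughly like this. This is only a sketch of the mechanism, not buildstockbatch's actual setup; the names, subnets, and roles are placeholders:

```python
import boto3

batch = boto3.client("batch", region_name="us-west-2")  # assumed region

# All resource names/IDs below are placeholders for illustration only.
batch.create_compute_environment(
    computeEnvironmentName="buildstock-ce",
    type="MANAGED",
    serviceRole="AWSBatchServiceRole",
    computeResources={
        "type": "EC2",
        "minvCpus": 0,
        "maxvCpus": 256,
        "instanceTypes": ["optimal"],
        "subnets": ["subnet-0123456789abcdef0"],
        "securityGroupIds": ["sg-0123456789abcdef0"],
        "instanceRole": "ecsInstanceRole",
        # No hard-coded "imageId" here: referencing a launch template lets
        # Batch default to a current ECS-optimized AMI, while the template
        # carries any extra instance configuration (volumes, user data, etc.).
        "launchTemplate": {
            "launchTemplateName": "buildstock-launch-template",
            "version": "$Latest",
        },
    },
)
```

Because nothing pins a specific AMI ID, the compute environment keeps working even after Amazon retires an image.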