rjurney / Agile_Data_Code_2

Code for Agile Data Science 2.0, O'Reilly 2017, Second Edition
http://bit.ly/agile_data_science
MIT License
456 stars 307 forks

Half of the time Agile_Data_Code_2 is not present in EC2 instance during startup #19

Closed. md6nguyen closed this issue 7 years ago

md6nguyen commented 7 years ago

About 50% of the time when I spin up an EC2 instance, the directory Agile_Data_Code_2 does not exist. It turns out that this happens because the apt-get process seems to get stuck. This is the first line in aws/ec2_bootstrap.sh: sudo apt-get update && sudo apt-get upgrade -y

I think there may be a timing issue here, where apt-get is called before the system is ready. It would be safer to remove the bootstrap script aws/ec2_bootstrap.sh from ec2.sh and let users scp the script to the instance and run it manually once they ssh in.
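If the cause really is apt-get running before the system is ready, one alternative to removing the script would be to retry the call a few times. The following is only a sketch under that assumption: the `retry` helper, the attempt count, and the delay are all hypothetical and not part of aws/ec2_bootstrap.sh. It would not help if apt-get hangs rather than fails.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not in the repository): run a command until it
# succeeds, up to a maximum number of attempts, pausing between attempts.
#   $1 = maximum attempts, $2 = delay in seconds, rest = command to run
retry() {
    local attempts="$1" delay="$2"
    shift 2
    local n=1
    until "$@"; do
        if [ "$n" -ge "$attempts" ]; then
            echo "Giving up after $n attempts: $*" >&2
            return 1
        fi
        echo "Attempt $n failed; retrying in ${delay}s: $*" >&2
        n=$((n + 1))
        sleep "$delay"
    done
}

# Possible use at the top of the bootstrap script (values are guesses):
#   retry 5 30 sudo apt-get update
```

Because the helper only reacts to a non-zero exit status, a hung apt-get would still block the script. Wrapping the command in coreutils `timeout` could turn a hang into a failure that `retry` can act on.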

rjurney commented 7 years ago

Adding a boot script is a well-known technique, but I have noticed the problem too and don't understand what is happening. I will look into it, thanks.

md6nguyen commented 7 years ago

I agree it's very convenient if the bootstrap script runs automatically, but it has to be reliable.

rjurney commented 7 years ago

I think there was an issue where the apt-get screen was sometimes coming up, which someone fixed in a pull request. Would you mind trying this again? I will do the same on my side.
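The pull request is not identified in this thread, so its actual change is unknown. As a guess at the general kind of fix, an unattended script can stop apt-get from showing configuration screens by setting `DEBIAN_FRONTEND=noninteractive` and telling dpkg how to resolve config-file conflicts. The `apt_quiet` wrapper name below is hypothetical.

```shell
#!/usr/bin/env bash

# Hypothetical wrapper (not in the repository): run apt-get with no
# interactive prompts.
#   DEBIAN_FRONTEND=noninteractive  suppresses debconf dialogs
#   --force-confdef                 takes the default action on config conflicts
#   --force-confold                 otherwise keeps the currently installed config file
apt_quiet() {
    sudo DEBIAN_FRONTEND=noninteractive apt-get -y \
        -o Dpkg::Options::="--force-confdef" \
        -o Dpkg::Options::="--force-confold" \
        "$@"
}

# Possible replacement for the first line of the bootstrap script:
#   apt_quiet update && apt_quiet upgrade
```

The variable is passed on the `sudo` command line rather than exported, because sudo may otherwise drop it from the environment.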

md6nguyen commented 7 years ago

Yes, it seems to be okay now. I tried 3 times and they all went through. However, the bootstrap script takes around 15 minutes to complete. I think we should write the status to a log file, with a confirmation message when the setup is complete. Users can check the log and wait until it is complete, so that they don't start working while the setup is still in progress. I'll file a separate issue for this.
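A rough sketch of what that logging could look like, assuming a log path and completion marker chosen here purely for illustration (`LOG_FILE`, `DONE_MARKER`, `log_status`, and `bootstrap_done` are all hypothetical names, not taken from the repository):

```shell
#!/usr/bin/env bash

# Hypothetical log location and completion marker; adjust as needed.
LOG_FILE="${LOG_FILE:-/tmp/ec2_bootstrap.log}"
DONE_MARKER="BOOTSTRAP COMPLETE"

# Append a UTC-timestamped status line to the log file.
log_status() {
    printf '%s %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$*" >> "$LOG_FILE"
}

# Succeed only once the completion marker has been written to the log.
bootstrap_done() {
    grep -q "$DONE_MARKER" "$LOG_FILE" 2>/dev/null
}

# In the bootstrap script:
#   log_status "Starting setup"
#   ... install steps ...
#   log_status "$DONE_MARKER"
#
# On the instance, a user could wait for completion with:
#   until bootstrap_done; do sleep 30; done
```

Writing the marker as the very last step means its presence in the log implies that every earlier step has already run.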