rjurney / Agile_Data_Code_2

Code for Agile Data Science 2.0, O'Reilly 2017, Second Edition
http://bit.ly/agile_data_science
MIT License
456 stars 306 forks

issue running on ec2: ERROR: JAVA_HOME /usr/lib/jvm/java-8-oracle does not exist. #99

Closed Cheukting closed 5 years ago

Cheukting commented 5 years ago

I launched an EC2 instance using ec2.sh. After connecting, the error message 'ERROR: JAVA_HOME /usr/lib/jvm/java-8-oracle does not exist.' shows up. It seems that nothing that depends on Java was installed: the 'kafka/' and 'spark/' directories are both empty.

mpuels commented 5 years ago

I have the same issue. The problem is that the PPA (https://launchpad.net/~webupd8team/+archive/ubuntu/java) that's used to install Java 8 doesn't support Ubuntu 17.10. Hence, I've modified the AMI id in ec2.sh for my region us-east-2 to an EC2 image with Ubuntu 18.04: ami-0f65671a86f061fcd. Hope that helps!

rjurney commented 5 years ago

It seems I need to upgrade the entire thing to Ubuntu 18.X. I'll try to do that this weekend.

rjurney commented 5 years ago

I fixed this.

arnaudbouffard commented 5 years ago

@rjurney it seems this issue is still open.

Seeing you fixed the elasticsearch not launching issue recently (https://github.com/rjurney/Agile_Data_Code_2/issues/73#issuecomment-489534617), I restarted the entire EC2 setup process from the beginning (and thus benefiting from your latest commits).

I unfortunately did not manage to get up to the elasticsearch part. The pyspark command did not work (even though it had worked a few days ago).

I did some diagnosis. Listing /usr/lib shows no jvm directory:

ubuntu@ip-172-31-35-113:/usr/lib$ ls
accountsservice    file             language-selector     linux-boot-probes  os-release   software-properties
apt                gcc              libDeployPkg.so.0     locale             pm-utils     ssl
binfmt.d           git-core         libDeployPkg.so.0.0.0 lxcfs              policykit-1  sudo
byobu              gnupg            libguestlib.so.0      lxd                python2.7    systemd
cloud-init         groff            libguestlib.so.0.0.0  man-db             python3      tar
command-not-found  grub             libhgfs.so.0          mime               python3.6    tc
dbus-1.0           grub-legacy      libhgfs.so.0.0.0      modules-load.d     python3.7    tmpfiles.d
dpkg               initcpio         libvgauth.so.0        openssh            rsyslog      ubuntu-release-upgrader
dracut             initramfs-tools  libvgauth.so.0.0.0    open-vm-tools      sasl2        update-notifier
eject              kernel           libvmtools.so.0       os-prober          sftp-server  valgrind
environment.d      klibc            libvmtools.so.0.0.0   os-probes          snapd        x86_64-linux-gnu
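
A more targeted version of this check can be sketched as a small shell function. This is only an illustration: `check_java_home` is a hypothetical helper that is not part of the repository, and the default path is simply the one quoted in the error message.

```shell
# Sketch: report whether a JAVA_HOME candidate exists and, if not,
# show what sits next to it. check_java_home is a hypothetical helper.
check_java_home() {
  # $1: directory to test; defaults to $JAVA_HOME, then to the path from the error.
  local candidate parent
  candidate="${1:-${JAVA_HOME:-/usr/lib/jvm/java-8-oracle}}"
  if [ -d "$candidate" ]; then
    echo "OK: $candidate exists"
    return 0
  fi
  echo "MISSING: $candidate does not exist"
  # List the parent directory so any other installed JVMs become visible.
  parent="$(dirname "$candidate")"
  if [ -d "$parent" ]; then
    echo "Contents of $parent:"
    ls -1 "$parent"
  else
    echo "$parent does not exist either"
  fi
  return 1
}
```

Calling `check_java_home` with no arguments on the instance tests the current `$JAVA_HOME`, so an empty or missing /usr/lib/jvm shows up directly instead of having to be spotted in a full listing.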

How did you fix this yesterday (https://github.com/rjurney/Agile_Data_Code_2/issues/99#issuecomment-489511157)? Shouldn't the EC2 setup also use Ubuntu Bionic, like the Vagrant Setup?

I've again spent quite some time on the setup today, and am eager to move on to the actual Data Science soon. I could continue trying to fix this (looking for the Ubuntu Bionic EC2 image) but fear it will trigger a new cascade of dependency changes... and I'm really out of my comfort zone here...

Hope we can fix this soon! Thanks in advance

rjurney commented 5 years ago

I'm not upgrading Ubuntu at this time. I don't see any of these errors. How long did you wait before logging in? Setup can take 20 minutes to complete, and I believe you logged in before most of it had finished. The downloads are large and take a lot of time.

ls -l /etc/apt/sources.list.d/
openjdk-r-ubuntu-ppa-artful.list

Everything works fine. I'm moving the setup of the login message to the end of the script so you will know if it is done or not.

Make sure to git pull origin master again after you receive this, there are more improvements from today.
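
One rough way to tell whether setup has finished is to check that the directories reported empty at the top of this thread ('kafka/' and 'spark/') have been filled in. The sketch below assumes that is a reasonable signal. `dir_is_populated` and `setup_looks_complete` are hypothetical helpers, and the base directory passed in is an assumption about where the checkout lives on the instance.

```shell
# Sketch: check that the install directories are no longer empty.
# Both functions are hypothetical helpers, not part of the repository.
dir_is_populated() {
  # True when $1 is a directory containing at least one entry.
  [ -d "$1" ] && [ -n "$(ls -A "$1" 2>/dev/null)" ]
}

setup_looks_complete() {
  # $1: base directory; remaining arguments: subdirectories to check.
  local base name status
  base="$1"
  status=0
  shift
  for name in "$@"; do
    if dir_is_populated "$base/$name"; then
      echo "$name: populated"
    else
      echo "$name: empty or missing"
      status=1
    fi
  done
  return "$status"
}
```

For example, `setup_looks_complete "$HOME/Agile_Data_Code_2" spark kafka` (the path is assumed) returns non-zero while either directory is still empty, which would suggest the downloads are still running.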

rjurney commented 5 years ago

Just pushed a fix for the motd, it will say incomplete until it is loaded.

rjurney commented 5 years ago

Damn, broke it cause of script size... one sec.

rjurney commented 5 years ago

Ok, should work now. Testing again...

rjurney commented 5 years ago

I have just updated the Ubuntu images to 18.10 as I'm afraid the old ones will disappear. Testing now.

rjurney commented 5 years ago

The message upon ssh login now warns you if setup hasn't finished. The log for the install process is now in your home directory, so to watch progress, run: tail -f ec2_bootstrap.sh.log
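
Besides following the log live, it can be scanned for failures. The sketch below assumes that failures appear in the log as lines containing the string ERROR, like the JAVA_HOME message quoted in this issue. The actual log format is not shown in this thread, so treat the pattern as a guess. `bootstrap_errors` is a hypothetical helper.

```shell
# Sketch: list lines containing ERROR in the bootstrap log.
# bootstrap_errors is a hypothetical helper; the ERROR pattern is an assumption.
bootstrap_errors() {
  # $1: path to the log; defaults to the file name mentioned above, in $HOME.
  local log
  log="${1:-$HOME/ec2_bootstrap.sh.log}"
  if [ ! -f "$log" ]; then
    echo "no log at $log yet"
    return 2
  fi
  # grep -n prefixes each match with its line number and exits 0 on a match.
  if grep -n 'ERROR' "$log"; then
    return 1
  fi
  echo "no ERROR lines in $log so far"
  return 0
}
```

The return code distinguishes three cases: 0 when no matching lines were found, 1 when at least one was printed, and 2 when the log file does not exist yet.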

rjurney commented 5 years ago

Ok, I see this error now. Thanks. Working on it.

rjurney commented 5 years ago

Now that I have upgraded the images to 18.10 again, the Java install works. To verify:

rm ./.ec2_hostname ./.reservation_id ./agile_data_science.pem
./ec2.sh
# 10 minutes later
./ec2_create_tunnel.sh

Visit http://localhost:8888/Welcome.ipynb

arnaudbouffard commented 5 years ago

pyspark, elasticsearch, they all work now. Thanks a lot for your responsiveness @rjurney 👍

rjurney commented 5 years ago

@arnaudbouffard Sorry it took a while, I am teaching a class next week so I had to update it. It is hard to keep this complex environment working at all times. The community has stepped up with maintaining Vagrant but EC2 is broken a lot of the time unfortunately. Check out Welcome.ipynb in the Jupyter notebooks now - it now has the entire book's example content linked so you can run it all on the instance. Enjoy! :)

JelenaSparic commented 4 years ago

Same issue again. I launched an EC2 instance using ec2.sh. After connecting, the error message 'ERROR: JAVA_HOME /usr/lib/jvm/java-8-oracle does not exist.' shows up. It seems that nothing that depends on Java was installed: the 'kafka/' and 'spark/' directories are both empty.