rjurney / Agile_Data_Code_2

Code for Agile Data Science 2.0, O'Reilly 2017, Second Edition
http://bit.ly/agile_data_science
MIT License

fix bad URLs for Kafka and Zeppelin, fix pip install for Airflow #91

Closed pjhinton closed 5 years ago

pjhinton commented 5 years ago

This commit contains another raft of fixes for the bootstrap.sh script.

The URLs for Kafka and Zeppelin refer to the host www-us.apache.org, which does not retain copies of archives indefinitely. As new releases drop, older ones disappear from that host. They are still accessible by way of archive.apache.org.

https://archive.apache.org/

Updated the hostnames and URL schemes to use HTTPS. Also noted that the URL echoed for the Kafka install had a path error wherein the version of a directory did not match that of a software release (1.0.0 versus 0.10.1.1)

http://www-us.apache.org/dist/kafka/0.10.1.1/kafka_2.11-1.0.0.tgz

never worked at all.
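A minimal sketch of how the directory/file mismatch could be avoided: build the archive URL from a single version variable so the two parts cannot drift apart. The variable names, the helper function `kafka_archive_url`, and the chosen versions are illustrative assumptions, not the exact contents of bootstrap.sh.

```shell
#!/usr/bin/env bash
# Hypothetical values for illustration; bootstrap.sh may pin different versions.
KAFKA_VERSION="1.0.0"
SCALA_VERSION="2.11"

# Build the URL from one version variable so the directory name and the
# tarball name always agree.
kafka_archive_url() {
  local kafka_version="$1"
  local scala_version="$2"
  echo "https://archive.apache.org/dist/kafka/${kafka_version}/kafka_${scala_version}-${kafka_version}.tgz"
}

KAFKA_URL="$(kafka_archive_url "$KAFKA_VERSION" "$SCALA_VERSION")"
echo "Kafka archive URL: ${KAFKA_URL}"

# The download itself is left commented out so the sketch has no side effects:
# curl -fL --retry 3 -o /tmp/kafka.tgz "${KAFKA_URL}"
```

The same pattern would apply to the Zeppelin download, with its own version variable.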

The PyPI package name for Airflow is now apache-airflow. The old name, airflow, is now a placeholder that fails upon install. Also had to add an environment variable assignment/export to make the install work. Details on both of these issues can be found here:

https://pypi.org/project/airflow/

https://cwiki.apache.org/confluence/display/AIRFLOW/Announcements#Announcements-Nov21,2018
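A rough sketch of what the Airflow install step might look like after the rename. The description above does not name the environment variable; `SLUGIFY_USES_TEXT_UNIDECODE=yes` is an assumption based on the Airflow 1.10 install notes and may not be the one used in this commit. The `DRY_RUN` switch and the `run` helper are hypothetical, added only so the sketch prints the command instead of executing it by default.

```shell
#!/usr/bin/env bash
# Assumption: this is the variable the Airflow 1.10 installer checks for;
# the PR description does not say which variable was actually exported.
export SLUGIFY_USES_TEXT_UNIDECODE=yes

# The package was renamed on PyPI; "airflow" is now a placeholder.
AIRFLOW_PACKAGE="apache-airflow"

# Hypothetical helper: print the command when DRY_RUN=1 (the default here),
# execute it otherwise.
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

run pip install "$AIRFLOW_PACKAGE"
```

Setting `DRY_RUN=0` before running the script would perform the actual install.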

Finally, commented out a line of repeated "=" characters, as it appears to be just visual spacing. The shell was treating it as a command and reporting "command not found" errors.
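To illustrate why the separator line fails, here is a small sketch. The shell treats a bare run of "=" characters as a command name, since there is no variable name before the first "=" to make it an assignment. The separator length and the variable `SEPARATOR_STATUS` are illustrative, not taken from bootstrap.sh.

```shell
#!/usr/bin/env bash
# Run the bare separator in a child shell so the failure does not abort
# this script; capture its exit status.
SEPARATOR_STATUS=0
sh -c '==========' 2>/dev/null || SEPARATOR_STATUS=$?
echo "bare separator exit status: ${SEPARATOR_STATUS}"

# Fix 1: comment the separator out, as this commit does.
# ==========

# Fix 2: print it if the visual spacing is wanted in the output.
echo "=========="
```

The child shell exits with status 127, which POSIX reserves for a command that could not be found.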

rjurney commented 5 years ago

Thanks!