aws-samples / aws-concurrent-data-orchestration-pipeline-emr-livy

This code demonstrates the architecture featured on the AWS Big Data blog (https://aws.amazon.com/blogs/big-data/), which creates a concurrent data pipeline using Amazon EMR and Apache Livy. The pipeline is orchestrated by Apache Airflow.
Apache License 2.0
76 stars 33 forks

Set env variable AIRFLOW_GPL_UNIDECODE=yes when installing airflow #2

Open dennisylyung opened 5 years ago

dennisylyung commented 5 years ago

Avoids the following error when running sudo pip install apache-airflow: RuntimeError: By default one of Airflow's dependencies installs a GPL dependency (unidecode). To avoid this dependency set SLUGIFY_USES_TEXT_UNIDECODE=yes in your environment when you install or upgrade Airflow. To force installing the GPL version set AIRFLOW_GPL_UNIDECODE

Issue #1

Description of changes: In airflow.yaml.Resources.EC2Instance.Properties.UserData, changed sudo pip install apache-airflow to sudo AIRFLOW_GPL_UNIDECODE=yes pip install apache-airflow, and made the same change for apache-airflow[crypto] and apache-airflow[postgres].
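
For illustration, the three changed install lines would look roughly like the sketch below. This shows only the pip commands named in the description; the rest of the UserData script is not reproduced here, and the quoting around the extras is an assumption added to keep the shell from interpreting the brackets.

```shell
# Sketch of the changed install lines in the EC2 UserData script.
# The variable is placed after sudo so that it is passed to pip;
# an export in the calling shell may be dropped when sudo resets the environment.
sudo AIRFLOW_GPL_UNIDECODE=yes pip install apache-airflow
sudo AIRFLOW_GPL_UNIDECODE=yes pip install "apache-airflow[crypto]"
sudo AIRFLOW_GPL_UNIDECODE=yes pip install "apache-airflow[postgres]"
```

The error message itself names SLUGIFY_USES_TEXT_UNIDECODE=yes as the alternative setting for anyone who wants to avoid the GPL dependency instead; it would be placed in the same position on each line.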