# Azure-Synapse-Analytics-PoC

## Description
Create a Synapse Analytics environment based on best practices to achieve a successful proof of concept. While settings can be adjusted,
the major deployment differences come down to whether or not you use Private Endpoints for connectivity. If you do not already use
Private Endpoints for other Azure deployments, using them for a proof of concept is discouraged, as they carry more networking
dependencies than can be configured here.
## How to Run
These commands should be executed from the Azure Cloud Shell at https://shell.azure.com using PowerShell:
```powershell
rm -rf Azure-Synapse-Analytics-PoC
git clone https://github.com/tonio-lora/Azure-Synapse-Analytics-PoC
cd Azure-Synapse-Analytics-PoC
bash setup.sh
bash configure.sh
./upload_sql_scripts.ps1
```
- There are a few variables in terraform.tfvars that can optionally be updated to reflect your environment (e.g. synapse_azure_ad_admin_upn) before you run the setup.sh script; see the sketch after this list.
- setup.sh is the bash script that uses Terraform to deploy the environment. configure.sh performs post-deployment configuration that cannot be done with Terraform.
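If you just want a quick look before deploying, the Cloud Shell editor works well for this. A minimal sketch: synapse_azure_ad_admin_upn is the only variable named above, so treat any other names you find in the file as repo-specific rather than relying on this example.

```powershell
# Open terraform.tfvars in the Cloud Shell editor and set values such as
# synapse_azure_ad_admin_upn before running setup.sh.
code terraform.tfvars

# Or just confirm the admin UPN line reads the way you expect:
grep synapse_azure_ad_admin_upn terraform.tfvars
```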
## What's Deployed

### Azure Synapse Analytics Workspace
- DW1000 Dedicated SQL Pool (see the quick check after this list)
- Sample SQL Scripts and Spark Notebooks
- Metadata driven Data Loader pipeline to quickly onboard parquet files available in the Data Lake
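After setup.sh completes, one way to confirm the Dedicated SQL Pool deployed is the Azure CLI; this is a sketch, and the pool, workspace, and resource group names below are placeholders for whatever your deployment created.

```powershell
# Placeholders: substitute the pool, workspace, and resource group names
# from your deployment.
az synapse sql pool show `
  --name <pool-name> `
  --workspace-name <workspace-name> `
  --resource-group <resource-group> `
  --query "{name:name, sku:sku.name, status:status}" `
  --output table
```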
### Azure Data Lake Storage Gen2
- config container for Azure Synapse Analytics Workspace
- data container for queried/ingested data including AdventureWorksDW2019 in parquet format
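To watch the sample files land, you can list the data container from the Cloud Shell. A sketch only: `<storage-account>` is a placeholder for the ADLS Gen2 account this deployment creates, while the data container name comes from the list above.

```powershell
# Placeholder: <storage-account> is the ADLS Gen2 account from this deployment.
az storage fs file list `
  --file-system data `
  --account-name <storage-account> `
  --auth-mode login `
  --output table
```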
### Azure Log Analytics
- Logging and telemetry for Azure Synapse Analytics
- Logging and telemetry for Azure Data Lake Storage Gen2
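Once diagnostics start flowing, you can peek at them with a Kusto query from the CLI. This is a sketch: the workspace GUID is a placeholder, and which tables appear depends on the diagnostic settings the deployment configures.

```powershell
# Placeholder: <workspace-guid> is the Log Analytics workspace (customer) ID.
az monitor log-analytics query `
  --workspace <workspace-guid> `
  --analytics-query "AzureDiagnostics | summarize count() by ResourceProvider" `
  --output table
```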
## What's Configured
- Enable Result Set Caching (a verification sketch follows this list)
- Create a pipeline to auto pause/resume the Dedicated SQL Pool
- Feature flag to enable/disable Private Endpoints
- Serverless SQL Demo Data Database
- Proper service and user permissions for Azure Synapse Analytics Workspace and Azure Data Lake Storage Gen2
- Parquet Auto Ingestion pipeline to optimize data ingestion using best practices
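As one spot check after configure.sh runs, Result Set Caching is visible from sys.databases on the dedicated SQL endpoint. A sketch, assuming SQL authentication: the server, login, and password values below are placeholders for your deployment.

```powershell
# Placeholders: your workspace's dedicated SQL endpoint and SQL admin login.
sqlcmd -S <workspace-name>.sql.azuresynapse.net -d master -U <sql-admin-login> -P '<password>' `
  -Q "SELECT name, is_result_set_caching_on FROM sys.databases;"
```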
## Optional Steps
- Load the sample parquet files into the Dedicated SQL Pool. If you have additional files, just add them to the Parquet_Auto_Ingestion_Metadata.csv stored in the data container.
- Download the sample Power BI file from the Azure Cloud Shell and change the connection to use your new Synapse workspace. This sample file includes a report that uses the tables loaded in the previous step.
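If you're not sure where the sample .pbix lives in the cloned repo, search for it first, then pull it down with the Cloud Shell Upload/Download toolbar (or the bash download helper). The repo path below assumes the clone location from the How to Run steps; the report's exact path isn't listed above, so locate it rather than guessing.

```powershell
# Find the sample Power BI report inside the cloned repo.
find ~/Azure-Synapse-Analytics-PoC -name '*.pbix'
```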