solliancenet / synapse-in-a-day-deployment

Demo setup: Realize Integrated Analytical Solutions with Azure Synapse Analytics

Requirements

  1. An Azure Account with the ability to create an Azure Synapse Workspace

  2. Make sure the following resource providers are registered for your Azure Subscription.

    • Microsoft.Sql
    • Microsoft.Synapse
    • Microsoft.StreamAnalytics
    • Microsoft.EventHub

    See further documentation for more information on registering resource providers on the Azure Portal.

  3. A Power BI Pro or Premium account to host Power BI reports, dashboards, and configuration of streaming datasets.
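
The resource providers listed in requirement 2 can also be registered from PowerShell instead of the portal. The following is a minimal sketch, not part of the original setup scripts. It assumes the Az PowerShell module is installed and that you have already signed in with Connect-AzAccount against the subscription you plan to deploy into.

```powershell
# Assumes: Az PowerShell module installed and Connect-AzAccount already run
# for the target subscription (neither is set up by this guide at this point).
$providers = @(
    'Microsoft.Sql',
    'Microsoft.Synapse',
    'Microsoft.StreamAnalytics',
    'Microsoft.EventHub'
)

foreach ($namespace in $providers) {
    # Starts registration; it can take a few minutes to finish.
    Register-AzResourceProvider -ProviderNamespace $namespace | Out-Null
}

# Report the current registration state of each provider.
foreach ($namespace in $providers) {
    Get-AzResourceProvider -ProviderNamespace $namespace |
        Select-Object ProviderNamespace, RegistrationState -Unique
}
```

Registration is asynchronous, so rerun the second loop until every provider reports Registered before continuing.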

Environment setup instructions

Note:

The entire setup process will take from 1.5 to 2 hours to complete.

Azure Setup

Task 1: Create a Power BI workspace

Power BI workspace names must be unique within your tenant, so create the workspace first. You will then reuse the same name for the Azure resource group you create in the next task.

  1. Sign in to the Power BI Portal using your Azure credentials.

  2. Select Workspaces in the left-hand menu (1), then select Create a workspace (2).

    The Workspaces menu item and Create a workspace button are highlighted.

  3. In the form, enter synapse-in-a-day-demos (no spaces or special characters) into the Workspace name field (append a unique value to the name if the workspace name is already in use), then select Save. Copy the workspace name and save it in Notepad or similar for later reference.

    The form is configured as described.

Task 2: Create a resource group in Azure

  1. Log in to the Azure Portal using your Azure credentials.

  2. On the Azure Portal home screen, select the Menu button on the top-left corner (1). Hover over Resource groups (2), then select + Create (3).

    The Create button is highlighted.

  3. On the Create a resource group screen, select your desired Subscription and Region. For Resource group, enter the same name as your Power BI workspace name (such as synapse-in-a-day-demos), then select the Review + Create button.

    The Create a resource group form is displayed populated with Synapse-MCW as the resource group name.

  4. Select the Create button once validation has passed.

Important: Take note of the exact resource group name you provided for the steps that follow.
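
The resource group from this task can also be created from PowerShell. This is a minimal sketch under the same assumptions as any Az PowerShell session (module installed, Connect-AzAccount already run). The name and region shown are placeholders: use your own Power BI workspace name and the region you chose.

```powershell
# Assumes: Az PowerShell module installed and Connect-AzAccount already run.
# Both values are placeholders; the region 'eastus' is only an example.
$resourceGroupName = 'synapse-in-a-day-demos'
$location          = 'eastus'

New-AzResourceGroup -Name $resourceGroupName -Location $location

# Print the exact name so it can be copied for the steps that follow.
(Get-AzResourceGroup -Name $resourceGroupName).ResourceGroupName
```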

Task 3: Create an Azure VM for the deployment scripts

We highly recommend executing the PowerShell scripts on an Azure Virtual Machine instead of your local machine. Doing so avoids issues caused by pre-existing dependencies and, more importantly, network and bandwidth issues while the scripts run.

  1. In the Azure portal, type in "virtual machines" in the top search menu and then select Virtual machines from the results.

    In the Services search result list, Virtual machines is selected.

  2. Select + Add on the Virtual machines page and then select the Virtual machine option.

  3. In the Basics tab, complete the following:

    Field                   Value
    Subscription            Select the appropriate subscription
    Resource group          Select synapse-in-a-day-demos
    Virtual machine name    synapse-lab-setup-vm (or a unique name if not available)
    Region                  Select the resource group's location
    Availability options    Select No infrastructure redundancy required
    Image                   Select Windows 10 Pro, Version 1809 - Gen1
    Azure Spot instance     Set to Unchecked
    Size                    Select Standard_D8s_v3
    Username                Enter labuser
    Password                Enter a password you will remember
    Public inbound ports    Select Allow selected ports
    Select inbound ports    Select RDP (3389)
    Licensing               Select the option to confirm that you have an eligible Windows 10 license with multi-tenant hosting rights

    The form fields are completed with the previously described settings.

  4. Select Review + create. On the review screen, select Create. After the deployment completes, select Go to resource to go to the virtual machine.

    The Go to resource option is selected.

  5. Select Connect from the actions menu and choose RDP.

    The option to connect to the virtual machine via RDP is selected.

  6. On the Connect tab, select Download RDP File.

    Download the RDP file to connect to the Power BI virtual machine.

  7. Open the RDP file and select Connect to access the virtual machine. When prompted for credentials, enter labuser for the username and the password you chose.

    Connect to a remote host.

    Click Yes to connect despite security certificate errors when prompted.

    The Yes button is highlighted.

Task 4: Create Azure Synapse Analytics workspace

  1. Deploy the workspace through the following Azure ARM template (press the button below):

  2. On the Custom deployment form fill in the fields described below.

    • Subscription: Select your desired subscription for the deployment.

    • Resource group: Select the resource group you previously created.

    • Region: The region where your Azure Synapse environment will be created.

      Important: The Region field under 'Parameters' will list the Azure regions where Azure Synapse Analytics is available as of November 2020. This will help you find a region where the service is available without being limited to where the resource group is defined.

    • Unique Suffix: This suffix will be used to name the resources that will be created as part of your deployment. Make sure you follow the Azure resource naming conventions.

    • SQL Administrator Login Password: Provide a strong password for the SQLPool that will be created as part of your deployment. Visit here to read about password rules in place. Your password will be needed during the next steps. Make sure you have your password noted and secured.

  3. Select the Review + create button, then Create. The provisioning of your deployment resources will take approximately 13 minutes. Wait until provisioning successfully completes before continuing. You will need the resources in place before running the scripts below.

    Note: A deployment step related to Role Assignment may fail. This error can safely be ignored.
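
If you prefer to check provisioning from PowerShell rather than watching the portal, the sketch below lists the deployments in the resource group with their state. It assumes the Az PowerShell module and an existing Connect-AzAccount session; the resource group name is a placeholder.

```powershell
# Assumes: Az PowerShell module installed and Connect-AzAccount already run.
$resourceGroupName = 'synapse-in-a-day-demos'   # placeholder: your resource group

# List each deployment with its current state and last update time.
Get-AzResourceGroupDeployment -ResourceGroupName $resourceGroupName |
    Select-Object DeploymentName, ProvisioningState, Timestamp
```

Because of the Role Assignment error described in the note, the reported state may not be Succeeded even when the resources you need are in place.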

Before starting

Steps & Timing

The entire script will take between 1.5 and 2 hours to complete. Major steps include:

Task 1: Pre-requisites

Install these pre-requisites on your deployment VM before continuing.

Task 2: Download artifacts and install PowerShell modules

Perform all of the steps below from your deployment VM:

  1. Open a PowerShell window as an administrator and run the following commands to download the artifacts:

    mkdir c:\labfiles
    
    cd c:\labfiles
    
    git clone https://github.com/ctesta-oneillmsft/asa-vtd.git synapse-in-a-day-deployment

IMPORTANT

Task 3: Execute setup scripts

Perform all of the steps below from your deployment VM:

  1. You will be prompted to set up your Azure PowerShell and Azure CLI context.

  2. If you have more than one Azure Subscription, you will be prompted to enter the name of your desired Azure Subscription. You can copy and paste the value from the list to select one. For example:

    A subscription is copied and pasted into the text entry.

  3. Enter the name of the resource group you created at the beginning of the environment setup (such as synapse-in-a-day-demos). This will make sure automation runs against the correct environment you provisioned in Azure.

    While the automation script runs, you may be prompted to approve installations from the PowerShell Gallery (PSGallery). Approve them so the automation can proceed.

    The Azure Cloud Shell window is displayed with a sample of the output from the preceding command.

    Note: This script will take between 90 and 150 minutes to complete.

Potential errors that you can ignore

You may encounter a few errors and warnings during the script execution. The errors below can safely be ignored:

  1. The following error may occur when creating SQL users and adding role assignments in the dedicated SQL pool, and can safely be ignored: Principal 'xxx@xxx.com' could not be created. Only connections established with Active Directory accounts can create other Active Directory users.

    Error is displayed.

  2. Errors when creating the Synapse notebooks (*.ipynb files) that state Unsupported operation: CreateOrUpdateNotebookResource can safely be ignored.

    Errors are displayed.

  3. Toward the end of the script, you may see the following error. If you do, it can be safely ignored:

    Starting PowerBI Artifact Provisioning
    Invoke-WebRequest : The response content cannot be parsed because the Internet Explorer engine is not available, or Internet Explorer's first-launch configuration is not complete. Specify the UseBasicParsing parameter and try again.
    At C:\labfiles\synapse-in-a-day-deployment\artifacts\environment-setup\solliance-synapse-automation\solliance-synapse-automation. char:15
    + ...   $result = Invoke-WebRequest -Uri $url -Method GET -ContentType "app ...
    +                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        + CategoryInfo          : NotImplemented: (:) [Invoke-WebRequest], NotSupportedException
        + FullyQualifiedErrorId : WebCmdletIEDomNotSupportedException,Microsoft.PowerShell.Commands.InvokeWebRequestCommand
    
    Cannot index into a null array.
    At C:\labfiles\synapse-in-a-day-deployment\artifacts\environment-setup\solliance-synapse-automation\solliance-synapse-automation. char:5
    +     $homeCluster = $result.Headers["home-cluster-uri"]
    +     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        + CategoryInfo          : InvalidOperation: (:) [], RuntimeException
        + FullyQualifiedErrorId : NullArray

Task 4: Configure Power BI dataset credentials

Complete this task after setup has completed.

  1. Sign in to the Power BI Portal using your Azure credentials.

  2. From the hamburger menu, select Workspaces to see the list of workspaces available to you. Select your workspace, which has the same name as your Azure resource group (e.g., synapse-in-a-day-demos).

    The workspaces button from the hamburger menu is selected to list workspaces available.

  3. Select the Settings icon from the top right bar, and select Settings again to navigate to the settings page.

    The settings button on the Power BI portal clicked and the Settings selection on the context menu selected.

  4. Select the Datasets tab to see the list of available datasets, then select the 2-Billion Rows Demo dataset to open its settings. On the settings page, open Data source credentials and select Edit credentials.

    The datasets tab is selected. From the list of datasets 2-Billion Rows Demo is selected. Edit credentials will be selected next.

  5. Select Microsoft Account for the Authentication method and select Sign In to complete the process.

    From the list of authentication methods Microsoft Account is picked. The sign in button is selected.

Task 5: Pause SQL pool

Note:

If you do not plan to use the Synapse workspace environment right away, follow the steps in this task to pause the SQL pool. Otherwise, you may incur significant costs.

  1. Navigate to the resource group into which you deployed this environment.

  2. Select the Dedicated SQL pool (SQLPool01).

    The SQL pool is highlighted.

  3. Select || Pause to pause the pool.

    The pause button is highlighted.
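
The same pause can be done from PowerShell. This is a minimal sketch, assuming the Az.Synapse module is installed, Connect-AzAccount has been run, and the resource group contains exactly one Synapse workspace. The resource group name is a placeholder; SQLPool01 is the pool name given above.

```powershell
# Assumes: Az.Synapse module installed and Connect-AzAccount already run.
$resourceGroupName = 'synapse-in-a-day-demos'   # placeholder: your resource group

# Assumes the resource group holds exactly one Synapse workspace.
$workspace = Get-AzSynapseWorkspace -ResourceGroupName $resourceGroupName

# Pause the dedicated SQL pool created by the deployment.
Suspend-AzSynapseSqlPool -WorkspaceName $workspace.Name -Name 'SQLPool01'

# Show the pool status afterwards.
(Get-AzSynapseSqlPool -WorkspaceName $workspace.Name -Name 'SQLPool01').Status
```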

Task 6: Delete lab setup VM

If you created a virtual machine for this lab setup, you no longer need it.

  1. Open the VM in your Azure resource group, select Delete, then select Yes when prompted.

    The delete button is highlighted.
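
The VM can also be removed from PowerShell. This sketch assumes the Az PowerShell module and an existing Connect-AzAccount session. The resource group name is a placeholder, and the VM name is the one suggested in Task 3; change both if you used different values.

```powershell
# Assumes: Az PowerShell module installed and Connect-AzAccount already run.
$resourceGroupName = 'synapse-in-a-day-demos'   # placeholder: your resource group
$vmName            = 'synapse-lab-setup-vm'     # name suggested when creating the VM

# -Force skips the confirmation prompt.
Remove-AzVM -ResourceGroupName $resourceGroupName -Name $vmName -Force
```

Depending on how the VM was created, its disk, network interface, and public IP address may remain in the resource group and need to be deleted separately.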