RuiRomano / pbimonitor

MIT License
221 stars 83 forks source link

Intro

For more information please watch the session PBI Monitoring 101 (slides here)

This project aims to provide a solution to collect activity & catalog data from your Power BI tenant using powershell scripts and a Power BI templates to analyse all this data.

You can deploy the powershell scripts in two ways:

Requirements

Ensure you have the propper permissions

Create a Service Principal & Security Group

[!NOTE]
Azure Active Directory is now call Entra ID.

On Azure Active Directory:

  1. Go to "App Registrations" select "New App" and leave the default options
  2. Generate a new "Client Secret" on "Certificates & secrets" and save the Secret text
  3. Save the App Id & Tenant Id on the overview page of the service principal
  4. Create a new Security Group on Azure Active Directory and add the Service Principal above as member
  5. Optionally add the following API Application level permissions on "Microsoft Graph" API with Administrator grant to get the license & user info data:

    • User.Read.All
    • Directory.Read.All

      image image

      image

Authorize the Service Principal on PowerBI Tenant

As a Power BI Administrator go to the Power BI Tenant Settings and authorize the Security Group on the following tenant settings:

image

API's Used

Scope Resource API
Activity Power BI Activity Logs Admin API - Activity Events
Power BI Metadata Workspaces,DataSets,Reports,Dashboards,Permissions,Schema & Lineage Admin Scan API – GetModifiedWorkspaces; Admin Scan API – PostWorkspaceInfo; Admin Scan API – GetScanStatus (loop); Admin Scan API – GetScanResult
RefreshHistory Dataset Refresh History Admin API - GetGroupsAsAdmin + Expand DataSets; Dataset API - Get Refresh History
Users & Licenses Users & Licenses; Licenses Details Graph API – Users;Graph API – SubscribedSKUs
Tenant Settings Current Fabric Tenant Settings Fabric REST APIs / Tenants / Tenants - Get Tenant Settings



Setup - As an Azure Function

image *Fabric API used for Tenant settings but does not require any other permissions for your Service Principal

On an Azure Subscription create a resource group:

image

All the resources should be created in the same region as the Power BI Tenant, to see the region of the Power BI tenant go to the About page on powerbi.com:

image

Inside the Resource Group start a Function App Creation Wizard

image

Basics

image

Hosting

[!NOTE]
On a large Power BI tenant a dedicated plan might be needed because on consumption the functions have a 10 minute timeout. Learn more about timeouts here and how to extend the timeout configuration host.json here.

image

Monitoring

image

In the end the resource group shall have the following resources:

image

To deploy the Azure Function code you need to deploy the AzureFunction.zip package:

image

Open the Azure Function page, go to "Advanced Tools" and click "Go ➔" This will take you to a page called "Kudu Services"

image

Go to "Tools" -> "Zip Push Deploy" and drag & drop the file AzureFunction.zip:

image
image

Confirm if the deploy was successful:

image

Azure Function Configuration

Go back to the Azure Function page and click on "Configuration", and manually add the following settings:

Setting Value Description
PBIMONITOR_StorageConnStr Optional, only if you want to store data in a different storage from the Storage Account (setting 'AzureWebJobsStorage')
PBIMONITOR_AppDataPath C:\home\data\pbimonitor Path to AppData in Azure Function Disk, its where the state file is stored
PBIMONITOR_ScriptsPath C:\home\site\wwwroot\Scripts Path to scripts on Azure Function Disk
PBIMONITOR_ServicePrincipalId [YOUR SERVICE PRINCIPAL ID]
PBIMONITOR_ServicePrincipalSecret [YOUR SERVICE PRINCIPAL SECRET]
PBIMONITOR_ServicePrincipalTenantId [YOUR TENANT ID]
PBIMONITOR_ServicePrincipalEnvironment Public Power BI Cloud Environment
PBIMONITOR_StorageContainerName pbimonitor Name of the blob storage container
PBIMONITOR_StorageRootPath raw Path on the storage container
PBIMONITOR_FullScanAfterDays 30 Number of Days to repeat a full scan - Optimization to avoid reading too many scanner files on the Power BI Dataset
PBIMONITOR_CatalogGetModifiedParameters Optional, default: 'excludePersonalWorkspaces=false&excludeInActiveWorkspaces=true'
PBIMONITOR_CatalogGetInfoParameters Optional, default: 'lineage=true&datasourceDetails=true&getArtifactUsers=true&datasetSchema=false&datasetExpressions=false'
PBIMONITOR_GraphExtractGroups false Optional, if 'true' it will extract the members of the security groups to expand artifact permissions.

image image

Enable Azure Azure Key Vault (Optional)

Its possible to store the Service Principal secret in Azure Key Vault, see the documentation for more detail:

Create a system assigned managed identity for your Azure function:

image

Create your secrets in Azure Key Vault:

image

Add access policy for you system assigned managed identity created in your Azure function:

image

Grant "Get" under "Secret Permissions":

image

Reference your KeyVault on the Application Settings of Azure Function:

Setting Value
PBIMONITOR_ServicePrincipalId @Microsoft.KeyVault(VaultName=myvault;SecretName=appid)
PBIMONITOR_ServicePrincipalSecret @Microsoft.KeyVault(VaultName=myvault;SecretName=pbilog)
PBIMONITOR_ServicePrincipalTenantId @Microsoft.KeyVault(VaultName=myvault;SecretName=tenantid)

image

Azure Function Time Triggers

The Azure Function has 4 time trigger functions enabled by default:

Function Default Execution Description
AuditsTimer Everyday at 2AM Fetches activity data from the Actitivy API
CatalogTimer Everyday at 1AM Fetches metadata from the tenant: workspaces, datasets, reports,data sources
DatasetRefreshTimer Everyday at 5AM Fetches the refresh history of all datasets in workspaces where the service principal is a Member
GraphTimer Everyday at 4AM Fetches the User & License information from Graph API
TenantSettingsTimer Everyday at 4am Fetches Tenant Setting data from Fabric API

The function should be ready to run, go to the function page and open the “AuditsTimer” and Run it:

image image

Change Azure Function Time Trigger

Its possible to change the time of the trigger by changing the 'function.json' file for each timer using App Service Editor:

image

Or editing the timer integration:

image

Force a Full Scan

On large tenants you may run into memory issues reading all the data from a Power BI Dataset.

The PowerQuery of the PowerBI template was optimized to only read the scan files since the last full scan and the Azure Function setting 'PBIMONITOR_FullScanAfterDays' ensure a full scan will be executed every X days.

Its also possible to force a full scan by editing the State file (C:\home\data\pbimonitor\state.json) using Kudo

image

And remove properties: Catalog.LastRun, Catalog.LastFullScan (if exists)

image

Storage Account

If you dont want to use the built-in storage account of the Azure Function its possible to connect the Azure Function to another storage account by setting the connection string of the storage account in the configuration property: 'PBIMONITOR_StorageConnStr'

Power BI Report Template

Open the Power BI Report template PBI - Activity Monitor and set the parameters:

Change the parameter "DataLocation" and write the blob storage name:

image

You'll also need to copy the Access key from the Azure Portal:

image

And then paste it in the "Account key" box in the Azure Blob Storage credentials, which can be found in the Data Source Settings in Power BI Desktop:

image

image

image

image

Incremental Refresh

By default the Power BI template will read all the activity files from the storage account, but those files are not updatable and a possible optimization is to enable Incremental Refresh on the Activity Table.

The template is already prepared to support Incremental Refresh and filter only the new files, there is already a RangeStart & RangeEnd parameter:

image

image

On the "Activity" table, enable Incremental Refresh with the desired configuration:

image

The Dataset refresh should be significantly faster in the service after this configuration.

Setup - As a Local PowerShell

image

Install Required PowerShell Modules (as Administrator)

Install-Module -Name MicrosoftPowerBIMgmt -RequiredVersion 1.2.1026

Configuration file

Open the Config File and write the saved properties from the Service Principal:

image

Run

The file Fetch - Run is the entry point to call the other scripts.

Ensure Fetch - Run is targeting the proper configuration file (parameter "configFilePath") and you can also control which scripts are executed on the parameter $scriptsToRun

Run Fetch - Run

Open the Power BI Report Template

Open the Power BI Template file and change the parameter "DataLocation" to the data folder.