Get ready to enhance your Apache Airflow workflows with a new plugin designed for refreshing Power BI datasets! The plugin provides a custom operator that seamlessly handles dataset refreshes and supports service principal (SPN) authentication. Additionally, the operator checks for an existing refresh before triggering a new one.
To utilize this operator, simply fork the repository and clone it locally, and you're ready to roll.
Before diving in, note that custom connection forms aren't feasible in Apache Airflow plugins, so credentials must be stored in the secret backend. Here's what you need to store:

- `client_id`: The Client ID of your service principal.
- `client_secret`: The Client Secret of your service principal.
- `tenant_id`: The Tenant ID of your service principal.

The operator composes the logic for this plugin. It triggers the Power BI dataset refresh and pushes the details to XCom. It accepts the following parameters:

- `dataset_id`: The dataset ID.
- `group_id`: The workspace ID.
- `wait_for_termination`: (Default: `True`) Wait until the pre-existing or currently triggered refresh completes before exiting.
- `force_refresh`: When enabled, force-refresh the dataset again after any pre-existing ongoing refresh request has terminated.
- `timeout`: Time in seconds to wait for the dataset to reach a terminal status for non-asynchronous waits. Used only if `wait_for_termination` is `True`.
- `check_interval`: Number of seconds to wait before rechecking the refresh status.

The operator pushes the following details to XCom:

- `powerbi_dataset_refresh_id`: Request ID of the dataset refresh.
- `powerbi_dataset_refresh_status`: Refresh status. Possible values:
  - `Unknown`: Refresh state is unknown, or a refresh is in progress.
  - `Completed`: Refresh completed successfully.
  - `Failed`: Refresh failed (details in `powerbi_dataset_refresh_error`).
  - `Disabled`: Refresh is disabled by a selective refresh.
- `powerbi_dataset_refresh_end_time`: The end date and time of the refresh (may be `None` if a refresh is in progress).
- `powerbi_dataset_refresh_error`: Failure error code in JSON format (`None` if no error).

Ready to give it a spin? Check out the sample DAG code below:
```python
from datetime import datetime

from airflow import DAG

from operators.powerbi_refresh_dataset_operator import PowerBIDatasetRefreshOperator

with DAG(
    dag_id='refresh_dataset_powerbi',
    schedule_interval=None,
    start_date=datetime(2023, 8, 7),
    catchup=False,
    concurrency=20,
    tags=['powerbi', 'dataset', 'refresh']
) as dag:
    refresh_in_given_workspace = PowerBIDatasetRefreshOperator(
        task_id="refresh_in_given_workspace",
        dataset_id="<dataset_id>",
        group_id="<workspace_id>",
        force_refresh=False,
        wait_for_termination=False,
    )

    refresh_in_given_workspace
```
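Since the sample above sets `wait_for_termination=False`, the task exits as soon as the refresh is triggered. When it is left at its default of `True`, the operator instead polls until the refresh reaches a terminal status or `timeout` elapses, sleeping `check_interval` seconds between checks. Here's a minimal sketch of that kind of polling loop; the function name and the injected `get_status` callable are illustrative assumptions, not the plugin's actual internals:

```python
import time

# Hypothetical sketch of a wait_for_termination-style polling loop; the real
# operator's implementation may differ.
TERMINAL_STATUSES = ("Completed", "Failed", "Disabled")

def wait_for_terminal_status(get_status, timeout=60, check_interval=5,
                             clock=time.monotonic, sleep=time.sleep):
    """Poll get_status() every check_interval seconds until the refresh
    reaches a terminal status or timeout seconds have elapsed."""
    deadline = clock() + timeout
    while True:
        status = get_status()  # e.g. a GET against the refresh history endpoint
        if status in TERMINAL_STATUSES:
            return status
        if clock() >= deadline:
            raise TimeoutError("Dataset refresh did not reach a terminal status in time")
        sleep(check_interval)
```

Injecting the clock and sleep functions keeps the loop easy to test without real waiting.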
Feel free to tweak and tailor this DAG to suit your needs!
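As noted at the top, the operator checks for an existing refresh before triggering a new one, and `force_refresh` controls what happens when one is found. A toy sketch of that decision follows; the helper name is hypothetical, and it leans on the fact (from the status list above) that Power BI reports an in-progress refresh with the `Unknown` status:

```python
IN_PROGRESS = "Unknown"  # Power BI reports in-progress refreshes as "Unknown"

def should_trigger_new_refresh(latest_status, force_refresh):
    """Return True when a new refresh request should be submitted.

    latest_status is the status of the most recent entry in the dataset's
    refresh history, or None when the history is empty.
    """
    if latest_status == IN_PROGRESS:
        # A refresh is already running: only queue another one when the
        # caller explicitly asked to force it.
        return force_refresh
    # No refresh in flight (history empty, or the last refresh is terminal).
    return True
```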
🌟 Please feel free to share any thoughts or suggestions you have.
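For the curious: the SPN authentication mentioned above is the standard Microsoft Entra ID (Azure AD) client-credentials flow, which is where the three stored credentials come in. A hedged sketch of the token request involved; the function is purely illustrative, not part of the plugin's API:

```python
def build_token_request(tenant_id, client_id, client_secret):
    """Build the Azure AD client-credentials token request for the
    Power BI REST API scope."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    payload = {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        # The .default scope grants the app roles assigned to the SPN.
        "scope": "https://analysis.windows.net/powerbi/api/.default",
    }
    return url, payload
```

The returned URL and form payload would be POSTed (e.g. with `requests.post(url, data=payload)`), and the `access_token` from the JSON response then goes into the `Authorization: Bearer` header of subsequent Power BI REST API calls.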