Nike-Inc / brickflow

Pythonic Programming Framework to orchestrate jobs in Databricks Workflow
https://engineering.nike.com/brickflow/
Apache License 2.0
183 stars, 36 forks

TableauRefreshABCOperator, TableauRefreshDataSourceOperator and TableauRefreshWorkBookOperator #91

Closed maxim-mityutko closed 5 months ago

maxim-mityutko commented 6 months ago

Closes #90. Triggers refreshes of Tableau objects through TableauRefreshDataSourceOperator and TableauRefreshWorkBookOperator.

Description

The TableauWrapper class provides the general functionality required to authenticate with the Tableau server, identify the object(s) that require a refresh based on user-friendly parameters (e.g. data source name) instead of IDs, and asynchronously trigger the refresh job. This code can also be executed on its own, outside of the operators.
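To illustrate the name-based lookup described above, here is a minimal sketch. It is not the TableauWrapper code from this PR: `FakeTableauClient`, `resolve_ids` and `trigger_refreshes` are hypothetical names, and the in-memory client only stands in for a real Tableau server connection.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class DataSource:
    id: str
    name: str
    project: str


class FakeTableauClient:
    """Hypothetical in-memory stand-in for an authenticated Tableau connection."""

    def __init__(self, data_sources: List[DataSource]) -> None:
        self._data_sources = data_sources
        self.jobs: Dict[str, str] = {}  # job id -> data source id

    def list_data_sources(self) -> List[DataSource]:
        return list(self._data_sources)

    def refresh(self, data_source_id: str) -> str:
        # Returns a job id right away; a real server would run the refresh in the background.
        job_id = f"job-{len(self.jobs) + 1}"
        self.jobs[job_id] = data_source_id
        return job_id


def resolve_ids(
    client: FakeTableauClient, names: List[str], project: Optional[str] = None
) -> Dict[str, str]:
    """Map user-friendly data source names to server-side IDs."""
    wanted = set(names)
    found = {
        ds.name: ds.id
        for ds in client.list_data_sources()
        if ds.name in wanted and (project is None or ds.project == project)
    }
    missing = wanted - found.keys()
    if missing:
        raise LookupError(f"Data sources not found: {sorted(missing)}")
    return found


def trigger_refreshes(
    client: FakeTableauClient, names: List[str], project: Optional[str] = None
) -> Dict[str, str]:
    """Trigger one refresh per resolved data source and return name -> job id."""
    ids = resolve_ids(client, names, project)
    return {name: client.refresh(ds_id) for name, ds_id in ids.items()}


client = FakeTableauClient(
    [DataSource("ds-1", "sales", "finance"), DataSource("ds-2", "stock", "ops")]
)
jobs = trigger_refreshes(client, ["sales"], project="finance")
```

The caller only deals with names and projects; the IDs and job handles stay inside the wrapper layer, which is what makes the approach friendlier than passing raw object IDs around.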

TableauRefreshABCOperator introduces the abstract Airflow operator that handles result parsing.

TableauRefreshDataSourceOperator / TableauRefreshWorkBookOperator handle refreshes of the Tableau data sources and workbooks respectively.
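The split between an abstract operator that parses results and concrete operators per object type could look roughly like the sketch below. It deliberately avoids importing Airflow, and every class name, the result dictionary shape and the `"Success"` status string are assumptions made for illustration, not the code in this PR.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class RefreshOperatorSketch(ABC):
    """Hypothetical base class: subclasses trigger refreshes, the base parses results."""

    def __init__(self, names: List[str], fail_on_error: bool = True) -> None:
        self.names = names
        self.fail_on_error = fail_on_error

    @abstractmethod
    def _refresh(self) -> List[Dict[str, str]]:
        """Trigger the refreshes and return one result dict per object."""

    def execute(self) -> List[Dict[str, str]]:
        results = self._refresh()
        # Shared result parsing: collect every object whose refresh did not succeed.
        failed = [r["name"] for r in results if r["status"] != "Success"]
        if failed and self.fail_on_error:
            raise RuntimeError(f"Refresh failed for: {failed}")
        return results


class DataSourceRefreshSketch(RefreshOperatorSketch):
    def _refresh(self) -> List[Dict[str, str]]:
        # A real operator would call the Tableau wrapper here.
        return [{"name": n, "kind": "datasource", "status": "Success"} for n in self.names]


class WorkbookRefreshSketch(RefreshOperatorSketch):
    def _refresh(self) -> List[Dict[str, str]]:
        return [{"name": n, "kind": "workbook", "status": "Success"} for n in self.names]


results = WorkbookRefreshSketch(["weekly-kpis"]).execute()
```

Keeping the result handling in the base class means the two concrete operators differ only in which Tableau object type they refresh.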

Related Issue

https://github.com/Nike-Inc/brickflow/issues/90

Motivation and Context

On many occasions the result of a data pipeline is consumed in a dashboard. Triggering the refresh of Tableau objects via these operators simplifies and streamlines that task.

How Has This Been Tested?

A custom wheel file was used instead of the Brickflow release from PyPI. A workflow was created with the 2 operators, which connected to the Tableau server and triggered the required payloads. In addition, unit tests cover the TableauWrapper class, which contains the majority of the logic.


asingamaneni commented 6 months ago

Looks like the tests failed

maxim-mityutko commented 6 months ago

Yeah, I'm looking into it. The tests were successful on my machine, but after I dropped the env and recreated it, I'm getting an error as well.

maxim-mityutko commented 6 months ago

@asingamaneni the tests are now passing locally. I had to limit the allowed pytest version to < 8.0.0.

Prior to v8, test_brickflow and test_plugins.py were resolved and executed by pytest at the beginning of the test session. On v8 they ran last, and test_plugins.py was failing.

I did a little bit of digging and noticed that they are failing due to this:

{
    'default': <brickflow.engine.task.DefaultBrickflowTaskPluginImpl object at 0x10fd820e0>, 
    '4559328096': <brickflow_plugins.airflow.brickflow_task_plugin.AirflowOperatorBrickflowTaskPluginImpl object at 0x10fd82080>, 
    'airflow-plugin': <brickflow_plugins.airflow.brickflow_task_plugin.AirflowOperatorBrickflowTaskPluginImpl object at 0x10fd82380>
}

The AirflowOperatorBrickflowTaskPluginImpl was loaded multiple times. This behaviour was consistent on my branch and on main. It seems that the numbered instance (4559328096 in the above example) was created by one of the tests that ran prior to test_plugins.
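One plausible mechanism for the numbered key, sketched below, is a registry that falls back to `str(id(plugin))` when a plugin is registered without a name (to my understanding pluggy behaves this way, but whether that is what happens here is an assumption). `Registry` and `AirflowPluginSketch` are hypothetical stand-ins, not Brickflow code.

```python
from typing import Dict, Optional


class Registry:
    """Hypothetical plugin registry keyed by name, or by object id when unnamed."""

    def __init__(self) -> None:
        self.plugins: Dict[str, object] = {}

    def register(self, plugin: object, name: Optional[str] = None) -> str:
        # Without an explicit name, the key is the object's id as a string.
        key = name or str(id(plugin))
        self.plugins[key] = plugin
        return key


class AirflowPluginSketch:
    """Stand-in for the Airflow task plugin implementation."""


registry = Registry()

# An earlier test registers the plugin anonymously...
anonymous_key = registry.register(AirflowPluginSketch())

# ...and a later code path registers it again under its proper name.
registry.register(AirflowPluginSketch(), name="airflow-plugin")

# The same class now appears twice, once under a numeric key.
duplicate_classes = [type(p).__name__ for p in registry.plugins.values()]
```

If this is indeed the cause, the result depends on which tests run before test_plugins, which would explain why a change in test ordering between pytest versions surfaces the failure.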

maxim-mityutko commented 6 months ago

@asingamaneni @stikkireddy can you please retrigger the workflow, or let me know if anything else should be changed or added?

codecov[bot] commented 5 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 90.32%. Comparing base (548b6cb) to head (a6db2ae). Report is 4 commits behind head on main.

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main      #91      +/-   ##
==========================================
+ Coverage   90.27%   90.32%   +0.05%
==========================================
  Files          22       22
  Lines        3361     3381      +20
==========================================
+ Hits         3034     3054      +20
  Misses        327      327
```
