datahub-project / datahub

The Metadata Platform for your Data Stack
https://datahubproject.io
Apache License 2.0
9.84k stars 2.9k forks

Tableau ingestion does not skip unpublished sheets #6421

Open YuriyGavrilov opened 1 year ago

YuriyGavrilov commented 1 year ago

Describe the bug

Tableau lets sheets stay unpublished even though they are still used on a dashboard. Several unpublished sheets may be in use on a single dashboard. To avoid duplicates, there should be an option to skip unpublished sheets.

To Reproduce

Steps to reproduce the behavior:

  1. In Tableau, create a dashboard based on a few unpublished sheets.
  2. Run the Tableau ingestion.
  3. Go to the ingestion results.
  4. See all the sheets, even those marked as unpublished in Tableau.

Expected behavior

Unpublished sheets in Tableau should be skipped by DataHub ingestion, either by default or as an option.
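To illustrate the requested option, here is a minimal sketch of the filtering step. It is not DataHub's actual implementation: the `Sheet` record, its `published` flag, and the `skip_unpublished` parameter are all hypothetical names, and it assumes the ingestion source can tell whether a sheet is published.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Sheet:
    # Hypothetical record; not DataHub's or Tableau's actual data model.
    name: str
    published: bool  # assumed flag: False means the sheet is unpublished/hidden


def select_sheets(sheets: Iterable[Sheet], skip_unpublished: bool = True) -> List[Sheet]:
    """Return the sheets to ingest, optionally dropping unpublished ones."""
    if not skip_unpublished:
        return list(sheets)
    return [s for s in sheets if s.published]


sheets = [
    Sheet("Sales overview", published=True),
    Sheet("sheet1", published=False),
    Sheet("sheet2", published=False),
]
print([s.name for s in select_sheets(sheets)])
```

With `skip_unpublished=False` the function returns every sheet, which matches the behavior described in this issue.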

Screenshots

Unpublish flag in Tableau: we see all the sheets and dashboards even when they are flagged as unpublished.


Additional context

In DataHub we see all the sheets and dashboards even when they are flagged as unpublished in Tableau.

YuriyGavrilov commented 5 months ago

@hsheth2 @treff7es Maybe it is time to pick this up?

Today we can have 50 Tableau sheets named sheet1, sheet2, sheet3, ... Some of them are old versions of a single sheet, so only one is published on the server while the others still exist in the workbook but are hidden.

I counted 1369 sheets matching sheet* on our server.
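A count like this can be reproduced from any exported list of sheet names. The sketch below is only an illustration: the `names` list is made-up sample data, `count_default_named` is a hypothetical helper, and matching is done case-insensitively against the `sheet*` pattern.

```python
from fnmatch import fnmatchcase
from typing import Iterable


def count_default_named(names: Iterable[str], pattern: str = "sheet*") -> int:
    """Count names that match the glob pattern, ignoring case."""
    return sum(1 for name in names if fnmatchcase(name.lower(), pattern))


# Made-up sample data standing in for the sheet names on a server.
names = ["sheet1", "Sheet 2", "Sales overview", "sheet10"]
print(count_default_named(names))
```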

It would also be good to fix https://github.com/datahub-project/datahub/issues/10121 so that the full workbook hierarchy is available, making it easy to navigate a workbook and its corresponding sheets, charts, datasets, data sources ...