tableau / server-client-python

A Python library for the Tableau Server REST API
https://tableau.github.io/server-client-python/
MIT License

datasource_id from workbook connections does not match ids from datasources themselves #1194

Open alexepperly opened 1 year ago

alexepperly commented 1 year ago

Describe the bug

After populating workbook connections, the datasource_ids in workbooks.connections.datasource_id don't appear to match with any of the ids from the datasources.id fields in our instance of Tableau Server.

It's totally possible I am being a dunce somehow or that there is some "duh" solution to this, but I can't find it. A few of us at work have tried to figure this out, but we think there might be something buggy going on.

Versions

To Reproduce

After running these two functions (inside a class, hence all the selfs):


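# Module-level imports used by the snippets in this issue
import pandas as pd
import tableauserverclient as TSC

def populate_all_workbooks(self):
    # Fetch every workbook on the site, following pagination
    self.all_workbooks = list(TSC.Pager(self.server.workbooks))

    # Populate all workbooks with connections
    for wb in self.all_workbooks:
        self.server.workbooks.populate_connections(wb)

    # Return the list so callers can assign the result
    return self.all_workbooks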

# Create data frame with all values from workbook.connections
def get_workbook_connection_df(self):
    list_of_dicts = []
    for wb in self.all_workbooks:
        for connection in wb.connections:
            list_of_dicts.append({
                "id": connection.id,
                "Type": connection.connection_type,
                "ServerAddress": connection.server_address,
                "UserName": connection.username,
                "WorkbookId": wb.id,
                "DataSourceId": connection.datasource_id
                })
    return pd.DataFrame(list_of_dicts) 

Results

...Then we take the datasources.id of the most used datasource on our server instance (I got this id directly from the XML with Postman, and it also shows up when querying various ways through TSC). However, the following query yields nothing, zilch, nada:

# Our class is called TM

all_workbooks = TM.populate_all_workbooks()
workbook_connection_df = TM.get_workbook_connection_df()

# This yields no results. We did this many different ways, but this is the most concise version of the query
workbook_connection_df.query("DataSourceId == 'f78d82c3-c11f-49ce-ad8c-41b6e7d0990f'")

To be sure, we have looked for tons and tons of datasources.id's (in many different ways), and NONE of them are showing up in the list of workbooks.connections.datasource_id's that we have.

In fact, we created a dataframe of datasources similar to how we created the workbook_connection_df above, to wit:

def get_datasource_df(self):
    list_of_dicts = []
    for ds in self.all_datasources:
        list_of_dicts.append({
            "id": ds.id,
            "Name": ds.name,
            "Type": ds.datasource_type
            })

    return pd.DataFrame(list_of_dicts)

...and none of the thousands of id's generated above are present when compared to the datasource_id values from workbook.connections.
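For anyone wanting to repeat the comparison above, a minimal sketch of the overlap check follows. It is not the code we actually ran: `id_overlap` is a hypothetical helper, the ids in the sample rows are made up, and the rows are assumed to be the same lists of dicts that the two functions above pass to `pd.DataFrame`. Ids are stripped and lower-cased first to rule out whitespace or casing as the cause of the mismatch.

```python
def id_overlap(connection_rows, datasource_rows):
    """Return (shared, only_in_connections, only_in_datasources) as sets of ids.

    connection_rows: dicts with a "DataSourceId" key (from workbook connections).
    datasource_rows: dicts with an "id" key (from the datasources endpoint).
    """
    def norm(value):
        # Normalize so stray whitespace or casing cannot hide a real match
        return str(value).strip().lower()

    conn_ids = {norm(r["DataSourceId"]) for r in connection_rows if r.get("DataSourceId")}
    ds_ids = {norm(r["id"]) for r in datasource_rows if r.get("id")}
    return conn_ids & ds_ids, conn_ids - ds_ids, ds_ids - conn_ids


# Made-up sample rows, for illustration only
connection_rows = [
    {"DataSourceId": "AAA-1 "},
    {"DataSourceId": "bbb-2"},
    {"DataSourceId": None},  # connections without a datasource id are skipped
]
datasource_rows = [{"id": "aaa-1"}, {"id": "ccc-3"}]

shared, only_conn, only_ds = id_overlap(connection_rows, datasource_rows)
print(len(shared), len(only_conn), len(only_ds))
```

An empty `shared` set on real data is the symptom described in this issue; with existing DataFrames, the rows can be recovered with `df.to_dict("records")`.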

We have tried doing this a ton of different ways, editing our classes, scripts, putting the values into SQL and querying/joining tables that way, and nothing seems to work.

Final TL;DR Version

The workbooks.connections.datasource_id property doesn't seem to match datasources.id.

(Or I'm doing something wrong.)

Thanks in advance for all your help!

jacalata commented 1 year ago

It's not you - that relationship is very strange and the ids do not represent the same thing. There are some (ugly) workarounds described in this open issue: https://github.com/tableau/server-client-python/issues/825
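The linked issue is not reproduced here. As a rough sketch of what a Metadata API based lookup might look like (an assumption, not necessarily the workaround described there), the snippet below asks for each workbook's upstream published datasources and flattens the answer into rows. The GraphQL field names (`luid`, `upstreamDatasources`) and the `{"data": ...}` response shape are assumptions to verify against your server's Metadata API schema; `flatten_lineage` and `fetch_lineage` are hypothetical helpers, and the sample response is made up.

```python
# Assumed GraphQL field names; check them against your server's schema.
QUERY = """
{
  workbooks {
    luid
    name
    upstreamDatasources {
      luid
      name
    }
  }
}
"""


def flatten_lineage(response):
    """Turn a Metadata API style response dict into one row per workbook/datasource pair."""
    rows = []
    data = response.get("data") or {}
    for wb in data.get("workbooks") or []:
        for ds in wb.get("upstreamDatasources") or []:
            rows.append({
                "WorkbookId": wb.get("luid"),
                "WorkbookName": wb.get("name"),
                "DataSourceId": ds.get("luid"),
                "DataSourceName": ds.get("name"),
            })
    return rows


def fetch_lineage(server):
    """server is assumed to be a signed-in TSC.Server with the Metadata API enabled."""
    return flatten_lineage(server.metadata.query(QUERY))


# Made-up response, for illustration only (no server call is made here)
sample = {"data": {"workbooks": [
    {"luid": "wb-1", "name": "Sales",
     "upstreamDatasources": [{"luid": "ds-1", "name": "Orders"}]},
    {"luid": "wb-2", "name": "Empty", "upstreamDatasources": []},
]}}
rows = flatten_lineage(sample)
print(rows)
```

If the schema assumption holds, the `DataSourceId` values in these rows should be comparable to `datasources.id`, unlike `connection.datasource_id`.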

alexepperly commented 1 year ago

> It's not you - that relationship is very strange and the ids do not represent the same thing. There are some (ugly) workarounds described in this open issue: #825

Thanks for the update. Working on getting access to the Metadata API.

Having said that, are there any plans to rectify this in future releases of TSC?

Rohit2101991TU commented 1 year ago

Yes, I observed the same. The datasource id reported for a datasource attached to a workbook is different from the id of the same datasource when it is looked up by name rather than by id. For example, say DS1 (which has 2 database connections in it) is connected to a workbook W. When we populate connections for W, we see DS1 with a different datasource id (d1) and connection id (c1). Now, when we look up DS1 by name:

all_datasource_items, pagination_item = server.datasources.get()
datasource_id = [ds.id for ds in all_datasource_items if ds.name == Tableau_datasource_name]

we see that DS1 has a different datasource id, and the 2 database connections within DS1 have connection ids (c2 and c3) that are different from c1. So in short, reference a datasource by its name and not by its datasource id, as the id is different.
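A minimal sketch of the name-based lookup, under a few assumptions: each workbook connection row carries a `DataSourceName` column (for instance filled from `connection.datasource_name`, which is not part of the data frames shown earlier in this thread), and `index_by_name` and `resolve_by_name` are hypothetical helpers. Datasource names are not guaranteed to be unique, so the sketch keeps every candidate id and flags ambiguous names instead of picking one.

```python
from collections import defaultdict


def index_by_name(datasource_rows):
    """Map each datasource name to the list of published datasource ids using it."""
    index = defaultdict(list)
    for ds in datasource_rows:
        index[ds["Name"]].append(ds["id"])
    return dict(index)


def resolve_by_name(connection_rows, name_index):
    """Attach candidate published datasource ids to each connection row, matched by name."""
    resolved = []
    for row in connection_rows:
        candidates = name_index.get(row["DataSourceName"], [])
        resolved.append({
            **row,
            "PublishedDataSourceIds": candidates,
            "Ambiguous": len(candidates) > 1,  # same name used by several datasources
        })
    return resolved


# Made-up sample rows, for illustration only
datasource_rows = [
    {"id": "ds-1", "Name": "DS1"},
    {"id": "ds-2", "Name": "Shared"},
    {"id": "ds-3", "Name": "Shared"},
]
connection_rows = [
    {"WorkbookId": "wb-1", "DataSourceId": "d1", "DataSourceName": "DS1"},
    {"WorkbookId": "wb-1", "DataSourceId": "d9", "DataSourceName": "Shared"},
    {"WorkbookId": "wb-2", "DataSourceId": "d7", "DataSourceName": "Missing"},
]

resolved = resolve_by_name(connection_rows, index_by_name(datasource_rows))
for row in resolved:
    print(row["DataSourceName"], row["PublishedDataSourceIds"], row["Ambiguous"])
```

Rows flagged as ambiguous would need an extra key, such as the project, to disambiguate; that part is left out of the sketch.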