bryanthowell-tableau / tableau_tools

Package containing Tableau REST API, XML modification, tabcmd and repository tools

Create a new extract data source from scratch #50

ElPincheTopo closed this issue 4 years ago

ElPincheTopo commented 5 years ago

I know this has been asked before in different ways, but the previous issues don't really solve my problem. I have a Python project that runs my ETL process and sets up Postgres tables that Tableau can use. After we create each table, we have to manually open Tableau Desktop, create a new data source that points to the new table, set the data source to be an extract, and then publish it to Tableau Server, where we schedule the refreshes. I'm trying to automate the manual steps of creating the data sources in Tableau Desktop and publishing them to the server.

Following the docs and looking at the code I was able to do this:

new_tableau_file = TableauFile("test.tds", logger_obj=None, create_new=True, ds_version=u'10.5')
new_tableau_document = new_tableau_file.tableau_document
dses = new_tableau_document.datasources
ds = dses[0]
ds.add_new_connection(ds_type=u'postgres', server=hostname, db_or_schema_name=u'db_name')
conn = ds.connections[0]
conn.port = port
conn.username = username
ds.set_first_table(db_table_name=table_name, table_alias=table_alias, connection=conn.connection_name)
ds.add_column_alias('person_id', caption='Person ID', dimension_or_measure='dimension', discrete_or_continuous='discrete', datatype='integer', calculation=None)
ds.add_column_alias('name', caption='Name', dimension_or_measure='dimension', discrete_or_continuous='discrete', datatype='string', calculation=None)
new_tableau_document.save_file(datasource_name)

This successfully creates a live connection to the table in my database. Now I want to create a data source that is an extract and that can be refreshed on a schedule on the server. I tried adding a call to create_extract(extract_name) just before saving the file, but save_file then fails with the following error:

Traceback (most recent call last):
  File "datasources.py", line 30, in create_tds
    new_tableau_document.save_file(datasource_name)
  File "/.virtualenv/lib/python3.7/site-packages/tableau_tools/tableau_documents/tableau_datasource.py", line 398, in save_file
    ds_string = self.get_datasource_xml()
  File "/.virtualenv/lib/python3.7/site-packages/tableau_tools/tableau_documents/tableau_datasource.py", line 356, in get_datasource_xml
    new_xml.remove(l)
TypeError: remove() argument must be xml.etree.ElementTree.Element, not None

I'm not sure if I'm doing something wrong or if what I want to do is not supported by the library. Could you help me out with this?
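For readers hitting the same traceback: the last frame shows that remove() was handed None instead of an element. With the standard library's xml.etree.ElementTree, one common way this happens is passing the result of find() straight to remove() when no matching element exists. Whether that is exactly what happens inside get_datasource_xml is an assumption; the sketch below only reproduces the general failure mode and a guarded alternative, using a made-up XML snippet and a hypothetical helper named remove_if_present.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified XML; not a real Tableau data source.
root = ET.fromstring("<datasource><connection/></datasource>")


def remove_if_present(parent, tag):
    """Remove the first direct child with the given tag.

    Returns True if a child was removed, False if none was found.
    """
    child = parent.find(tag)
    if child is None:
        return False
    parent.remove(child)
    return True


# Unguarded version: find() returns None for a missing tag, and
# remove(None) raises (TypeError with the C accelerator, ValueError
# with the pure-Python implementation).
try:
    root.remove(root.find("extract"))
    unguarded_error = None
except (TypeError, ValueError) as exc:
    unguarded_error = exc

# Guarded version: a missing element is simply skipped.
removed_missing = remove_if_present(root, "extract")
removed_existing = remove_if_present(root, "connection")
```

The guard makes the "nothing to remove" case explicit instead of letting it surface as an exception at save time.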

tableaukun commented 5 years ago

I've had the same issue. An option that works for me is to create a template Tableau file with a generic query/table and then alter it using Python. Documented here.
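A minimal sketch of the template idea, using only the Python standard library rather than the tableau_tools API. It assumes a template data source whose connection and relation elements look like the simplified XML below; the element layout, attribute names, and the retarget_template helper are illustrative assumptions, and a real template saved from Tableau Desktop (especially one containing an extract) will have more elements and is typically packaged differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for a template .tds file.
TEMPLATE_TDS = """<datasource formatted-name='template' inline='true' version='10.5'>
  <connection class='federated'>
    <named-connections>
      <named-connection caption='localhost' name='postgres.template'>
        <connection class='postgres' dbname='template_db' port='5432'
                    server='localhost' username='template_user' />
      </named-connection>
    </named-connections>
    <relation connection='postgres.template' name='template_table'
              table='[public].[template_table]' type='table' />
  </connection>
</datasource>
"""


def retarget_template(tds_xml, server, port, dbname, username, schema, table):
    """Return a copy of the template XML pointed at a different table."""
    root = ET.fromstring(tds_xml)

    # Update every Postgres connection element with the new details.
    for conn in root.iter("connection"):
        if conn.get("class") == "postgres":
            conn.set("server", server)
            conn.set("port", str(port))
            conn.set("dbname", dbname)
            conn.set("username", username)

    # Point every table relation at the new schema and table.
    for relation in root.iter("relation"):
        if relation.get("type") == "table":
            relation.set("name", table)
            relation.set("table", "[{}].[{}]".format(schema, table))

    return ET.tostring(root, encoding="unicode")


# Example values are placeholders, not taken from the thread.
new_tds = retarget_template(
    TEMPLATE_TDS, "db.example.com", 5432, "analytics", "etl_user", "public", "person"
)
```

The result is a string that could be written to a new file and published; the extract settings themselves stay whatever they were in the template, which is the point of creating it once in Desktop.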

bryanthowell-tableau commented 4 years ago

tableaukun's solution is the better one. The "create_extract" functionality proved to be more trouble than it was worth, and it depended on the Extract APIs, which cannot be installed as regular requirements.

In the new 5.X release, the "create_extract" method has been deprecated, but all the manipulation methods remain if you need to change data source connection info or even the filters on the extract (after it has been created in Desktop in a template file).