tableau / server-client-python

A Python library for the Tableau Server REST API
https://tableau.github.io/server-client-python/
MIT License
648 stars 416 forks

Issue with Downloading Large Views from Tableau Server to CSV #1374

Open abhishek7575-spec opened 2 months ago

abhishek7575-spec commented 2 months ago

Describe the bug

I'm encountering an issue when trying to download a view that is around 200 MB in size. My script works perfectly for smaller files (5 MB or 10 MB), but it keeps running indefinitely and doesn't write data into the CSV file for the larger view.

The script successfully logs "CSV populated," indicating that server.views.populate_csv completes, but the file is not written for larger views.

Here's the code I'm using:

def download_file(client, server, content_url, output_file, region):
    # NOTE: Monkey-patched the TSC.RequestOptions.Field class,
    # since the contentUrl filter parameter was not supported
    setattr(TSC.RequestOptions.Field, 'ContentURL', 'contentUrl')

    req_option = TSC.RequestOptions()
    req_option.filter.add(TSC.Filter(TSC.RequestOptions.Field.ContentURL,
                                     TSC.RequestOptions.Operator.Equals,
                                     content_url))

    logger.info('Fetching views...')
    views, pagination_item = server.views.get(req_option)
    logger.info(f'Views fetched: {len(views)}')

    if len(views) == 1:
        view = views[0]  # the only matched view
        logger.info('Populating CSV...')
        server.views.populate_csv(view, TSC.CSVRequestOptions(maxage=5))
        logger.info('CSV populated.')
        with open(output_file, 'wb') as f:
            f.write(b''.join(view.csv))
        logger.info(f'CSV data written to {output_file}')

OUTPUT:

INFO:main:Fetching views...
INFO:main:Views fetched: 1
INFO:main:Populating CSV...
INFO:main:CSV populated.

Could anyone suggest what might be going wrong with handling larger views? Are there any additional steps or considerations when dealing with larger files in Tableau Server that I might be missing?

Any insights or suggestions would be greatly appreciated!

Thank you!

Versions

Details of your environment, including:

jacalata commented 1 month ago

Hm, not sure I've tried this. Most likely the problem is in the library here: the request is timing out, and there's no logic to extend the session. I'll try to reproduce it to see exactly what happens.

abhishek7575-spec commented 1 month ago

@jacalata Thank you for looking into this and for trying to reproduce the issue. Looking forward to hearing back from you.

jorwoods commented 1 month ago

@abhishek7575-spec in v0.30, ContentUrl is absolutely available.

I would also consider changing the line f.write(b''.join(view.csv)). It's a nifty one-liner when the data is small enough to fit comfortably into RAM, but it can cause problems with bigger files, and it makes troubleshooting harder. You could do this, for example:

# inside the `with open(output_file, 'wb') as f:` block
for i, chunk in enumerate(view.csv):
    logger.debug("Writing chunk %s", i)
    f.write(chunk)

That way you get log output showing it's still making progress, and you avoid loading the entire file into RAM at once.
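For reference, the chunked approach above can be sketched end to end without a Tableau Server connection. Here `fake_csv_chunks` is a hypothetical stand-in for `view.csv` (which, after `populate_csv`, yields the CSV content as an iterable of bytes chunks); everything else is stdlib:

```python
import logging
import os
import tempfile

logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger(__name__)

def fake_csv_chunks(n_chunks=3, chunk_size=8):
    """Hypothetical stand-in for view.csv: lazily yields bytes chunks."""
    for i in range(n_chunks):
        yield bytes([65 + i]) * chunk_size  # b'AAAAAAAA', b'BBBBBBBB', ...

def write_csv_chunked(chunks, output_file):
    """Stream chunks to disk one at a time instead of joining them in RAM."""
    total = 0
    with open(output_file, "wb") as f:
        for i, chunk in enumerate(chunks):
            logger.debug("Writing chunk %s (%s bytes)", i, len(chunk))
            f.write(chunk)
            total += len(chunk)
    return total

output_file = os.path.join(tempfile.gettempdir(), "view_export.csv")
written = write_csv_chunked(fake_csv_chunks(), output_file)
print(f"{written} bytes written to {output_file}")
```

In the real script you would pass `view.csv` instead of `fake_csv_chunks()`; the per-chunk debug log is what tells you whether the download is still making progress or has genuinely stalled.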