Closed · amartincolville closed this 1 month ago
Very delayed reaction but @vogelsgesang do you know what's happening here? Has this been fixed?
We do know that the publish method creates 64MB chunks and only commits the data after all chunks are on the server, but it seems that the update_hyper_data does not do this
update_hyper_data uses the exact same upload mechanism as publish. As such, it also uses the same chunking.
there is a setting (api.server.extract-updates.max-size) that introduces a 10MB limitation to the payload
Did you try increasing this limit? After increasing it, your updates should go through.
hey @jacalata the way to fix this is as @vogelsgesang advises, by changing the api.server.extract-updates.max-size parameter, although this only applies if you have Tableau Server on premises. If you have Tableau Online, there is no way to change this.
I'm having the same problem, but I'm using Tableau Online. Is there anything I can do? @vogelsgesang
How large is the data you would like to upload, @felipe-costa-compado?
I think we will have to change the server-side Tableau Online configuration for this. Not sure if it is possible to increase this limit for a specific site or not. Maybe this will require a global change...
I will not be able to do that config change myself, I will have to involve a couple of colleagues. A business justification for this change would help us prioritize this change. To that end: Can you describe your use case? And also on whose behalf you are requesting a solution here (i.e., which Tableau customer/partner/reseller)?
(reopening, so we have this as an open issue on our list)
Hello @vogelsgesang,
Thank you for your reply. We are trying to update a data source by doing a complete replace of it. Historically we have done this using only the TSC.publish() method, so the data source was completely rebuilt every time. This has the disadvantage of removing any changes made on the Tableau side (calculated fields, field aliases, etc.), so we wanted to give TSC.update_hyper_data() a try. We have 2 data sources using Hyper files, one around 2.5 GB and the other about 10 GB. Both upload correctly to Tableau, because we see the log "File upload finished", but afterwards we get the following error: "The file attached exceeded the file size limit of '{0}'."
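For reference, a minimal sketch of this replace-style update through TSC; the server URL, token, datasource id and file name are placeholders, and the schema/table names assume the default "Extract" layout:

```python
import uuid
import tableauserverclient as TSC

# Placeholder connection details -- substitute your own site and token.
server = TSC.Server("https://10ax.online.tableau.com", use_server_version=True)
auth = TSC.PersonalAccessTokenAuth("token-name", "token-secret", site_id="mysite")

with server.auth.sign_in(auth):
    datasource = server.datasources.get_by_id("datasource-luid")

    # A single "replace" action swaps the extract's contents with the rows
    # from the uploaded .hyper file (default "Extract" schema/table assumed).
    actions = [
        {
            "action": "replace",
            "target-schema": "Extract",
            "target-table": "Extract",
            "source-schema": "Extract",
            "source-table": "Extract",
        }
    ]

    job = server.datasources.update_hyper_data(
        datasource,
        request_id=str(uuid.uuid4()),  # idempotency key required by the REST API
        actions=actions,
        payload="updated_data.hyper",  # this uploaded payload is what hits the size limit
    )
    server.jobs.wait_for_job(job)
```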
ok, that use case makes a lot of sense.
Going all the way to 10GB would push the limit pretty far. I am not currently sure why we even have this limit, so maybe we are able to actually increase it that far. I will have to wait for the reply of some other people from Tableau, though, because I am not completely sure what the ins and outs of this limit are.
In the meantime: Any customer name which I should associate with this request? Do you, e.g., already have a separate request with our customer support open on this topic?
The file attached exceeded the file size limit of '{0}'.
This seems like another bug. The {0} is intended to be a placeholder; you should actually see the currently configured limit there...
I didn't open any ticket with support. You can open one in the name of Compado.
Thanks for following up on this - the increase seems like a simple change to an environment variable; hopefully this could be changed for Tableau Online users as well.
I can confirm that this limit makes incremental updates unfeasible for larger datasets. I am trying to insert a delta of about 1GB every day, and uploading that in chunks takes multiple times longer than simply replacing the entire dataset of 15GB (2 hours vs 30 minutes).
However, the documentation states that the payload limit on Tableau Cloud is fixed at 100MB to limit server strain - does this mean that we will not get a fix for this?
To my understanding, this doesn't seem optimal. Incremental updates should leverage the hyper engine to minimize strain on the server. If a query is causing too much strain, the server should allocate the resources accordingly. Making the user send hundreds of small chunks is the opposite of limiting server strain - it maximizes the strain because the hyper engine has to process hundreds of queries without the ability to optimize away redundant workloads.
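For reference, the incremental path described above maps to an "insert" action in the same update_hyper_data call sketched earlier in the thread; a minimal, assumed example of that actions payload (schema/table names again assume the default "Extract" layout):

```python
# One "insert" action appends the rows from an uploaded delta .hyper file to
# the existing extract instead of replacing it wholesale. This list is passed
# as the `actions` argument of datasources.update_hyper_data(), with the delta
# file as `payload` -- each such payload is subject to the size limit discussed here.
actions = [
    {
        "action": "insert",
        "target-schema": "Extract",
        "target-table": "Extract",
        "source-schema": "Extract",
        "source-table": "Extract",
    }
]
```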
Did you find any solution? I am using Python to pick up my Hyper data sources and upload them to Tableau Cloud, but there is one data source that runs once per day and is about 1GB. It is hitting the 8-hour Windows Task Scheduler timeout. We switched to this method because of MFA. This is the only task giving problems; the other data sources are okay, although one or two have seen their update time grow by 15-30 minutes because of the time Python takes to publish.
Hey @amartincolville, do you know how to change this parameter (api.server.extract-updates.max-size)? I've searched the Tableau docs and Google but I can't seem to find a way to change it anywhere. I'm using Tableau Server on-prem. Thank you!
This limit was increased on both Tableau Cloud and Server, to a new default value of 100MB. More detail is in the API docs at https://help.tableau.com/current/api/rest_api/en-us/REST/rest_api_how_to_update_data_to_hyper.htm
@joshuadienye - use tsm to set that variable, same as the ones listed on this page https://help.tableau.com/current/server/en-us/cli_configuration-set_tsm.htm
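For anyone looking for the concrete commands, a minimal sketch of that tsm route on a self-hosted Server is below; the value of 100 assumes the setting is expressed in megabytes, so verify the expected unit for your Server version before applying it.

```
# Run on a Tableau Server node with an administrator account.
tsm configuration set -k api.server.extract-updates.max-size -v 100
tsm pending-changes apply   # applies the queued change; may restart services
```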
Describe the bug
We are currently using Tableau to enable our analysts to extract data and create workbooks and dashboards on data sources. Due to the limitations of the Tableau UI, as well as the desire to integrate this within our architecture, we are using Tableau Server Client and the Hyper API to allow the dynamic creation and refresh of data sources given a set of parameters or arguments. The setup mainly uses the TSC.publish method to create an initial data source with the required historical data and, once this is published, uses the TSC.update_hyper_data method to update smaller amounts of data with a "sliding window" approach. Our current setup works fine with small amounts of data, but when working with larger datasets, which is the most common use case, we stumble upon a Payload Too Large error returned by the Tableau Server. After some research, we have seen that there is a setting (api.server.extract-updates.max-size) that introduces a 10MB limitation on the payload, which in our case is a Hyper file. We do know that the publish method creates 64MB chunks and only commits the data after all chunks are on the server, but it seems that update_hyper_data does not do this, and the restriction is too tight for us to ensure a correct refresh when doing this "sliding window". Is there a way this can be surpassed? Are we missing any concepts, or are we not grasping the whole functionality of this?
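A minimal sketch of the initial publish step in this setup; the server URL, credentials, project id and file names are placeholders:

```python
import tableauserverclient as TSC

# Placeholder connection details -- substitute your own server and credentials.
server = TSC.Server("https://tableau.example.com", use_server_version=True)
auth = TSC.TableauAuth("username", "password", site_id="mysite")

with server.auth.sign_in(auth):
    item = TSC.DatasourceItem(project_id="project-luid", name="sales_history")

    # Initial publish of the full historical extract. Overwrite re-creates the
    # datasource, which is why Tableau-side edits are lost on every republish.
    item = server.datasources.publish(
        item,
        "historical_data.hyper",
        mode=TSC.Server.PublishMode.Overwrite,
    )
    print(item.id)
```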
Versions
Details of your environment, including:

To Reproduce
Call the update_hyper_data method with the latter .hyper file as payload and a replace action.

Results
Job failed
NOTE: Be careful not to post user names, passwords, auth tokens or any other private or sensitive information.