ndrluis opened this issue 8 months ago
I ran a test using version 0.34.62 and encountered the same error.
I added a partition:

```yaml
time_dimension: datetime_created_at
granularity: day
partition_granularity: month
```
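To give the full picture, the whole pre-aggregation definition looks roughly like the sketch below, in Cube's YAML model format. The cube, table, and measure names (`events`, `count`) are hypothetical placeholders; only the time dimension and the granularity settings are from my actual setup.

```yaml
cubes:
  - name: events
    sql_table: events
    measures:
      # Hypothetical measure, stands in for whatever the real cube aggregates
      - name: count
        type: count
    dimensions:
      - name: datetime_created_at
        sql: datetime_created_at
        type: time
    pre_aggregations:
      - name: events_by_day
        measures:
          - CUBE.count
        # These three settings are the ones from my test
        time_dimension: CUBE.datetime_created_at
        granularity: day
        partition_granularity: month
```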
The smallest partition output is 652 MB; it takes 7 seconds to read but took 8 minutes to write. The other partitions produce around 2 GB of output each, and for those I'm hitting the 10-minute timeout error. I understand that using partitions is recommended, but 8 minutes to write a 652 MB partition seems excessively long.
We've discussed this in Slack, but I'm leaving a comment here for transparency: it's worth investigating what is happening in this particular use case, but it looks like supporting an export bucket for Trino would be the best solution here.
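For reference, with data sources that already support export buckets (Snowflake or BigQuery, for example), the feature is enabled entirely through environment variables along the lines of the sketch below; the bucket name and credentials are placeholders. Trino has no equivalent today, which is exactly the gap:

```
# Hypothetical S3 export bucket configuration; all values are placeholders
CUBEJS_DB_EXPORT_BUCKET_TYPE=s3
CUBEJS_DB_EXPORT_BUCKET=my-export-bucket
CUBEJS_DB_EXPORT_BUCKET_AWS_KEY=<aws-key>
CUBEJS_DB_EXPORT_BUCKET_AWS_SECRET=<aws-secret>
CUBEJS_DB_EXPORT_BUCKET_AWS_REGION=us-east-1
```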
Problem
Cube Version: v0.34.50
Hello, I’m trying to use Cube with Trino and pre-aggregate some data. I have a table whose output is 10 GB, and my refresh worker fails with “Query execution timeout after 10 minutes of waiting”. For some reason the connection stays open, and the Cube worker starts a new query without terminating the previous attempts.
I believe there is a bug in the Cube + Trino integration that leaves the connection open when an error occurs.
However, my main problem is that 10 GB seems too small to require this much time: the scan itself finishes in 2 minutes, but writing the pre-aggregations takes more than 10 minutes, which is too slow. How can I improve this performance?
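For what it's worth, I know the 10-minute limit itself can be raised via Cube's query timeout setting, as sketched below, but that would only hide the slow write rather than fix it (the 30m value is just an example):

```
# Default is 10m; raising it masks the slow write, it doesn't fix it
CUBEJS_DB_QUERY_TIMEOUT=30m
```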
Logs:
Related Cube.js schema: