prestodb / presto

The official home of the Presto distributed SQL query engine for big data
http://prestodb.io
Apache License 2.0

[presto 0.206] Exception while running TPCH queries with 500GB data. #11232

Open ajantha-bhat opened 6 years ago

ajantha-bhat commented 6 years ago

Hi, I am running the TPC-H queries on 500 GB of data on a 3-node cluster (each node has 150 GB of query memory and a 48-core CPU). I am using my own CarbonData connector with Presto.

I am using Presto 0.206.

I got the exception below for 5 of the 22 queries. Is anyone familiar with this call stack? What is the workaround?

java.lang.IllegalArgumentException: Too large (897278064 expected elements with load factor 0.75)
    at it.unimi.dsi.fastutil.HashCommon.arraySize(HashCommon.java:160)
    at com.facebook.presto.operator.PagesHash.<init>(PagesHash.java:63)
    at com.facebook.presto.operator.JoinHashSupplier.<init>(JoinHashSupplier.java:70)
    at com.facebook.presto.operator.PagesIndex.createLookupSourceSupplier(PagesIndex.java:512)
    at com.facebook.presto.operator.HashBuilderOperator.buildLookupSource(HashBuilderOperator.java:589)
    at com.facebook.presto.operator.HashBuilderOperator.finishInput(HashBuilderOperator.java:486)
    at com.facebook.presto.operator.HashBuilderOperator.finish(HashBuilderOperator.java:442)
    at com.facebook.presto.operator.Driver.processInternal(Driver.java:393)
    at com.facebook.presto.operator.Driver.lambda$processFor$8(Driver.java:282)
    at com.facebook.presto.operator.Driver.tryWithLock(Driver.java:672)
    at com.facebook.presto.operator.Driver.processFor(Driver.java:276)
    at com.facebook.presto.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:973)
    at com.facebook.presto.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:162)
    at com.facebook.presto.execution.executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:477)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

sopel39 commented 6 years ago

Could you post the EXPLAIN output of the query? It seems that there are too many rows for each HashBuilderOperator instance. You could try increasing the number of nodes or increasing the task concurrency.
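
To make the "too many rows" diagnosis concrete, here is a minimal Java sketch of the size check that the exception message points to. It is a paraphrase written for illustration, not the actual fastutil source: the class name `HashSizeCheck`, the constant `MAX_ARRAY_SIZE`, and the 2^30 slot limit are assumptions inferred from the message and the `HashCommon.arraySize` frame in the stack trace.

```java
class HashSizeCheck {
    // Assumed limit for this sketch: a backing array of at most 2^30 slots.
    static final long MAX_ARRAY_SIZE = 1L << 30;

    // Smallest power of two that is >= x (for x >= 1).
    static long nextPowerOfTwo(long x) {
        if (x <= 1) {
            return 1;
        }
        return Long.highestOneBit(x - 1) << 1;
    }

    // Slots needed to hold `expected` elements at the given load factor,
    // rounded up to a power of two; rejects sizes above the assumed limit.
    static int arraySize(int expected, float loadFactor) {
        long slots = Math.max(2, nextPowerOfTwo((long) Math.ceil(expected / loadFactor)));
        if (slots > MAX_ARRAY_SIZE) {
            throw new IllegalArgumentException("Too large (" + expected
                    + " expected elements with load factor " + loadFactor + ")");
        }
        return (int) slots;
    }

    public static void main(String[] args) {
        // Small table: ceil(16 / 0.75) = 22 slots, rounded up to 32.
        System.out.println(arraySize(16, 0.75f));
        try {
            // The row count from the report: about 1.2 billion slots are
            // needed, which rounds up to 2^31 and exceeds the limit.
            arraySize(897278064, 0.75f);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Under these assumptions, a single hash table holds at most 2^30 × 0.75 = 805,306,368 rows, so the 897,278,064 rows that reached one HashBuilderOperator instance are over the limit regardless of how much memory is available.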

ajantha-bhat commented 6 years ago

I cannot increase the nodes. How to increase the task concurrency?

sopel39 commented 6 years ago

For instance: SET SESSION task_concurrency = 64;
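
As a rough way to reason about this setting, the Java sketch below estimates how many build-side rows each hash builder instance receives. It assumes, as a simplification, that rows are spread evenly over nodes × task_concurrency instances, and that one hash table holds at most 2^30 × 0.75 rows. The class `BuildSideEstimate`, its method names, and the 3-billion-row input are hypothetical and chosen only for illustration.

```java
class BuildSideEstimate {
    // Assumed per-table capacity: 2^30 slots at load factor 0.75 = 805,306,368 rows.
    static final long MAX_ROWS_PER_HASH = (long) ((1L << 30) * 0.75);

    // Rows per hash builder instance if the build side is split evenly
    // (ceiling division so a remainder is not silently dropped).
    static long rowsPerInstance(long buildRows, int nodes, int taskConcurrency) {
        long instances = (long) nodes * taskConcurrency;
        return (buildRows + instances - 1) / instances;
    }

    // True if the even-split estimate stays within the assumed capacity.
    static boolean fits(long buildRows, int nodes, int taskConcurrency) {
        return rowsPerInstance(buildRows, nodes, taskConcurrency) <= MAX_ROWS_PER_HASH;
    }

    public static void main(String[] args) {
        long buildRows = 3_000_000_000L; // hypothetical build-side row count
        System.out.println(rowsPerInstance(buildRows, 3, 64));
        System.out.println(fits(buildRows, 3, 64));
        System.out.println(fits(buildRows, 3, 1));
    }
}
```

The estimate is optimistic: if the join keys are skewed, or the build side is not split the way this sketch assumes, one instance can receive far more rows than the average, and raising the concurrency alone may not be enough.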

ajantha-bhat commented 6 years ago

I used task.concurrency=64. With this setting, the query failed after 3 minutes instead of within a minute.

Below is the EXPLAIN result that you asked for.

presto:tpchcarbon_default> explain
select n_name, sum(l_extendedprice * (1 - l_discount)) as revenue
from customer, orders, lineitem, supplier, nation, region
where c_custkey = o_custkey
  and l_orderkey = o_orderkey
  and l_suppkey = s_suppkey
  and c_nationkey = s_nationkey
  and s_nationkey = n_nationkey
  and n_regionkey = r_regionkey
  and r_name = 'ASIA'
  and o_orderdate >= date('1994-01-01')
  and o_orderdate < date('1995-01-01')
group by n_name
order by revenue desc;

Query Plan

(1 row)

ajantha-bhat commented 6 years ago

I have attached the explain results. Even with task.concurrency=64. It didn't work

LeonBein commented 4 years ago

We have the same error on 1 TB of TPC-H files. Were you able to fix your issue? Also, #11563 and #3005 both seem to be the same issue, but neither has a concrete solution proposed (at least nothing that worked for us).

zhengxingmao commented 3 years ago

+1. I ran TPC-DS at 10 TB on 8 worker nodes and 1 coordinator, with 128 GiB memory and 16 cores, on Trino 360, and hit this error again. Are there any effective solutions?