pingcap / tidb

TiDB - the open-source, cloud-native, distributed SQL database designed for modern applications.
https://pingcap.com
Apache License 2.0
37.35k stars 5.85k forks source link

tpch query 8 is slow #7018

Open zhexuany opened 6 years ago

zhexuany commented 6 years ago

We already knew query ~17~ 8 is slow but we just did not the reason. Since query 17 will meet oom error in current master. With the new feature trace, we can better understand our query execution process.

shenli commented 6 years ago

Any update?

zhexuany commented 6 years ago

Report: why query 8 is so slow.

Collecting trace result is easy just add trace to the original query 8. Original trace output is huge. Trace produces 88MB data.

▶ ls -hl trace.out
-rw-r--r-- 1 zhexuany staff 88M Aug  3 13:59 trace.out

What we really ~card~ care in this output is only those operations which cost most time during query execution process.

▶ grep  "hash_agg_exec" trace.out
2018-08-03T13:48:27+08:00   54.199107021s   hash_agg_exec_partial_worker_12 85052   event:update partial result
2018-08-03T13:48:27+08:00   54.199109521s   hash_agg_exec_partial_worker_12 85052   event:iterating chunk's data
2018-08-03T13:48:53+08:00   1m19.865761853s hash_agg_exec_partial_worker_12 85052   event:update partial result
2018-08-03T13:48:53+08:00   1m19.865779353s hash_agg_exec_partial_worker_12 85052   event:iterating chunk's data
2018-08-03T13:49:21+08:00   1m47.932879673s hash_agg_exec_partial_worker_12 85052   event:update partial result
2018-08-03T13:49:21+08:00   1m47.932890974s hash_agg_exec_partial_worker_12 85052   event:iterating chunk's data
2018-08-03T13:49:44+08:00   2m10.889405323s hash_agg_exec_partial_worker_12 85052   event:update partial result
2018-08-03T13:49:44+08:00   2m10.889407323s hash_agg_exec_partial_worker_12 85052   event:iterating chunk's data
2018-08-03T13:50:11+08:00   2m37.358247546s hash_agg_exec_partial_worker_12 85052   event:update partial result
2018-08-03T13:50:11+08:00   2m37.358260346s hash_agg_exec_partial_worker_12 85052   event:iterating chunk's data
2018-08-03T13:50:48+08:00   3m15.076101187s hash_agg_exec_partial_worker_12 85052   event:update partial result
2018-08-03T13:50:48+08:00   3m15.076108388s hash_agg_exec_partial_worker_12 85052   event:iterating chunk's data
2018-08-03T13:51:14+08:00   3m41.224444797s hash_agg_exec_partial_worker_12 85052   event:update partial result
2018-08-03T13:51:14+08:00   3m41.224447397s hash_agg_exec_partial_worker_12 85052   event:iterating chunk's data
2018-08-03T13:51:14+08:00   3m41.260471764s hash_agg_exec_partial_worker_12 85052   event:update partial result
2018-08-03T13:51:14+08:00   3m41.260473964s hash_agg_exec_partial_worker_12 85052   event:iterating chunk's data
2018-08-03T13:51:14+08:00   3m41.260785076s hash_agg_exec_partial_worker_12 85052   event:shuffling interm data
2018-08-03T13:48:21+08:00   48.174833878s   hash_agg_exec_partial_worker_8  85053   event:update partial result
2018-08-03T13:48:21+08:00   48.174836278s   hash_agg_exec_partial_worker_8  85053   event:iterating chunk's data
2018-08-03T13:48:53+08:00   1m19.857584443s hash_agg_exec_partial_worker_8  85053   event:update partial result
2018-08-03T13:48:53+08:00   1m19.857588343s hash_agg_exec_partial_worker_8  85053   event:iterating chunk's data
2018-08-03T13:49:21+08:00   1m47.924565058s hash_agg_exec_partial_worker_8  85053   event:update partial result
2018-08-03T13:49:21+08:00   1m47.924577859s hash_agg_exec_partial_worker_8  85053   event:iterating chunk's data
2018-08-03T13:49:44+08:00   2m10.880955303s hash_agg_exec_partial_worker_8  85053   event:update partial result
2018-08-03T13:49:44+08:00   2m10.880957603s hash_agg_exec_partial_worker_8  85053   event:iterating chunk's data
2018-08-03T13:50:11+08:00   2m37.33225686s  hash_agg_exec_partial_worker_8  85053   event:update partial result
2018-08-03T13:50:11+08:00   2m37.332281361s hash_agg_exec_partial_worker_8  85053   event:iterating chunk's data
2018-08-03T13:50:42+08:00   3m8.307054863s  hash_agg_exec_partial_worker_8  85053   event:update partial result
2018-08-03T13:50:42+08:00   3m8.307061563s  hash_agg_exec_partial_worker_8  85053   event:iterating chunk's data
2018-08-03T13:51:14+08:00   3m41.213769493s hash_agg_exec_partial_worker_8  85053   event:update partial result
2018-08-03T13:51:14+08:00   3m41.213776293s hash_agg_exec_partial_worker_8  85053   event:iterating chunk's data
2018-08-03T13:51:14+08:00   3m41.260196954s hash_agg_exec_partial_worker_8  85053   event:update partial result
2018-08-03T13:51:14+08:00   3m41.260198654s hash_agg_exec_partial_worker_8  85053   event:iterating chunk's data
2018-08-03T13:51:14+08:00   3m41.260861079s hash_agg_exec_partial_worker_8  85053   event:shuffling interm data
2018-08-03T13:48:21+08:00   48.167630105s   hash_agg_exec_partial_worker_5  85054   event:update partial result
2018-08-03T13:48:21+08:00   48.167633105s   hash_agg_exec_partial_worker_5  85054   event:iterating chunk's data
2018-08-03T13:48:44+08:00   1m10.541929127s hash_agg_exec_partial_worker_5  85054   event:update partial result
2018-08-03T13:48:44+08:00   1m10.541937128s hash_agg_exec_partial_worker_5  85054   event:iterating chunk's data
2018-08-03T13:49:21+08:00   1m47.917161778s hash_agg_exec_partial_worker_5  85054   event:update partial result
2018-08-03T13:49:21+08:00   1m47.917173478s hash_agg_exec_partial_worker_5  85054   event:iterating chunk's data
2018-08-03T13:49:44+08:00   2m10.870273098s hash_agg_exec_partial_worker_5  85054   event:update partial result
2018-08-03T13:49:44+08:00   2m10.870279698s hash_agg_exec_partial_worker_5  85054   event:iterating chunk's data
2018-08-03T13:50:11+08:00   2m37.323135514s hash_agg_exec_partial_worker_5  85054   event:update partial result
2018-08-03T13:50:11+08:00   2m37.323140815s hash_agg_exec_partial_worker_5  85054   event:iterating chunk's data
2018-08-03T13:50:42+08:00   3m8.299631081s  hash_agg_exec_partial_worker_5  85054   event:update partial result
2018-08-03T13:50:42+08:00   3m8.299640582s  hash_agg_exec_partial_worker_5  85054   event:iterating chunk's data
2018-08-03T13:51:06+08:00   3m32.330152132s hash_agg_exec_partial_worker_5  85054   event:update partial result
2018-08-03T13:51:06+08:00   3m32.330154632s hash_agg_exec_partial_worker_5  85054   event:iterating chunk's data
2018-08-03T13:51:14+08:00   3m41.259772138s hash_agg_exec_partial_worker_5  85054   event:update partial result
2018-08-03T13:51:14+08:00   3m41.259779539s hash_agg_exec_partial_worker_5  85054   event:iterating chunk's data
2018-08-03T13:51:14+08:00   3m41.260905181s hash_agg_exec_partial_worker_5  85054   event:shuffling interm data
2018-08-03T13:48:21+08:00   48.168938355s   hash_agg_exec_partial_worker_6  85055   event:update partial result
2018-08-03T13:48:21+08:00   48.168940955s   hash_agg_exec_partial_worker_6  85055   event:iterating chunk's data
2018-08-03T13:48:44+08:00   1m10.547678946s hash_agg_exec_partial_worker_6  85055   event:update partial result
2018-08-03T13:48:44+08:00   1m10.547687646s hash_agg_exec_partial_worker_6  85055   event:iterating chunk's data
2018-08-03T13:49:21+08:00   1m47.919013248s hash_agg_exec_partial_worker_6  85055   event:update partial result
2018-08-03T13:49:21+08:00   1m47.919025049s hash_agg_exec_partial_worker_6  85055   event:iterating chunk's data
2018-08-03T13:49:44+08:00   2m10.87243918s  hash_agg_exec_partial_worker_6  85055   event:update partial result
2018-08-03T13:49:44+08:00   2m10.872441381s hash_agg_exec_partial_worker_6  85055   event:iterating chunk's data
2018-08-03T13:50:11+08:00   2m37.323769939s hash_agg_exec_partial_worker_6  85055   event:update partial result
2018-08-03T13:50:11+08:00   2m37.323776739s hash_agg_exec_partial_worker_6  85055   event:iterating chunk's data
2018-08-03T13:50:42+08:00   3m8.301783463s  hash_agg_exec_partial_worker_6  85055   event:update partial result
2018-08-03T13:50:42+08:00   3m8.301790563s  hash_agg_exec_partial_worker_6  85055   event:iterating chunk's data
2018-08-03T13:51:14+08:00   3m41.208181381s hash_agg_exec_partial_worker_6  85055   event:update partial result
2018-08-03T13:51:14+08:00   3m41.208187981s hash_agg_exec_partial_worker_6  85055   event:iterating chunk's data
2018-08-03T13:51:14+08:00   3m41.259890543s hash_agg_exec_partial_worker_6  85055   event:update partial result
2018-08-03T13:51:14+08:00   3m41.259892943s hash_agg_exec_partial_worker_6  85055   event:iterating chunk's data
2018-08-03T13:51:14+08:00   3m41.260910382s hash_agg_exec_partial_worker_6  85055   event:shuffling interm data
2018-08-03T13:48:21+08:00   48.170763723s   hash_agg_exec_partial_worker_7  85056   event:update partial result
2018-08-03T13:48:21+08:00   48.170765723s   hash_agg_exec_partial_worker_7  85056   event:iterating chunk's data
2018-08-03T13:48:53+08:00   1m19.853359782s hash_agg_exec_partial_worker_7  85056   event:update partial result
2018-08-03T13:48:53+08:00   1m19.853363082s hash_agg_exec_partial_worker_7  85056   event:iterating chunk's data
2018-08-03T13:49:21+08:00   1m47.920628409s hash_agg_exec_partial_worker_7  85056   event:update partial result
2018-08-03T13:49:21+08:00   1m47.920637209s hash_agg_exec_partial_worker_7  85056   event:iterating chunk's data
2018-08-03T13:49:44+08:00   2m10.874134044s hash_agg_exec_partial_worker_7  85056   event:update partial result
2018-08-03T13:49:44+08:00   2m10.874136644s hash_agg_exec_partial_worker_7  85056   event:iterating chunk's data
2018-08-03T13:50:11+08:00   2m37.330017775s hash_agg_exec_partial_worker_7  85056   event:update partial result
2018-08-03T13:50:11+08:00   2m37.330043576s hash_agg_exec_partial_worker_7  85056   event:iterating chunk's data
2018-08-03T13:50:42+08:00   3m8.303737336s  hash_agg_exec_partial_worker_7  85056   event:update partial result
2018-08-03T13:50:42+08:00   3m8.303739737s  hash_agg_exec_partial_worker_7  85056   event:iterating chunk's data
2018-08-03T13:51:14+08:00   3m41.210094653s hash_agg_exec_partial_worker_7  85056   event:update partial result
2018-08-03T13:51:14+08:00   3m41.210102053s hash_agg_exec_partial_worker_7  85056   event:iterating chunk's data
2018-08-03T13:51:14+08:00   3m41.260024547s hash_agg_exec_partial_worker_7  85056   event:update partial result
2018-08-03T13:51:14+08:00   3m41.260026547s hash_agg_exec_partial_worker_7  85056   event:iterating chunk's data
2018-08-03T13:51:14+08:00   3m41.260911381s hash_agg_exec_partial_worker_7  85056   event:shuffling interm data
2018-08-03T13:48:21+08:00   48.151350087s   hash_agg_exec_partial_worker_1  85057   event:update partial result
2018-08-03T13:48:21+08:00   48.151352287s   hash_agg_exec_partial_worker_1  85057   event:iterating chunk's data
2018-08-03T13:48:44+08:00   1m10.527457378s hash_agg_exec_partial_worker_1  85057   event:update partial result
2018-08-03T13:48:44+08:00   1m10.527471778s hash_agg_exec_partial_worker_1  85057   event:iterating chunk's data
2018-08-03T13:49:10+08:00   1m37.217291639s hash_agg_exec_partial_worker_1  85057   event:update partial result
2018-08-03T13:49:10+08:00   1m37.217303239s hash_agg_exec_partial_worker_1  85057   event:iterating chunk's data
2018-08-03T13:49:37+08:00   2m3.494292058s  hash_agg_exec_partial_worker_1  85057   event:update partial result
2018-08-03T13:49:37+08:00   2m3.494298658s  hash_agg_exec_partial_worker_1  85057   event:iterating chunk's data
2018-08-03T13:50:11+08:00   2m37.293413286s hash_agg_exec_partial_worker_1  85057   event:update partial result
2018-08-03T13:50:11+08:00   2m37.293426087s hash_agg_exec_partial_worker_1  85057   event:iterating chunk's data
2018-08-03T13:50:42+08:00   3m8.283236559s  hash_agg_exec_partial_worker_1  85057   event:update partial result
2018-08-03T13:50:42+08:00   3m8.283238959s  hash_agg_exec_partial_worker_1  85057   event:iterating chunk's data
2018-08-03T13:51:06+08:00   3m32.305702604s hash_agg_exec_partial_worker_1  85057   event:update partial result
2018-08-03T13:51:06+08:00   3m32.305719405s hash_agg_exec_partial_worker_1  85057   event:iterating chunk's data
2018-08-03T13:51:14+08:00   3m41.259352922s hash_agg_exec_partial_worker_1  85057   event:update partial result
2018-08-03T13:51:14+08:00   3m41.259360222s hash_agg_exec_partial_worker_1  85057   event:iterating chunk's data
2018-08-03T13:51:14+08:00   3m41.260955683s hash_agg_exec_partial_worker_1  85057   event:shuffling interm data
2018-08-03T13:48:27+08:00   54.201505212s   hash_agg_exec_partial_worker_13 85058   event:update partial result
2018-08-03T13:48:27+08:00   54.201509413s   hash_agg_exec_partial_worker_13 85058   event:iterating chunk's data
2018-08-03T13:48:53+08:00   1m19.872417705s hash_agg_exec_partial_worker_13 85058   event:update partial result
2018-08-03T13:48:53+08:00   1m19.872421205s hash_agg_exec_partial_worker_13 85058   event:iterating chunk's data
2018-08-03T13:49:21+08:00   1m47.934661041s hash_agg_exec_partial_worker_13 85058   event:update partial result
2018-08-03T13:49:21+08:00   1m47.934664241s hash_agg_exec_partial_worker_13 85058   event:iterating chunk's data
2018-08-03T13:49:44+08:00   2m10.892812553s hash_agg_exec_partial_worker_13 85058   event:update partial result
2018-08-03T13:49:44+08:00   2m10.892814853s hash_agg_exec_partial_worker_13 85058   event:iterating chunk's data
2018-08-03T13:50:11+08:00   2m37.368157322s hash_agg_exec_partial_worker_13 85058   event:update partial result
2018-08-03T13:50:11+08:00   2m37.368162222s hash_agg_exec_partial_worker_13 85058   event:iterating chunk's data
2018-08-03T13:50:48+08:00   3m15.079647722s hash_agg_exec_partial_worker_13 85058   event:update partial result
2018-08-03T13:50:48+08:00   3m15.079654222s hash_agg_exec_partial_worker_13 85058   event:iterating chunk's data
2018-08-03T13:51:14+08:00   3m41.22767242s  hash_agg_exec_partial_worker_13 85058   event:update partial result
2018-08-03T13:51:14+08:00   3m41.22767992s  hash_agg_exec_partial_worker_13 85058   event:iterating chunk's data
2018-08-03T13:51:14+08:00   3m41.260640871s hash_agg_exec_partial_worker_13 85058   event:update partial result
2018-08-03T13:51:14+08:00   3m41.260642671s hash_agg_exec_partial_worker_13 85058   event:iterating chunk's data
2018-08-03T13:51:14+08:00   3m41.26087498s  hash_agg_exec_partial_worker_13 85058   event:shuffling interm data
2018-08-03T13:48:21+08:00   48.157451919s   hash_agg_exec_partial_worker_2  85059   event:update partial result
2018-08-03T13:48:21+08:00   48.157454819s   hash_agg_exec_partial_worker_2  85059   event:iterating chunk's data
2018-08-03T13:48:44+08:00   1m10.530780904s hash_agg_exec_partial_worker_2  85059   event:update partial result
2018-08-03T13:48:44+08:00   1m10.530789305s hash_agg_exec_partial_worker_2  85059   event:iterating chunk's data
2018-08-03T13:49:10+08:00   1m37.220546163s hash_agg_exec_partial_worker_2  85059   event:update partial result
2018-08-03T13:49:10+08:00   1m37.220559764s hash_agg_exec_partial_worker_2  85059   event:iterating chunk's data
2018-08-03T13:49:44+08:00   2m10.857947231s hash_agg_exec_partial_worker_2  85059   event:update partial result
2018-08-03T13:49:44+08:00   2m10.857959931s hash_agg_exec_partial_worker_2  85059   event:iterating chunk's data
2018-08-03T13:50:11+08:00   2m37.301183582s hash_agg_exec_partial_worker_2  85059   event:update partial result
2018-08-03T13:50:11+08:00   2m37.301199382s hash_agg_exec_partial_worker_2  85059   event:iterating chunk's data
2018-08-03T13:50:42+08:00   3m8.289103582s  hash_agg_exec_partial_worker_2  85059   event:update partial result
2018-08-03T13:50:42+08:00   3m8.289106082s  hash_agg_exec_partial_worker_2  85059   event:iterating chunk's data
2018-08-03T13:51:06+08:00   3m32.312217452s hash_agg_exec_partial_worker_2  85059   event:update partial result
2018-08-03T13:51:06+08:00   3m32.312226952s hash_agg_exec_partial_worker_2  85059   event:iterating chunk's data
2018-08-03T13:51:14+08:00   3m41.259525429s hash_agg_exec_partial_worker_2  85059   event:update partial result
2018-08-03T13:51:14+08:00   3m41.259527929s hash_agg_exec_partial_worker_2  85059   event:iterating chunk's data
2018-08-03T13:51:14+08:00   3m41.260983784s hash_agg_exec_partial_worker_2  85059   event:shuffling interm data
2018-08-03T13:48:21+08:00   48.140525876s   hash_agg_exec_partial_worker_0  85060   event:update partial result
2018-08-03T13:48:21+08:00   48.140528176s   hash_agg_exec_partial_worker_0  85060   event:iterating chunk's data
2018-08-03T13:48:44+08:00   1m10.514955703s hash_agg_exec_partial_worker_0  85060   event:update partial result
2018-08-03T13:48:44+08:00   1m10.514960403s hash_agg_exec_partial_worker_0  85060   event:iterating chunk's data
2018-08-03T13:49:10+08:00   1m37.204161241s hash_agg_exec_partial_worker_0  85060   event:update partial result
2018-08-03T13:49:10+08:00   1m37.204177941s hash_agg_exec_partial_worker_0  85060   event:iterating chunk's data
2018-08-03T13:49:37+08:00   2m3.492745199s  hash_agg_exec_partial_worker_0  85060   event:update partial result
2018-08-03T13:49:37+08:00   2m3.492751699s  hash_agg_exec_partial_worker_0  85060   event:iterating chunk's data
2018-08-03T13:50:11+08:00   2m37.286455022s hash_agg_exec_partial_worker_0  85060   event:update partial result
2018-08-03T13:50:11+08:00   2m37.286467022s hash_agg_exec_partial_worker_0  85060   event:iterating chunk's data
2018-08-03T13:50:42+08:00   3m8.27614619s   hash_agg_exec_partial_worker_0  85060   event:update partial result
2018-08-03T13:50:42+08:00   3m8.27614889s   hash_agg_exec_partial_worker_0  85060   event:iterating chunk's data
2018-08-03T13:51:06+08:00   3m32.301631549s hash_agg_exec_partial_worker_0  85060   event:update partial result
2018-08-03T13:51:06+08:00   3m32.30163495s  hash_agg_exec_partial_worker_0  85060   event:iterating chunk's data
2018-08-03T13:51:14+08:00   3m41.25931712s  hash_agg_exec_partial_worker_0  85060   event:update partial result
2018-08-03T13:51:14+08:00   3m41.259325121s hash_agg_exec_partial_worker_0  85060   event:iterating chunk's data
2018-08-03T13:51:14+08:00   3m41.261008385s hash_agg_exec_partial_worker_0  85060   event:shuffling interm data
2018-08-03T13:48:27+08:00   54.193617613s   hash_agg_exec_partial_worker_10 85061   event:update partial result
2018-08-03T13:48:27+08:00   54.193630914s   hash_agg_exec_partial_worker_10 85061   event:iterating chunk's data
2018-08-03T13:48:53+08:00   1m19.862296221s hash_agg_exec_partial_worker_10 85061   event:update partial result
2018-08-03T13:48:53+08:00   1m19.862307822s hash_agg_exec_partial_worker_10 85061   event:iterating chunk's data
2018-08-03T13:49:21+08:00   1m47.92884322s  hash_agg_exec_partial_worker_10 85061   event:update partial result
2018-08-03T13:49:21+08:00   1m47.928853621s hash_agg_exec_partial_worker_10 85061   event:iterating chunk's data
2018-08-03T13:49:44+08:00   2m10.885569478s hash_agg_exec_partial_worker_10 85061   event:update partial result
2018-08-03T13:49:44+08:00   2m10.885571478s hash_agg_exec_partial_worker_10 85061   event:iterating chunk's data
2018-08-03T13:50:11+08:00   2m37.346507601s hash_agg_exec_partial_worker_10 85061   event:update partial result
2018-08-03T13:50:11+08:00   2m37.346517801s hash_agg_exec_partial_worker_10 85061   event:iterating chunk's data
2018-08-03T13:50:48+08:00   3m15.070167463s hash_agg_exec_partial_worker_10 85061   event:update partial result
2018-08-03T13:50:48+08:00   3m15.070169863s hash_agg_exec_partial_worker_10 85061   event:iterating chunk's data
2018-08-03T13:51:14+08:00   3m41.219221799s hash_agg_exec_partial_worker_10 85061   event:update partial result
2018-08-03T13:51:14+08:00   3m41.219223799s hash_agg_exec_partial_worker_10 85061   event:iterating chunk's data
2018-08-03T13:51:14+08:00   3m41.260388561s hash_agg_exec_partial_worker_10 85061   event:update partial result
2018-08-03T13:51:14+08:00   3m41.260390461s hash_agg_exec_partial_worker_10 85061   event:iterating chunk's data
2018-08-03T13:51:14+08:00   3m41.260936782s hash_agg_exec_partial_worker_10 85061   event:shuffling interm data
2018-08-03T13:48:27+08:00   54.195477084s   hash_agg_exec_partial_worker_11 85062   event:update partial result
2018-08-03T13:48:27+08:00   54.195479584s   hash_agg_exec_partial_worker_11 85062   event:iterating chunk's data
2018-08-03T13:48:53+08:00   1m19.863970185s hash_agg_exec_partial_worker_11 85062   event:update partial result
2018-08-03T13:48:53+08:00   1m19.863978486s hash_agg_exec_partial_worker_11 85062   event:iterating chunk's data
2018-08-03T13:49:21+08:00   1m47.931047304s hash_agg_exec_partial_worker_11 85062   event:update partial result
2018-08-03T13:49:21+08:00   1m47.931058705s hash_agg_exec_partial_worker_11 85062   event:iterating chunk's data
2018-08-03T13:49:44+08:00   2m10.887790263s hash_agg_exec_partial_worker_11 85062   event:update partial result
2018-08-03T13:49:44+08:00   2m10.887793663s hash_agg_exec_partial_worker_11 85062   event:iterating chunk's data
2018-08-03T13:50:11+08:00   2m37.357037701s hash_agg_exec_partial_worker_11 85062   event:update partial result
2018-08-03T13:50:11+08:00   2m37.357047801s hash_agg_exec_partial_worker_11 85062   event:iterating chunk's data
2018-08-03T13:50:48+08:00   3m15.0737669s   hash_agg_exec_partial_worker_11 85062   event:update partial result
2018-08-03T13:50:48+08:00   3m15.0737697s   hash_agg_exec_partial_worker_11 85062   event:iterating chunk's data
2018-08-03T13:51:14+08:00   3m41.222628529s hash_agg_exec_partial_worker_11 85062   event:update partial result
2018-08-03T13:51:14+08:00   3m41.222631529s hash_agg_exec_partial_worker_11 85062   event:iterating chunk's data
2018-08-03T13:51:14+08:00   3m41.260500466s hash_agg_exec_partial_worker_11 85062   event:update partial result
2018-08-03T13:51:14+08:00   3m41.260502466s hash_agg_exec_partial_worker_11 85062   event:iterating chunk's data
2018-08-03T13:51:14+08:00   3m41.260933082s hash_agg_exec_partial_worker_11 85062   event:shuffling interm data
2018-08-03T13:48:21+08:00   48.176240132s   hash_agg_exec_partial_worker_9  85063   event:update partial result
2018-08-03T13:48:21+08:00   48.176243432s   hash_agg_exec_partial_worker_9  85063   event:iterating chunk's data
2018-08-03T13:48:53+08:00   1m19.859669622s hash_agg_exec_partial_worker_9  85063   event:update partial result
2018-08-03T13:48:53+08:00   1m19.859677323s hash_agg_exec_partial_worker_9  85063   event:iterating chunk's data
2018-08-03T13:49:21+08:00   1m47.92696445s  hash_agg_exec_partial_worker_9  85063   event:update partial result
2018-08-03T13:49:21+08:00   1m47.92697165s  hash_agg_exec_partial_worker_9  85063   event:iterating chunk's data
2018-08-03T13:49:44+08:00   2m10.883179788s hash_agg_exec_partial_worker_9  85063   event:update partial result
2018-08-03T13:49:44+08:00   2m10.883182388s hash_agg_exec_partial_worker_9  85063   event:iterating chunk's data
2018-08-03T13:50:11+08:00   2m37.338443695s hash_agg_exec_partial_worker_9  85063   event:update partial result
2018-08-03T13:50:11+08:00   2m37.338449396s hash_agg_exec_partial_worker_9  85063   event:iterating chunk's data
2018-08-03T13:50:48+08:00   3m15.068325593s hash_agg_exec_partial_worker_9  85063   event:update partial result
2018-08-03T13:50:48+08:00   3m15.068335694s hash_agg_exec_partial_worker_9  85063   event:iterating chunk's data
2018-08-03T13:51:14+08:00   3m41.217883049s hash_agg_exec_partial_worker_9  85063   event:update partial result
2018-08-03T13:51:14+08:00   3m41.217890149s hash_agg_exec_partial_worker_9  85063   event:iterating chunk's data
2018-08-03T13:51:14+08:00   3m41.26034356s  hash_agg_exec_partial_worker_9  85063   event:update partial result
2018-08-03T13:51:14+08:00   3m41.26034626s  hash_agg_exec_partial_worker_9  85063   event:iterating chunk's data
2018-08-03T13:51:14+08:00   3m41.260961484s hash_agg_exec_partial_worker_9  85063   event:shuffling interm data
2018-08-03T13:48:21+08:00   48.161920188s   hash_agg_exec_partial_worker_3  85064   event:update partial result
2018-08-03T13:48:21+08:00   48.161922888s   hash_agg_exec_partial_worker_3  85064   event:iterating chunk's data
2018-08-03T13:48:44+08:00   1m10.534883059s hash_agg_exec_partial_worker_3  85064   event:update partial result
2018-08-03T13:48:44+08:00   1m10.53489046s  hash_agg_exec_partial_worker_3  85064   event:iterating chunk's data
2018-08-03T13:49:21+08:00   1m47.912912816s hash_agg_exec_partial_worker_3  85064   event:update partial result
2018-08-03T13:49:21+08:00   1m47.912924316s hash_agg_exec_partial_worker_3  85064   event:iterating chunk's data
2018-08-03T13:49:44+08:00   2m10.862103088s hash_agg_exec_partial_worker_3  85064   event:update partial result
2018-08-03T13:49:44+08:00   2m10.862105788s hash_agg_exec_partial_worker_3  85064   event:iterating chunk's data
2018-08-03T13:50:11+08:00   2m37.303507169s hash_agg_exec_partial_worker_3  85064   event:update partial result
2018-08-03T13:50:11+08:00   2m37.30353717s  hash_agg_exec_partial_worker_3  85064   event:iterating chunk's data
2018-08-03T13:50:42+08:00   3m8.294274977s  hash_agg_exec_partial_worker_3  85064   event:update partial result
2018-08-03T13:50:42+08:00   3m8.294277578s  hash_agg_exec_partial_worker_3  85064   event:iterating chunk's data
2018-08-03T13:51:06+08:00   3m32.319172615s hash_agg_exec_partial_worker_3  85064   event:update partial result
2018-08-03T13:51:06+08:00   3m32.319179415s hash_agg_exec_partial_worker_3  85064   event:iterating chunk's data
2018-08-03T13:51:14+08:00   3m41.259518628s hash_agg_exec_partial_worker_3  85064   event:update partial result
2018-08-03T13:51:14+08:00   3m41.259521128s hash_agg_exec_partial_worker_3  85064   event:iterating chunk's data
2018-08-03T13:51:14+08:00   3m41.261015585s hash_agg_exec_partial_worker_3  85064   event:shuffling interm data
2018-08-03T13:48:27+08:00   54.211054975s   hash_agg_exec_partial_worker_15 85065   event:update partial result
2018-08-03T13:48:27+08:00   54.211057675s   hash_agg_exec_partial_worker_15 85065   event:iterating chunk's data
2018-08-03T13:48:53+08:00   1m19.883554028s hash_agg_exec_partial_worker_15 85065   event:update partial result
2018-08-03T13:48:53+08:00   1m19.883557428s hash_agg_exec_partial_worker_15 85065   event:iterating chunk's data
2018-08-03T13:49:21+08:00   1m47.94199282s  hash_agg_exec_partial_worker_15 85065   event:update partial result
2018-08-03T13:49:21+08:00   1m47.94199592s  hash_agg_exec_partial_worker_15 85065   event:iterating chunk's data
2018-08-03T13:49:44+08:00   2m10.897929047s hash_agg_exec_partial_worker_15 85065   event:update partial result
2018-08-03T13:49:44+08:00   2m10.897931347s hash_agg_exec_partial_worker_15 85065   event:iterating chunk's data
2018-08-03T13:50:24+08:00   2m50.980374675s hash_agg_exec_partial_worker_15 85065   event:update partial result
2018-08-03T13:50:24+08:00   2m50.980378575s hash_agg_exec_partial_worker_15 85065   event:iterating chunk's data
2018-08-03T13:50:48+08:00   3m15.086468381s hash_agg_exec_partial_worker_15 85065   event:update partial result
2018-08-03T13:50:48+08:00   3m15.086474682s hash_agg_exec_partial_worker_15 85065   event:iterating chunk's data
2018-08-03T13:51:14+08:00   3m41.231004347s hash_agg_exec_partial_worker_15 85065   event:update partial result
2018-08-03T13:51:14+08:00   3m41.231006647s hash_agg_exec_partial_worker_15 85065   event:iterating chunk's data
2018-08-03T13:51:14+08:00   3m41.260862279s hash_agg_exec_partial_worker_15 85065   event:update partial result
2018-08-03T13:51:14+08:00   3m41.260863479s hash_agg_exec_partial_worker_15 85065   event:iterating chunk's data
2018-08-03T13:51:14+08:00   3m41.261053087s hash_agg_exec_partial_worker_15 85065   event:shuffling interm data
2018-08-03T13:48:27+08:00   54.205276756s   hash_agg_exec_partial_worker_14 85066   event:update partial result
2018-08-03T13:48:27+08:00   54.205279756s   hash_agg_exec_partial_worker_14 85066   event:iterating chunk's data
2018-08-03T13:48:53+08:00   1m19.877003479s hash_agg_exec_partial_worker_14 85066   event:update partial result
2018-08-03T13:48:53+08:00   1m19.87701708s  hash_agg_exec_partial_worker_14 85066   event:iterating chunk's data
2018-08-03T13:49:21+08:00   1m47.938613291s hash_agg_exec_partial_worker_14 85066   event:update partial result
2018-08-03T13:49:21+08:00   1m47.938621592s hash_agg_exec_partial_worker_14 85066   event:iterating chunk's data
2018-08-03T13:49:44+08:00   2m10.896386088s hash_agg_exec_partial_worker_14 85066   event:update partial result
2018-08-03T13:49:44+08:00   2m10.896387988s hash_agg_exec_partial_worker_14 85066   event:iterating chunk's data
2018-08-03T13:50:24+08:00   2m50.978541905s hash_agg_exec_partial_worker_14 85066   event:update partial result
2018-08-03T13:50:24+08:00   2m50.978556106s hash_agg_exec_partial_worker_14 85066   event:iterating chunk's data
2018-08-03T13:50:48+08:00   3m15.081525194s hash_agg_exec_partial_worker_14 85066   event:update partial result
2018-08-03T13:50:48+08:00   3m15.081527594s hash_agg_exec_partial_worker_14 85066   event:iterating chunk's data
2018-08-03T13:51:14+08:00   3m41.229491689s hash_agg_exec_partial_worker_14 85066   event:update partial result
2018-08-03T13:51:14+08:00   3m41.229494289s hash_agg_exec_partial_worker_14 85066   event:iterating chunk's data
2018-08-03T13:51:14+08:00   3m41.260815277s hash_agg_exec_partial_worker_14 85066   event:update partial result
2018-08-03T13:51:14+08:00   3m41.260816578s hash_agg_exec_partial_worker_14 85066   event:iterating chunk's data
2018-08-03T13:51:14+08:00   3m41.261040186s hash_agg_exec_partial_worker_14 85066   event:shuffling interm data
2018-08-03T13:47:33+08:00   109.004µs   hash_agg_exec_final_worker_14   85067   event:consuming interm data
2018-08-03T13:47:33+08:00   86.704µs    hash_agg_exec_final_worker_5    85068   event:consuming interm data
2018-08-03T13:47:33+08:00   118.604µs   hash_agg_exec_final_worker_9    85069   event:consuming interm data

query 8 takes roughly 210 seconds which is just 3 minutes and 52 seconds. From the trace result we can tell the bottleneck of this query is the following:

2018-08-03T13:50:48+08:00   3m15.081525194s hash_agg_exec_partial_worker_14 85066   event:update partial result
2018-08-03T13:50:48+08:00   3m15.081527594s hash_agg_exec_partial_worker_14 85066   event:iterating chunk's data
2018-08-03T13:51:14+08:00   3m41.229491689s hash_agg_exec_partial_worker_14 85066   event:update partial result
2018-08-03T13:51:14+08:00   3m41.229494289s hash_agg_exec_partial_worker_14 85066   event:iterating chunk's data
2018-08-03T13:51:14+08:00   3m41.260815277s hash_agg_exec_partial_worker_14 85066   event:update partial result
2018-08-03T13:51:14+08:00   3m41.260816578s hash_agg_exec_partial_worker_14 85066   event:iterating chunk's data
2018-08-03T13:51:14+08:00   3m41.261040186s hash_agg_exec_partial_worker_14 85066   event:shuffling interm data
2018-08-03T13:47:33+08:00   109.004µs   hash_agg_exec_final_worker_14   85067   event:consuming interm data
2018-08-03T13:47:33+08:00   86.704µs    hash_agg_exec_final_worker_5    85068   event:consuming interm data
2018-08-03T13:47:33+08:00   118.604µs   hash_agg_exec_final_worker_9    85069   event:consuming interm data

I suspect our current hash aggregation design has some limitations, but this reamin questionable. We need a real harmer regarding of this issue. Next step is using go tool pprof to monitor cpu usage to verify our guess.

zhexuany commented 6 years ago

After using go-torch, we get a flame graph to verify our guess. Here, attached a link where you can download such flame graph. HashAggExec is not the bottleneck. The reason could be the implementation of trace in HashAggExec is not correct.

shenli commented 6 years ago

@zhexuany Go Pprof is a more convenient tool for profiling. Would you please post the pprof result here?

zz-jason commented 6 years ago

@zhexuany The tracing result of hash_agg_exec_partial_worker is too large.

Seems like the tracing result is affected by the input data size.

zhexuany commented 6 years ago

cc @zz-jason After some research, the reason for hash_agg_exec_partial_worker being too large could be we did not add trace for HashLeftJoin.

HashLeftJoin is a child of HashAgg. I am working on adding trace for HashLeftJoin. I will ping you once I get the new result.

I also attached the explain result in the following:

digraph Sort_28 {
subgraph cluster28{
node [style=filled, color=lightgrey]
color=black
label = "root"
"Sort_28" -> "Projection_30"
"Projection_30" -> "HashAgg_33"
"HashAgg_33" -> "Projection_34"
"Projection_34" -> "HashLeftJoin_39"
"HashLeftJoin_39" -> "HashLeftJoin_44"
"HashLeftJoin_39" -> "TableReader_103"
"HashLeftJoin_44" -> "IndexJoin_48"
"HashLeftJoin_44" -> "TableReader_100"
"IndexJoin_48" -> "HashLeftJoin_54"
"IndexJoin_48" -> "TableReader_47"
"HashLeftJoin_54" -> "HashLeftJoin_59"
"HashLeftJoin_54" -> "TableReader_96"
"HashLeftJoin_59" -> "HashLeftJoin_65"
"HashLeftJoin_59" -> "TableReader_94"
"HashLeftJoin_65" -> "IndexJoin_71"
"HashLeftJoin_65" -> "TableReader_92"
"IndexJoin_71" -> "TableReader_87"
"IndexJoin_71" -> "IndexLookUp_70"
}
subgraph cluster102{
node [style=filled, color=lightgrey]
color=black
label = "cop"
"Selection_102" -> "TableScan_101"
}
subgraph cluster99{
node [style=filled, color=lightgrey]
color=black
label = "cop"
"TableScan_99"
}
subgraph cluster46{
node [style=filled, color=lightgrey]
color=black
label = "cop"
"TableScan_46"
}
subgraph cluster95{
node [style=filled, color=lightgrey]
color=black
label = "cop"
"TableScan_95"
}
subgraph cluster93{
node [style=filled, color=lightgrey]
color=black
label = "cop"
"TableScan_93"
}
subgraph cluster91{
node [style=filled, color=lightgrey]
color=black
label = "cop"
"Selection_91" -> "TableScan_90"
}
subgraph cluster86{
node [style=filled, color=lightgrey]
color=black
label = "cop"
"Selection_86" -> "TableScan_85"
}
subgraph cluster69{
node [style=filled, color=lightgrey]
color=black
label = "cop"
"TableScan_69"
}
subgraph cluster68{
node [style=filled, color=lightgrey]
color=black
label = "cop"
"IndexScan_68"
}
"TableReader_103" -> "Selection_102"
"TableReader_100" -> "TableScan_99"
"TableReader_47" -> "TableScan_46"
"TableReader_96" -> "TableScan_95"
"TableReader_94" -> "TableScan_93"
"TableReader_92" -> "Selection_91"
"TableReader_87" -> "Selection_86"
"IndexLookUp_70" -> "TableScan_69"
"IndexLookUp_70" -> "IndexScan_68"
}