apache / druid

Apache Druid: a high performance real-time analytics database.
https://druid.apache.org/
Apache License 2.0

Slow queries on near-real-time data #9918

Open cxlRay opened 4 years ago

cxlRay commented 4 years ago

When sending batch queries to real-time (Peon) tasks, performance is poor: TPS is only 80, response time is more than one second, max response time can reach 30 seconds, and the 99th-percentile response time is 15 seconds. What confuses me is that the data in the real-time processes is in memory, so why is the query response time so long?

Affected Version

druid-0.16.1-incubating


yuanlihan commented 4 years ago

Hi @cxlRay, you can try minimising intermediatePersistPeriod (PT10M by default); the persisted data can then benefit from the "vectorize": "true" query option. Hope this helps to some extent.
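As a sketch, the persist period lives in the ingestion spec's tuningConfig. The fragment below assumes a Kafka indexing service supervisor; the specific values are illustrative, not taken from the reporter's actual config:

```json
{
  "type": "kafka",
  "tuningConfig": {
    "type": "kafka",
    "intermediatePersistPeriod": "PT2M",
    "maxRowsInMemory": 150000
  }
}
```

A shorter period means more frequent (and smaller) intermediate persists, which is the trade-off raised later in this thread.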

exherb commented 4 years ago

same issue here.

cxlRay commented 4 years ago

@yuanlihan thanks for your answer, but minimising intermediatePersistPeriod will create many more small files

yuanlihan commented 4 years ago

> @yuanlihan thanks for your answer, but minimising intermediatePersistPeriod will create many more small files

@cxlRay that's true. But the temporary persist files will be cleaned up when the hourly task finishes, and the persisted incremental files/indexes (with extra indexes to speed up query processing) are more efficient to query than the in-memory incremental index. As far as I know, when a query scans the latest in-memory incremental index (for example, the last 10 minutes of data), Druid processes the in-memory fact table held by the incremental index row by row. I also tried minimising maxRowsInMemory to reduce the number of in-memory rows, but that introduces overhead from frequent persisting.

exherb commented 4 years ago

timeseries queries are very slow with a filter (10-20s).

cxlRay commented 4 years ago

@exherb can you say more about your issue? If your cluster runs on SSDs, it will perform better.

exherb commented 4 years ago

> @exherb can you say more about your issue? If your cluster runs on SSDs, it will perform better.

cxlRay commented 4 years ago

As yuanlihan said, minimise intermediatePersistPeriod or maxRowsInMemory, and add the query parameter vectorize.

exherb commented 4 years ago

> As yuanlihan said, minimise intermediatePersistPeriod or maxRowsInMemory, and add the query parameter vectorize.

intermediatePersistPeriod: P10M
maxRowsInMemory: 1000000

context: { vectorize: true }

still slow.

navis commented 4 years ago

Rows in a no-rollup incremental index are stored as

`ConcurrentSkipListMap<Long, ConcurrentLinkedDeque<IncrementalIndexRow>>`

which does not seem easy to make fast, imho.

Anyway, can you try a coarser granularity, like 15 minutes or something?
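To illustrate why this layout is slow to filter, here is a hypothetical, much-simplified Java sketch of the same shape (`Row` is a stand-in for `IncrementalIndexRow`; none of this is Druid's actual code). A filtered scan has to visit every row in every deque one at a time, with no bitmap index or vectorization to help:

```java
import java.util.concurrent.ConcurrentLinkedDeque;
import java.util.concurrent.ConcurrentSkipListMap;

public class SkipListScan {
    // Hypothetical stand-in for IncrementalIndexRow: a timestamp plus one dimension value.
    static final class Row {
        final long timestamp;
        final String dim;
        Row(long timestamp, String dim) { this.timestamp = timestamp; this.dim = dim; }
    }

    // Build a toy "fact table": rows grouped into one deque per truncated timestamp,
    // alternating dimension values "a" and "b".
    static ConcurrentSkipListMap<Long, ConcurrentLinkedDeque<Row>> buildIndex(int numRows) {
        ConcurrentSkipListMap<Long, ConcurrentLinkedDeque<Row>> facts = new ConcurrentSkipListMap<>();
        for (long t = 0; t < numRows; t++) {
            facts.computeIfAbsent(t / 10, k -> new ConcurrentLinkedDeque<>())
                 .add(new Row(t, t % 2 == 0 ? "a" : "b"));
        }
        return facts;
    }

    // Count rows matching dim == "a" by walking the whole structure row by row,
    // the way a filtered scan over the in-memory index has to.
    static long countMatches(ConcurrentSkipListMap<Long, ConcurrentLinkedDeque<Row>> facts) {
        long matches = 0;
        for (ConcurrentLinkedDeque<Row> bucket : facts.values()) { // one bucket per truncated timestamp
            for (Row r : bucket) {                                 // pointer-chasing, one row at a time
                if ("a".equals(r.dim)) {
                    matches++;
                }
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        System.out.println(countMatches(buildIndex(1000))); // 500: the even-timestamped half
    }
}
```

Persisted segments avoid this per-row walk because they carry dictionary-encoded columns and bitmap indexes, which is why the earlier suggestion of persisting sooner can help.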

exherb commented 4 years ago

> Rows in a no-rollup incremental index are stored as
>
> `ConcurrentSkipListMap<Long, ConcurrentLinkedDeque<IncrementalIndexRow>>`
>
> which does not seem easy to make fast, imho.
>
> Anyway, can you try a coarser granularity, like 15 minutes or something?

We enabled rollup with 1-minute query granularity, and segment granularity is hourly. Are you suggesting changing the segment granularity to 15 minutes?

cxlRay commented 4 years ago

@exherb how do you know the slow part is the Peon query? The query path is client -> Broker -> indexing service (Peon). Did you find this via monitoring metrics or by analyzing the code?

exherb commented 4 years ago

> @exherb how do you know the slow part is the Peon query? The query path is client -> Broker -> indexing service (Peon). Did you find this via monitoring metrics or by analyzing the code?

By Druid metrics: `query/segment/time` / `query/time` (`datasource~=.+middlemanager.+`)

navis commented 4 years ago

@exherb I mean the granularity of the timeseries query.
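For reference, a coarser-granularity timeseries query might look like the sketch below. The datasource, interval, filter, and aggregator here are made up for illustration; only the `PT15M` period and the `vectorize` context flag reflect the suggestions in this thread:

```json
{
  "queryType": "timeseries",
  "dataSource": "example",
  "intervals": ["2020-05-26T00:00:00Z/2020-05-26T01:00:00Z"],
  "granularity": { "type": "period", "period": "PT15M" },
  "filter": { "type": "selector", "dimension": "dim", "value": "a" },
  "aggregations": [{ "type": "count", "name": "rows" }],
  "context": { "vectorize": "true" }
}
```

A coarser query granularity produces fewer output buckets per segment scanned, which can reduce per-query overhead on the real-time tasks.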

louisliu318 commented 4 years ago

same issue.