paul-rogers opened 2 years ago
This issue has been marked as stale due to 280 days of inactivity. It will be closed in 4 weeks if no further activity occurs. If this issue is still relevant, please simply write any comment. Even if closed, you can still revive the issue at any time or discuss it on the dev@druid.apache.org list. Thank you for your contributions.
Affected Version
Latest master branch as of Halloween, 2021.
Description
The `BaseQuery` class supports a context flag called `bySegment`. When set to `true`, it seems to gather all the results from each segment into a single object. (See `BySegmentQueryRunner`.) The query "planner" (see `ServerManager.buildAndDecorateQueryRunner()`) always adds the `BySegmentQueryRunner` for all queries. If the `bySegment` flag is set, the magic happens.

Unfortunately, there is a downstream runner (meaning higher in the call stack) which expects the result type to be `ScanResultValue`, but the by-segment runner has changed the type. (The code in that runner even warns that it will do so, and that the type signature after by-segment is a lie.)

Specifically, since the query is limited, the "planner" inserts a query runner which creates a `ScanQueryLimitRowIterator`, which expects the type to still be `ScanResultValue`. Since it is not, we get a `ClassCastException` and the query fails with a 504 error back to the client.

Note that the exception is quietly consumed. It is not clear whether it is reported to the Python client I'm using. Even if it were, few users would know what a `ClassCastException` means or what caused it.

More frustratingly, while the error is logged in the debug console in Eclipse, the stack trace is not. So it takes a bit of work to figure out what's wrong.
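To make the type mismatch concrete, here is a toy Python analogy (this is not Druid code; all class and function names are invented) of a wrapper runner that silently changes the element type flowing through the pipeline while a downstream limiter still assumes the original type:

```python
class ScanResultValue:
    """Stand-in for Druid's per-batch scan result."""
    def __init__(self, rows):
        self.rows = rows

class Result:
    """Stand-in for the by-segment Result wrapper."""
    def __init__(self, value):
        self.value = value

def by_segment_runner(batches, by_segment):
    # With the bySegment flag set, all batches are gathered into a
    # single Result, changing the element type of the sequence.
    if by_segment:
        yield Result(list(batches))
    else:
        yield from batches

def limit_row_iterator(results, limit):
    # Mirrors the limiter's assumption that every element is still a
    # ScanResultValue; a Result here raises TypeError, the Python
    # analogue of the ClassCastException.
    emitted = 0
    for r in results:
        if not isinstance(r, ScanResultValue):
            raise TypeError(f"expected ScanResultValue, got {type(r).__name__}")
        for row in r.rows:
            if emitted >= limit:
                return
            yield row
            emitted += 1

batches = [ScanResultValue([1, 2]), ScanResultValue([3, 4])]

# Without bySegment, the limit applies cleanly:
print(list(limit_row_iterator(by_segment_runner(batches, False), 3)))  # [1, 2, 3]

# With bySegment, the downstream type assumption breaks:
try:
    list(limit_row_iterator(by_segment_runner(batches, True), 3))
except TypeError as e:
    print(e)
```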
Suggestion

Possible solutions:

- Ignore the `bySegment` flag for a `ScanQuery` with a limit.
- Extend `ScanQueryLimitRowIterator` to know about the `Result` type introduced by `BySegmentQueryRunner`, and to apply limits to that.
- If a runner that applies limits to a `Result` already exists, have the planner insert that instead of the `ScanQueryLimitRowIterator` when the `bySegment` flag is set.

Example
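A minimal sketch of the shape of query that triggers the failure, in pydruid style (the datasource, interval, columns, and filter values below are invented for illustration; the parts that matter are the `limit` together with the `bySegment` context entry):

```python
# Illustrative sketch only: datasource, interval, columns, and filter
# are hypothetical. The combination that triggers the bug is a scan
# query with a limit plus the "bySegment" context flag.
query = {
    "queryType": "scan",
    "dataSource": "example_datasource",      # hypothetical name
    "intervals": ["2021-01-01/2021-11-01"],  # hypothetical interval
    "columns": ["__time", "value"],          # hypothetical columns
    "filter": {                              # droppable; not relevant here
        "type": "selector",
        "dimension": "value",
        "value": "x",
    },
    "limit": 10,
    "context": {"bySegment": True},          # deleting this entry avoids the error
}
```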
The query sent to Druid was in the form used by `pydruid` (that is, as a Python map). Remove the `context` item and the query runs; retain it and the exception occurs. The filter can be dropped as well; it is irrelevant to this particular issue.

Other Notes
Looking at the code, it is not clear that by-segment is compatible with the scan query. The scan query produces batches of results; by-segment combines these batches into a `Result`. It would seem the proper solution is to first expand all the scan batches into a single big list, then combine them into the `Result`, so that the `Result` contains a list of rows, not a list of `ScanResultValue`s (each of which contains a list of rows).

Note also that the current implementation does not scale: it materializes the entire segment results into one `Result` object. If the segment contains 5M rows, then that is the size of this one object. It would seem the operator should produce `Result`s up to some maximum size, emitting a sequence of `Result`s as that maximum is reached. For example: allow, say, 1000 values per `Result`. Of course, the consumer of these batched results would have to understand what has happened.

Ignoring the `bySegment` flag requires a query rewrite to remove it; otherwise the `FinalizeResultsQueryRunner` expects the final results to be in `bySegment` form and will fail if they are not.
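The capped-`Result` idea above can be sketched roughly as follows (plain Python, with lists standing in for `Result` objects; the 1000-row cap is the illustrative maximum mentioned above, not a value Druid defines):

```python
from itertools import islice

def batched_results(rows, max_per_result=1000):
    """Yield successive Result-like lists of at most max_per_result rows,
    instead of materializing the whole segment into one object."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, max_per_result))
        if not chunk:
            return
        yield chunk

# A 5M-row segment would become a stream of 5000 capped results rather
# than a single 5M-entry object. Small-scale demonstration:
sizes = [len(r) for r in batched_results(range(2500), max_per_result=1000)]
print(sizes)  # [1000, 1000, 500]
```

As the issue notes, any consumer of such a stream would have to know that one segment may now arrive as several `Result`s rather than exactly one.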