kwannoel opened 1 year ago
Please append to the screenshots if you find something interesting in the flamegraph.
https://github.com/risingwavelabs/risingwave/blob/main/src/common/src/array/mod.rs#L500-L514
`append_datum_n` itself takes a non-negligible amount of time; does it make sense? Let's save the `to_vec`.
Edit: Done in #8732
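For illustration, a minimal sketch of the "save the `to_vec`" idea with made-up types (not the actual array/`DataChunk` code): appending from the borrowed slice directly avoids the intermediate allocation that `to_vec()` introduces.

```rust
// Illustrative only, not the actual risingwave array code.
fn append_cloning(dst: &mut Vec<i32>, src: &[i32]) {
    // The pattern being avoided: clone the slice into a Vec first, then append it.
    let owned = src.to_vec(); // extra allocation + copy
    dst.extend(owned);
}

fn append_borrowed(dst: &mut Vec<i32>, src: &[i32]) {
    // Append directly from the borrowed slice: one copy, no intermediate Vec.
    dst.extend_from_slice(src);
}

fn main() {
    let mut out = Vec::new();
    append_cloning(&mut out, &[1, 2, 3]);
    append_borrowed(&mut out, &[4, 5, 6]);
    assert_eq!(out, [1, 2, 3, 4, 5, 6]);
}
```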
https://github.com/risingwavelabs/risingwave/blob/main/src/connector/src/parser/json_parser.rs#L122
Do `to_ascii_lowercase()` once in advance, out of the closure/loop?
Edit: Done in #8718
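A hedged sketch of the hoisting suggestion, using made-up column names and rows rather than the real `json_parser.rs` code:

```rust
// Illustrative only, not the actual json_parser.rs code.
fn main() {
    let column_names = ["UserId", "EventType"];
    let rows = [["1", "click"], ["2", "view"]];

    // Hoisted: lowercase each column name once, before the per-row loop,
    // instead of calling `to_ascii_lowercase()` again for every row.
    let lowered: Vec<String> = column_names
        .iter()
        .map(|name| name.to_ascii_lowercase())
        .collect();

    for row in &rows {
        for (key, value) in lowered.iter().zip(row) {
            println!("{key} = {value}");
        }
    }
}
```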
https://github.com/risingwavelabs/risingwave/blob/main/src/connector/src/parser/mod.rs#L200
`Vec::with_capacity()` instead of `vec![]`?
Edit: Done in #8718
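A tiny sketch of the suggestion, assuming the number of rows to append (e.g. the chunk size) is known up front:

```rust
fn main() {
    let chunk_size = 1024; // assumed to be known before filling the buffer
    // Pre-allocate once, instead of letting `vec![]` grow (and re-allocate) on push.
    let mut rows: Vec<u64> = Vec::with_capacity(chunk_size);
    for i in 0..chunk_size as u64 {
        rows.push(i);
    }
    assert_eq!(rows.len(), chunk_size);
    assert!(rows.capacity() >= chunk_size); // no intermediate re-allocations needed
}
```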
I feel we need to find a good strategy to decide when to `compact` in the ProjectExecutor.
https://github.com/risingwavelabs/risingwave/blob/main/src/stream/src/executor/project.rs#L113
In this case, as the expr is really computation-light, `compact` itself introduces 11% overhead.
Edit: Probably not, after all, we need to output those visible ones only to an external sink, so we have to do some compaction somewhere before the final stage.
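For context, a simplified model of what compaction has to do (toy types, not the actual `StreamChunk::compact`): every column is rebuilt keeping only the visible rows, which is why it can show up as ~11% even when the projected expression itself is cheap.

```rust
// Illustrative model of chunk compaction, not the actual StreamChunk::compact.
struct Chunk {
    columns: Vec<Vec<i64>>, // one Vec per column
    visibility: Vec<bool>,  // one flag per row
}

fn compact(chunk: &Chunk) -> Chunk {
    let visible = chunk.visibility.iter().filter(|v| **v).count();
    // Every column is re-materialized, copying only the visible rows;
    // this allocation and copy is the overhead discussed above.
    let columns = chunk
        .columns
        .iter()
        .map(|col| {
            col.iter()
                .zip(&chunk.visibility)
                .filter_map(|(v, vis)| vis.then_some(*v))
                .collect()
        })
        .collect();
    Chunk { columns, visibility: vec![true; visible] }
}

fn main() {
    let chunk = Chunk {
        columns: vec![vec![1, 2, 3], vec![10, 20, 30]],
        visibility: vec![true, false, true],
    };
    let compacted = compact(&chunk);
    assert_eq!(compacted.columns, vec![vec![1, 3], vec![10, 30]]);
    assert_eq!(compacted.visibility, vec![true, true]);
}
```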
> Probably not, after all, we need to output those visible ones only to an external sink, so we have to do some compaction somewhere before the final stage.
Not sure I understand why compaction is required in this case. Why can't we just output visible rows to the external sink?
https://github.com/risingwavelabs/risingwave/blob/main/src/connector/src/sink/remote.rs#L276-L294
It kind of depends on the output format requirement: if it asks for JSON, then we iterate through each row of the chunk, so we can choose not to compact. But if it asks for stream chunk format, then we have to compact.
Since this is a blackhole sink, we can save the compaction, you are right!
So whether to compact becomes something to determine at the stage of optimization?
> It kind of depends on the output format requirement: if it asks for JSON, then we iterate through each row of the chunk, so we can choose not to compact. But if it asks for stream chunk format, then we have to compact. Since this is a blackhole sink, we can save the compaction, you are right!
Hmm, even in StreamChunk format we can defer it to `to_protobuf`?
https://github.com/risingwavelabs/risingwave/blob/96aa23dd0dc04c869a2ea5fdb4b7b1fc8fbcbf6d/src/connector/src/sink/remote.rs#L290-L294
I think so, but this `prost_stream_chunk` seems not to be aware of visibility.
But typically, will any system even consume StreamChunk? 👀 I thought it is used internally.
> But typically, will any system even consume StreamChunk? 👀
I guess no, so we can save the final compact if it is connected to the sink/MV executor
For this particular query, we need to be cautious because of this two-project issue. It's a bug that has not been fixed. The project below actually compacts in this case, so "save the final compact" is a wrong statement here; it would only be correct if we fixed this particular bug first.
> So whether to compact becomes something to determine at the stage of optimization?
Makes sense to me.
> For this particular query, we need to be cautious because of this two-project issue. It's a bug that has not been fixed. The project below actually compacts in this case, so "save the final compact" is a wrong statement here; it would only be correct if we fixed this particular bug first.
Linking it: https://github.com/risingwavelabs/risingwave/issues/8577
https://github.com/risingwavelabs/risingwave/issues/8577#issuecomment-1471343129
Which means we actually do not need to have `StreamRowIdGen { row_id_index: 5 }`, right? @st1page
OK, not much overhead from the flamegraph though.
Guess `SourceStreamChunkRowWriter` is not efficient enough for those insert-only sources 🤔
> I feel we need to find a good strategy to decide when to `compact` in the ProjectExecutor (https://github.com/risingwavelabs/risingwave/blob/main/src/stream/src/executor/project.rs#L113). In this case, as the expr is really computation-light, `compact` itself introduces 11% overhead.
done in https://github.com/risingwavelabs/risingwave/pull/8758
> But typically, will any system even consume StreamChunk? 👀
I am not sure if we can sink the chunk into a system with Arrow format 🤔
> For this particular query, we need to be cautious because of this two-project issue. It's a bug that has not been fixed. The project below actually compacts in this case, so "save the final compact" is a wrong statement here; it would only be correct if we fixed this particular bug first.
I think it cannot help the compact performance issue, because the chunk will be compacted in any project. If we have a plan ProjA -> ProjB, the chunk will be compacted in ProjA, and then `ProjB::compact()` will not be costly.
I think the idea I have is slightly different; it's more about avoiding compact even when there's a filter, as in the case of q0.
When sinking, we don't need to build a new chunk; instead we build a protobuf / JSON encoded chunk. We can delay the top-most compact call until that point, saving the cost of building a new chunk and relying on the protobuf / JSON building step to remove invisible rows.
@st1page's approach is still needed as a general optimization for when to compact. This approach is complementary: it will always eliminate the top-most compact when sinking, regardless of selectivity, to avoid unnecessarily building a new chunk.
q0's compact call can be optimized via this complementary approach.
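A hedged sketch of that deferral with toy types (not the real sink/encoder code): the JSON-encoding loop skips invisible rows itself, so the top-most compact, and the new chunk it would build, is never needed.

```rust
// Toy illustration of deferring compact to the encoding step, not the real sink code.
struct Chunk {
    columns: Vec<Vec<i64>>, // column-major data
    visibility: Vec<bool>,  // per-row visibility
}

/// Encode only the visible rows as JSON lines, without calling compact() first.
fn encode_visible_as_json(chunk: &Chunk, names: &[&str]) -> Vec<String> {
    (0..chunk.visibility.len())
        .filter(|&row| chunk.visibility[row]) // skip invisible rows here...
        .map(|row| {
            let fields: Vec<String> = names
                .iter()
                .zip(&chunk.columns)
                .map(|(name, col)| format!("\"{}\":{}", name, col[row]))
                .collect();
            format!("{{{}}}", fields.join(","))
        })
        .collect() // ...so a compacted chunk is never materialized
}

fn main() {
    let chunk = Chunk {
        columns: vec![vec![1, 2, 3], vec![10, 20, 30]],
        visibility: vec![true, false, true],
    };
    let lines = encode_visible_as_json(&chunk, &["id", "value"]);
    assert_eq!(lines, [r#"{"id":1,"value":10}"#, r#"{"id":3,"value":30}"#]);
}
```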
To implement this as an optimization requires a bit of refactoring to add `should_compact` as a plan-level attribute (or other suggestions?). A simple thing to do for now is to just disable compact for the top-most project when sinking, since that's the most common case.
> When sinking, we don't need to build a new chunk; instead we build a protobuf / JSON encoded chunk. We can delay the top-most compact call until that point, saving the cost of building a new chunk and relying on the protobuf / JSON building step to remove invisible rows.
strongly +1.
> To implement this as an optimization requires a bit of refactoring to add `should_compact` as a plan-level attribute (or other suggestions?). A simple thing to do for now is to just disable compact for the top-most project when sinking, since that's the most common case.
I think it is not a plan-level attribute. Currently, we compact the input chunk just to simplify the executors' implementation, but in fact every executor should handle the visibility properly, e.g. `SinkImpl` as you say. The question here is that we give the chunk's visibility to the SinkExecutor, which means it has the chance to do the optimization, but it does not. So we need to do the optimization in the SinkExecutor.
The peak throughput gets to 1M rows/s. Luckily, the imbalanced source throughput problem didn't happen today.
I guess the two things left are: `SourceStreamChunkRowWriter`, or let's say a customized code path for insert-only sources 🤔. At least it can avoid the data type match for every `Datum`, I suppose. Link: #14815
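A hedged sketch of what an insert-only fast path could look like, with toy `Datum`/builder enums rather than the real `SourceStreamChunkRowWriter`: dispatch on the column type once per batch instead of matching on every single datum.

```rust
// Toy illustration, not the actual SourceStreamChunkRowWriter.
enum Datum {
    Int64(i64),
    Utf8(String),
}

enum ColumnBuilder {
    Int64(Vec<i64>),
    Utf8(Vec<String>),
}

/// Per-datum dispatch (the pattern being discussed): the type match runs for every value.
fn append_per_datum(builder: &mut ColumnBuilder, data: &[Datum]) {
    for d in data {
        match (&mut *builder, d) {
            (ColumnBuilder::Int64(col), Datum::Int64(v)) => col.push(*v),
            (ColumnBuilder::Utf8(col), Datum::Utf8(v)) => col.push(v.clone()),
            _ => panic!("type mismatch"),
        }
    }
}

/// Insert-only fast path: if the source already provides a typed column,
/// the dispatch happens once and the inner copy is a plain extend.
fn append_typed_column(builder: &mut ColumnBuilder, data: &[i64]) {
    match builder {
        ColumnBuilder::Int64(col) => col.extend_from_slice(data),
        _ => panic!("type mismatch"),
    }
}

fn main() {
    let mut ints = ColumnBuilder::Int64(Vec::new());
    append_per_datum(&mut ints, &[Datum::Int64(1), Datum::Int64(2)]);
    append_typed_column(&mut ints, &[3, 4]);
    if let ColumnBuilder::Int64(col) = &ints {
        assert_eq!(col, &[1, 2, 3, 4]);
    }

    let mut strs = ColumnBuilder::Utf8(Vec::new());
    append_per_datum(&mut strs, &[Datum::Utf8("a".to_string())]);
    if let ColumnBuilder::Utf8(col) = &strs {
        assert_eq!(col, &["a"]);
    }
}
```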
Background

In a recent benchmark, Flink had an average throughput of 1M r/s, while RW had an average throughput of 850K r/s, requiring a 17% improvement to match Flink. Thanks to @huangjw806 for spotting this. The flamegraph can be found here, under Artifacts.

Query plan

Here are screenshots of the flamegraph, highlighting cost centers.