Open pedroerp opened 1 week ago
This pull request was exported from Phabricator. Differential Revision: D66169184
Name | Link |
---|---|
Latest commit | 82021a20e3fc7cf4e36f49b9be0367329ba4301e |
Latest deploy log | https://app.netlify.com/sites/meta-velox/deploys/673f79c04844a80008a94b61 |
@pedroerp I have verified that we have this case in the following eight locations. Do we need to implement this check in all eight places?
@Yuhta I factored out some helper methods to make it a bit more readable. Please take another look.
@JkSelf thanks for helping check this. It is a bit tricky, but the check is only needed when you are advancing the left side. The problem is that you can't advance the left side while the output is still wrapped around a previous left buffer, since that buffer could itself be wrapping a lazy vector that has not been materialized yet. If you advance anyway, you materialize the keys from the next left batch; later on, another operator may try to materialize the previous left batch, which violates the constraint that lazy vectors need to be materialized in order.
For right-hand side buffers, we don't support lazy vectors across pipelines, so when the driver feeds the right batches it guarantees that they are already materialized. For the other left-side call sites, I believe I covered all of them in this change.
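To make the ordering constraint concrete, here is a minimal toy sketch of it. This is not the Velox code: `ToyReader`, `ToyLazyColumn`, `ToyOutput`, and `canAdvanceLeft` are hypothetical names invented for illustration, and the "no loading a batch older than the newest one already loaded" rule is a simplified stand-in for the real lazy vector constraint.

```cpp
#include <memory>
#include <stdexcept>
#include <utility>

// Hypothetical stand-in for a forward-only reader. It remembers the newest
// batch from which any column has been materialized (-1 means none yet).
struct ToyReader {
  int newestBatchLoaded = -1;
};

// Hypothetical stand-in for a lazy column (not velox::LazyVector).
class ToyLazyColumn {
 public:
  ToyLazyColumn(std::shared_ptr<ToyReader> reader, int batch)
      : reader_(std::move(reader)), batch_(batch) {}

  bool isLoaded() const { return loaded_; }

  // Materializes the column. Throws if a column from a newer batch has
  // already been loaded, because the toy reader cannot go backwards.
  void load() {
    if (loaded_) {
      return;
    }
    if (reader_->newestBatchLoaded > batch_) {
      throw std::logic_error("lazy columns must be materialized in batch order");
    }
    reader_->newestBatchLoaded = batch_;
    loaded_ = true;
  }

 private:
  std::shared_ptr<ToyReader> reader_;
  int batch_;
  bool loaded_ = false;
};

// Pending join output that may still wrap a column of the previous left
// batch. wrappedPayload is nullptr when there is no pending output.
struct ToyOutput {
  const ToyLazyColumn* wrappedPayload;
};

// The guard: it is only safe to read keys from the next left batch when the
// pending output does not wrap a lazy column that is still unloaded.
bool canAdvanceLeft(const ToyOutput& output) {
  return output.wrappedPayload == nullptr || output.wrappedPayload->isLoaded();
}
```

In this toy model, if the output wraps an unloaded payload column of batch 0 and the keys of batch 1 are loaded anyway, a later `load()` on the batch 0 payload throws. When `canAdvanceLeft` returns false, the caller would first hand off (or materialize) the pending output and only then advance.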
@pedroerp I see. Thanks for the detailed explanation.
Summary: Before we start reading keys from the next batch of input, we need to make sure we are not holding output_ wrapped around a lazy vector from the last batch, since lazy vectors need to be materialized in order.
Differential Revision: D66169184