Closed anshuldata closed 1 month ago
ACTION NEEDED
Substrait follows the Conventional Commits specification for release automation.
The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification.
Patch makes sense. We should update the description message in this PR to also explain the deprecation.
I'm not a fan of Protobuf quirks becoming load-bearing, but this seems to indicate a field can be replaced with a oneof while maintaining binary compatibility. 😅
I believe that only works for optional/required fields. I believe repeated fields can't be in a oneof because of how they are serialized.
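For anyone following the wire-format argument, here is a rough sketch of why it comes out this way. It hand-rolls just enough of the protobuf varint encoding to show that a field's bytes carry only a field number and a wire type, with no trace of whether the field is declared inside a oneof. The helper names (`encode_field`, `decode_fields`) are made up for illustration and are not part of any protobuf library. Scalar varint fields are used for brevity, on the assumption that the same reasoning carries over to the message-typed fields in this PR.

```python
def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_field(field_number: int, value: int) -> bytes:
    """Encode one varint field: tag (number << 3 | wire type 0), then value."""
    return encode_varint(field_number << 3) + encode_varint(value)


def decode_fields(data: bytes) -> list[tuple[int, int]]:
    """Decode a buffer of varint fields into (field_number, value) pairs."""
    pairs, pos = [], 0

    def read_varint() -> int:
        nonlocal pos
        result, shift = 0, 0
        while True:
            byte = data[pos]
            pos += 1
            result |= (byte & 0x7F) << shift
            if not byte & 0x80:
                return result
            shift += 7

    while pos < len(data):
        tag = read_varint()
        pairs.append((tag >> 3, read_varint()))
    return pairs


# Singular field 1 = 150. Nothing in these bytes says whether field 1 was
# declared on its own or as a member of a oneof.
singular = encode_field(1, 150)

# An unpacked repeated field is the same tag written once per element.
repeated = encode_field(1, 150) + encode_field(1, 7)

# A reader that treats field 1 as a singular scalar keeps only the last
# occurrence, so the earlier element is silently dropped.
last_wins = dict(decode_fields(repeated))
```

Since the encoded bytes are identical either way, moving a singular field into a oneof does not change what is on the wire. A repeated field has no such equivalent, because a oneof member holds at most one value.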
We should update the description message in this PR to also explain the deprecation. Updated the PR description to indicate that VirtualTable is deprecated.
I feel like the main outstanding question is whether we create new table type versus adding field to existing and deprecating old fields. @westonpace, @vbarua and @EpsilonPrime, thoughts? It's subjective but I'm definitely a fan of keeping the two table types separate so nobody can make a mistake and set both fields.
My opinion hasn't changed. I'll approve either approach. If I need to take a side I still prefer one relation. We have, several times now, deprecated fields and not worried too much about users setting both values.
I think confusion about "why are there two relations that seem to do the same thing" is going to be more confusing than "why are there two fields, one which is marked deprecated, and one that isn't".
I'm fine with @westonpace's preferred change. @anshuldata can you update to single message with two sets of repeated fields?
If I need to take a side I still prefer one relation.
Fixed as suggested. The change now adds another field to VirtualTable instead of introducing another Relation type.
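To make the single-message approach concrete, here is a minimal sketch of how a consumer might read rows from a VirtualTable that carries both the deprecated `values` field and the new `expressions` field. The `VirtualTable` dataclass is a hypothetical stand-in for the generated protobuf class, and `rows_of` is an invented helper. Rejecting a message with both fields set is only one possible policy and is an assumption here. The thread does not settle what a consumer should do in that case.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualTable:
    """Hypothetical stand-in for the protobuf message, not the generated class."""

    values: list = field(default_factory=list)       # deprecated: literal-only rows
    expressions: list = field(default_factory=list)  # new: rows that may hold expressions


def rows_of(table: VirtualTable) -> list:
    """Return the table's rows, preferring the new field over the deprecated one."""
    if table.values and table.expressions:
        # Assumed policy: treat a message with both fields set as invalid.
        raise ValueError("both 'values' (deprecated) and 'expressions' are set")
    return table.expressions or table.values
```

A producer that only ever fills `expressions` never hits the error path, and an older producer that still fills `values` keeps working.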
@anshuldata Hi! I'm trying to adapt the changes in DataFusion. It's not clear how to handle multiple rows in values, e.g. SELECT * FROM VALUES (1+1, 2), (2+2, 3). Should each row be wrapped in Expression.Nested.List or Expression.Nested.Struct, or is there another preferred way to handle it?
@akoshchiy expressions is a repeated field. You can set it to a list of Expression messages with ~Expression.Literal.Struct~ Expression.Nested.Struct set within (one Expression per row). ~It will be basically the same as the current behavior with the deprecated values field, with the only difference being that you need to wrap Expression.Literal.Struct messages in Expression messages now.~
upd: my bad, never mind. I didn't notice the pluses. I'm guessing it should be Expression.Nested.Struct messages within.
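A rough illustration of the "one struct per row" shape for the query in the question, SELECT * FROM VALUES (1+1, 2), (2+2, 3). The classes below are hypothetical Python stand-ins, not the generated Substrait protobuf classes. `Add` in particular stands in for whatever scalar function call a producer would emit for `+`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Literal:
    """Stand-in for a literal expression."""
    value: object


@dataclass(frozen=True)
class Add:
    """Stand-in for a scalar function call representing `left + right`."""
    left: object
    right: object


@dataclass(frozen=True)
class NestedStruct:
    """Stand-in for Expression.Nested.Struct: one field per column."""
    fields: tuple


# SELECT * FROM VALUES (1+1, 2), (2+2, 3)
# One NestedStruct per row; each struct holds one expression per column.
expressions = [
    NestedStruct((Add(Literal(1), Literal(1)), Literal(2))),
    NestedStruct((Add(Literal(2), Literal(2)), Literal(3))),
]
```

The first column of each row is a non-literal expression, which is why a literal-only struct cannot represent these rows.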
I think this is a really good question actually. With the values field, I'm assuming a single Literal.Struct value would be a single row. If that approach stays the same with the new expressions field, I think it only ever makes sense for the Expression to hold either Literal.Struct or Nested.Struct; it should never be a scalar, as it's supposed to represent a row, not an individual value.
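The constraint described here could be checked by a consumer roughly as follows. This is only a sketch: `LiteralStruct`, `NestedStruct`, and `check_rows` are hypothetical names, and the additional requirement that every row has the same number of columns is an assumption that the thread does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LiteralStruct:
    """Stand-in for Expression.Literal.Struct."""
    fields: tuple


@dataclass(frozen=True)
class NestedStruct:
    """Stand-in for Expression.Nested.Struct."""
    fields: tuple


def check_rows(expressions: list) -> int:
    """Return the column count, or raise if an entry cannot represent a row."""
    widths = set()
    for index, expr in enumerate(expressions):
        if not isinstance(expr, (LiteralStruct, NestedStruct)):
            raise TypeError(f"entry {index} is a scalar, not a row")
        widths.add(len(expr.fields))
    if len(widths) > 1:
        # Assumed rule: all rows must have the same number of columns.
        raise ValueError(f"rows have differing column counts: {sorted(widths)}")
    return widths.pop() if widths else 0
```

If the field were typed as a repeated struct instead of a repeated Expression, the `isinstance` check would be enforced by the schema and could be dropped.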
Wouldn't it have made more sense in the first place for the new field to have been of type repeated Expression.Nested.Struct instead of repeated Expression? Nested.Struct can contain all literals as well, after all.
Wouldn't it have made more sense in the first place for the new field to have been of type repeated Expression.Nested.Struct instead of repeated Expression?
Yes, agreed. I missed this in my original change. I will raise a PR to fix it. CC: @akoshchiy
select * from (values (1+2, 'Hello'||'World'))