Closed my-vegetable-has-exploded closed 1 month ago
Thanks for creating this issue. The concept of aliases has come up in community discussions in the past but I can't find a good concrete issue for this idea.
I think it would be a good idea to add aliases as an "optional extension" or an "official hint" of sorts. We have also talked in the past about having some way to annotate parts of the plan with extra information.
My concern with your proposed AliasRel
is that it will cause the plan to become unreadable by engines that do not understand it. Or, to put it another way, it forces all consumers to handle aliases whether they care about them or not.
I think something using AdvancedExtension
might be an easier path forward. In particular, the enhancement
type of advanced extension specifically says that the information can be ignored by consumers if they choose to do so. It's also something that could be done without any change to the Substrait spec.
I'm also going to CC @jacques-n who has talked about "annotations" before and I think was very close to writing up some kind of proposal / idea doc at one point in time.
Thanks for your patience!
I think it would be a good idea to add aliases as an "optional extension" or an "official hint" of sorts. We have also talked in the past about having some way to annotate parts of the plan with extra information.
That's really a better idea.
I think something using
AdvancedExtension
might be an easier path forward. In particular, theenhancement
type of advanced extension specifically says that the information can be ignored by consumers if they choose to do so. It's also something that could be done without any change to the Substrait spec.
Thank you for your advice. Do you mean that we can put the alias name into AdvancedExtension? By the way, the comment shows that enhancement
cannot be ignored by a consumer which iis a little different from your description.
// An enhancement alter semantics. Cannot be ignored by a consumer.
google.protobuf.Any enhancement = 2;
Do you mean that we can put the alias name into AdvancedExtension?
Yes
By the way, the comment shows that enhancement cannot be ignored by a consumer which iis a little different from your description.
Good catch, you are right, I got it backwards. optimization
is the one to use.
Since substrait always use field indices to refer to fields,it may cause some problem when revert from substrait to logical plan.
Taking datafusion-6867 for example, datafusion use table+field to distinguish fields.
Without alias for rel, it can't distinguish between left input and right input in self join.
I think alias for rel is not complicated to implement and improves ease of use.
A simple design is given below.