Addresses some missing features and usability issues when using PyArrow with Substrait ExtendedExpressions
GitHub Issue: #41692
[x] Allow passing BoundExpressions for Scanner(columns=X) instead of a dict of expressions.
[x] Allow passing BoundExpressions for Scanner(filter=X), so that users don't have to distinguish between Expression and BoundExpressions and can always use pyarrow.substrait.deserialize_expressions.
[x] Allow decoding pyarrow.BoundExpressions directly from a protobuf.Message, so that substrait-python objects can be used.
[x] Return a memoryview from the methods that encode Substrait, so that the result can be passed directly to substrait-python (or, more generally, to other Python libraries) without a copy.
[x] Allow decoding messages from a memoryview, so that the output of the encoding functions can be passed back to the decoding functions.
[x] Allow encoding schemas to Substrait and decoding them from Substrait.
[x] When encoding schemas, return the extension types required for a Substrait consumer to decode the schema.
[ ] Handle Arrow extension types when decoding a schema.
[ ] Update docstrings and documentation
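
The memoryview items above rely on Python's buffer protocol. The following is a minimal standard-library sketch of why returning and accepting a memoryview avoids a copy. It does not call PyArrow: the `encoded` bytes are a made-up stand-in for a serialized Substrait message, and `encode` and `decode` are hypothetical helpers, not functions from this PR.

```python
# Stand-in for an encoded Substrait message (made-up bytes, for illustration).
encoded = bytearray(b"\x0a\x04plan\x12\x02ok")


def encode() -> memoryview:
    """Hypothetical encoder: expose the buffer without copying it."""
    return memoryview(encoded)


def decode(message) -> bytes:
    """Hypothetical decoder: accept any bytes-like object."""
    view = memoryview(message)  # no copy for bytes, bytearray, or memoryview
    return bytes(view[2:6])     # copy only the slice that is actually needed


view = encode()
payload = decode(view)  # the encoder's output goes straight into the decoder

# The view shares memory with the original buffer instead of duplicating it,
# so a write to the buffer is visible through the view.
shares_memory = view.obj is encoded
encoded[2] = ord("P")
sees_update = view[2] == ord("P")
```

Because the decoder wraps its input in a memoryview, the same code path can serve bytes, bytearray, and memoryview inputs, which is the round-trip property the checklist asks for.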