While writing this it became evident that code building the generated protobufs directly, even with
list comprehensions, was going to be nearly impossible to read. I introduced the start of a
substrait_builder class, which provides an easier way of constructing relations and lets sections
be composed using functions. It doesn't implement all functions, has some repetition that could be
eliminated (for instance, how many methods do we need that take two expressions as arguments?),
and could be even cleaner if every function usage didn't need to pass in the function information.
However, addressing those issues is beyond the scope of what we were trying to accomplish here.
The builder really belongs in substrait-python, and moving it there will require refactoring how
functions are converted from Spark names into Substrait. After that, the other code already in
spark_to_substrait.py can be updated to use the builder, which will also make it more readable.
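
For readers unfamiliar with the approach, here is a minimal sketch of the composable-builder idea
described above. It is not the substrait_builder API from this change: every name in it (field,
literal, binary_function, filter_relation) is hypothetical, and plain dicts stand in for the
generated protobuf messages so the example runs without any Substrait dependency.

```python
from typing import Any, Callable

# Hypothetical stand-in: an expression is a function that, given the input
# column names, returns a dict playing the role of a protobuf message.
Expr = Callable[[list[str]], dict[str, Any]]


def field(name: str) -> Expr:
    """Refer to a column by name; resolved to a position when the plan is built."""
    def resolve(columns: list[str]) -> dict[str, Any]:
        return {"field_reference": columns.index(name)}
    return resolve


def literal(value: Any) -> Expr:
    """Wrap a constant value."""
    def resolve(columns: list[str]) -> dict[str, Any]:
        return {"literal": value}
    return resolve


def binary_function(function_info: str, left: Expr, right: Expr) -> Expr:
    """One generic helper for any function taking two expressions.

    The caller still has to pass function_info on every use, which is the
    kind of repetition mentioned above.
    """
    def resolve(columns: list[str]) -> dict[str, Any]:
        return {
            "scalar_function": function_info,
            "arguments": [left(columns), right(columns)],
        }
    return resolve


def filter_relation(columns: list[str], condition: Expr,
                    input_rel: dict[str, Any]) -> dict[str, Any]:
    """Build a filter over an input relation by composing the pieces."""
    return {"filter": {"input": input_rel, "condition": condition(columns)}}


# Sections compose as ordinary function calls instead of nested messages.
plan = filter_relation(
    ["id", "price"],
    binary_function("gt", field("price"), literal(100)),
    {"read": {"named_table": "orders"}},
)
```

A single binary_function helper like this is one possible way to collapse the many
two-expression methods into one, at the cost of less specific method names.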