linkedin / transport

A framework for writing performant user-defined functions (UDFs) that are portable across a variety of engines including Apache Spark, Apache Hive, and Presto.
BSD 2-Clause "Simplified" License
291 stars 72 forks

Add support for getting field names #140

Closed phd3 closed 1 year ago

phd3 commented 1 year ago

Adds a fieldNames method to StdStructType that returns the list of field names without having to operate on data.

It doesn't seem to be currently used, as the names are also accessible while working with actual data records, e.g. through implementations of StdStruct#getField with an index or a name.

However, our use case is to do semantic validation of a JSON-like path representing a field in a dataset schema, using some sort of type representation. Access to field names at the type level is one of the things we need in order to pick transport's type system as that type representation. cc @wmoustafa @khaitranq
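To illustrate the use case, here is a minimal sketch of validating a dotted path against a schema using only type-level information. The types below (SimpleType, PrimitiveType, StructType) and the isValidPath and sampleSchema helpers are hypothetical stand-ins written for this example, not transport's actual interfaces. The sketch only assumes that a struct type can expose its field names alongside its field types, which is what this PR proposes for StdStructType.

```java
import java.util.Arrays;
import java.util.List;

class FieldPathValidation {

  // Hypothetical stand-ins for a schema type system; NOT transport's real interfaces.
  interface SimpleType {}

  static final class PrimitiveType implements SimpleType {}

  static final class StructType implements SimpleType {
    private final List<String> names;
    private final List<SimpleType> types;

    StructType(List<String> names, List<SimpleType> types) {
      if (names.size() != types.size()) {
        throw new IllegalArgumentException("names and types must have the same size");
      }
      this.names = names;
      this.types = types;
    }

    // Type-level accessor: no data record is needed to list the field names.
    List<String> fieldNames() {
      return names;
    }

    List<SimpleType> fieldTypes() {
      return types;
    }
  }

  // Walks a dotted path such as "address.city" through nested struct types.
  // Returns true only if every segment names a field of the enclosing struct.
  static boolean isValidPath(SimpleType root, String path) {
    SimpleType current = root;
    // The -1 limit keeps trailing empty segments, so "address." is rejected.
    for (String segment : path.split("\\.", -1)) {
      if (!(current instanceof StructType)) {
        return false; // tried to descend into a non-struct type
      }
      StructType struct = (StructType) current;
      int index = struct.fieldNames().indexOf(segment);
      if (index < 0) {
        return false; // no field with this name
      }
      current = struct.fieldTypes().get(index);
    }
    return true;
  }

  // Example schema: struct<id, address: struct<city, zip>>
  static StructType sampleSchema() {
    StructType address = new StructType(
        Arrays.asList("city", "zip"),
        Arrays.<SimpleType>asList(new PrimitiveType(), new PrimitiveType()));
    return new StructType(
        Arrays.asList("id", "address"),
        Arrays.<SimpleType>asList(new PrimitiveType(), address));
  }

  public static void main(String[] args) {
    StructType schema = sampleSchema();
    System.out.println(isValidPath(schema, "address.city"));    // true
    System.out.println(isValidPath(schema, "address.country")); // false
    System.out.println(isValidPath(schema, "id.city"));         // false
  }
}
```

Without a names accessor on the struct type, the same check would need an actual data record to look fields up by name, which is not available when only the schema is known.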

phd3 commented 1 year ago

Thanks for the review, @khaitranq.

@wmoustafa @ljfgem would you be able to give this a review?

phd3 commented 1 year ago

Thanks for the reviews! I've squashed the fixup commit.