Open asfimport opened 4 years ago
Todd Farmer / @toddfarmer: This issue was last updated over 90 days ago, which may be an indication it is no longer being actively worked. To better reflect the current state, the issue is being unassigned per project policy. Please feel free to re-take assignment of the issue if it is being actively worked, or if you plan to start that work soon.
Currently, Arrow has no textual representation for its schema that could serve the same purposes as JSON-Schema for JSON, the .proto files for Protobuf, etc. This issue is about adding such a text representation for an Arrow schema, to fill the same use cases that these textual representations fill for other data serialization formats.
The requirements for a text schema representation:
Not tied to a particular version of Arrow & compatible between Arrow versions
And from a software engineering point of view, it would be very desirable for the implementation to not add another library dependency for Arrow (which already has many).
After discussion on the mailing list, the JSON representation for Flatbuffers data seemed the best candidate. It is a format supported by the Flatbuffers project for serializing Flatbuffers data in a human-readable form, suitable for keeping under source control. And there is already functionality in Arrow to convert Schema objects to a Flatbuffers representation. This would meet all the requirements above, while requiring only a small amount of new Arrow code to implement.
This issue will add functions to Arrow to load and save a textual, JSON representation of an Arrow schema, by first converting it to a Flatbuffers object, and then using the Flatbuffers functionality to save/load such objects as JSON.
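To make the intended save/load round trip concrete, here is a minimal sketch in Python. It does not call Arrow or Flatbuffers; it is a standard-library stand-in for the proposed functions. The key names (`endianness`, `fields`, `type_type`, `bitWidth`, `is_signed`) are modeled on Arrow's Flatbuffers schema definition as an assumption, and the text produced by the Flatbuffers JSON generator may differ. The `Field` and `Schema` classes and the helpers `schema_to_json` and `schema_from_json` are hypothetical names used only for illustration.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Field:
    """Hypothetical stand-in for one schema field."""
    name: str
    nullable: bool
    type_type: str                            # union tag, e.g. "Int" or "Utf8" (assumed naming)
    type: dict = field(default_factory=dict)  # union payload, empty for parameterless types


@dataclass
class Schema:
    """Hypothetical stand-in for an Arrow schema."""
    fields: list
    endianness: str = "Little"


def schema_to_json(schema: Schema) -> str:
    """'Save' step: render the schema as indented JSON with sorted keys."""
    doc = {
        "endianness": schema.endianness,
        "fields": [
            {
                "name": f.name,
                "nullable": f.nullable,
                "type_type": f.type_type,
                "type": f.type,
            }
            for f in schema.fields
        ],
    }
    # Sorted keys and fixed indentation keep the text stable across runs,
    # which keeps version-control diffs small.
    return json.dumps(doc, indent=2, sort_keys=True)


def schema_from_json(text: str) -> Schema:
    """'Load' step: parse the JSON text back into a Schema object."""
    doc = json.loads(text)
    fields = [
        Field(
            name=f["name"],
            nullable=f["nullable"],
            type_type=f["type_type"],
            type=f.get("type", {}),
        )
        for f in doc["fields"]
    ]
    return Schema(fields=fields, endianness=doc.get("endianness", "Little"))


# Round trip: object -> text -> object.
schema = Schema(
    fields=[
        Field("id", False, "Int", {"bitWidth": 64, "is_signed": True}),
        Field("name", True, "Utf8"),
    ]
)
text = schema_to_json(schema)
restored = schema_from_json(text)
```

In the approach proposed in this issue, the object-to-text step would instead go through Arrow's existing Schema-to-Flatbuffers conversion and the Flatbuffers JSON support, so Arrow itself would not need a hand-written serializer like the one sketched here.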
Reporter: Christian Hudon / @chrish42
Related issues:
PRs and other links:
Note: This issue was originally created as ARROW-8952. Please see the migration documentation for further details.