There's a close relationship between the work in SIG IO and ExampleGen implementations for TFX. Currently, TFX has no particular process for maintaining contributions, and there may also be significant code overlap with the work in SIG IO.
I'm starting this thread to kick off a discussion about the feasibility and desirability of incorporating TFX data ingestion components into the SIG's work.
The TFX team has decided to merge the Parquet PR rather than have it maintained externally, so this particular issue isn't as urgent as it was before, though it might come up again in the future.
This issue tracks progress on adding Parquet ExampleGen support so that it can be integrated into TFX OSS.
Related issues: https://github.com/tensorflow/tfx/issues/74
Related discussion:
Related docs: https://github.com/tensorflow/tfx/blob/master/docs/guide/examplegen.md#custom-examplegen