xitongsys / parquet-go

pure golang library for reading/writing parquet file
Apache License 2.0
1.27k stars 293 forks

Split ParquetFile into R/W interfaces #472

Closed zolstein closed 1 year ago

zolstein commented 2 years ago

Create ParquetFileR and ParquetFileW interfaces that represent files that can be read from or written to, respectively. Replace usages of the ParquetFile interface with the relevant R/W interface. An equivalent of ParquetFile is left in place for backward compatibility.
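As a rough sketch of what the split could look like: the method sets below are an assumption, modeled on the standard `io` interfaces plus `Open` and `Create` methods, and are not copied from this PR's diff. Only the names `ParquetFileR`, `ParquetFileW`, and `ParquetFile` come from the description above.

```go
package main

import (
	"fmt"
	"io"
)

// ParquetFileR is an assumed shape for the read-side interface:
// random-access reads plus a way to open a file by name.
type ParquetFileR interface {
	io.Seeker
	io.Reader
	io.Closer
	Open(name string) (ParquetFileR, error)
}

// ParquetFileW is an assumed shape for the write-side interface:
// sequential writes plus a way to create a file by name.
type ParquetFileW interface {
	io.Writer
	io.Closer
	Create(name string) (ParquetFileW, error)
}

// ParquetFile is an assumed shape for the combined interface kept
// for backward compatibility.
type ParquetFile interface {
	io.Seeker
	io.Reader
	io.Writer
	io.Closer
	Open(name string) (ParquetFile, error)
	Create(name string) (ParquetFile, error)
}

// Compile-time checks: each narrow interface can be handed to code
// that only expects the matching standard io contract.
var (
	_ io.ReadSeekCloser = ParquetFileR(nil)
	_ io.WriteCloser    = ParquetFileW(nil)
)

func main() {
	fmt.Println("interface definitions compile")
}
```

With this shape, a reader only ever sees `Seek`, `Read`, and `Close`, and a writer only ever sees `Write` and `Close`, so neither side has to define behavior for methods it never calls.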

Having a shared interface for reading and writing adds little to this library, since no file ever needs to be both read from and written to, and the read and write method sets have little overlap.

The combined interface complicates the contract. It must define interactions between read methods (e.g. Seek) and write methods (e.g. Write) even though such interactions are never used. (In theory, existing implementations should define the same interactions, but they probably don't.)

The combined interface also complicates writing new implementations. A read-only datasource can't "fully" implement the ParquetFile interface, and a read/write datasource must care about the interactions described above. Splitting the interface resolves these issues. Users can write types that implement only the behavior that their datasource can support. They can also write separate types to implement read and write behavior for the same datasource, eliminating the possibility of interactions between read and write methods.