Closed 1beb closed 2 years ago
@cboettig this is ready for another round of review. Note my question about unark. I'm not sure how meaningful (or even desirable) it would be.
:eyes: nice, looking promising here! just ping me when you're ready for a review!
@cboettig
I accidentally included some of the filter injection work in this one. Quick question: do you agree with the injection filter (as a concept), and if so, can I combine these two into a single PR (filter + parquet)?
yup, I noticed the injection filter was here too. I agree that in principle it's something we should be handling, at least as an option, so I'm happy to have it wrapped into the same PR.
@cboettig ok, we're ready for a review.
:rocket: nice work, all looks good here.
TODOs:
Questions:
unark() functionality required?
chunk_size to write_parquet for extremely large tables? Answer: don't. Let the lines param dictate output chunk size.
Answer: We don't. Chunks must be written manually.
Outline:
streamable_parquet, which imports the arrow functions following the other external packages
con object for this purpose.
parquet files would never be written without header information; made an adjustment to keep_open to accommodate this.
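For readers following the chunking answers above, here is a rough sketch of "chunks must be written manually, with lines dictating the chunk size". It is written in Python with only the standard library, not in R, and it is not the code from this PR. The names `export_in_chunks` and `write_csv_chunk`, the part-file naming scheme, and the `flights` demo table are all hypothetical. CSV stands in for Parquet so the sketch runs without extra packages; a Parquet writer could presumably be passed as `write_chunk` instead.

```python
import csv
import sqlite3
import tempfile
from pathlib import Path


def write_csv_chunk(path, header, rows):
    """Stand-in chunk writer (hypothetical): every file carries its own header."""
    with open(path, "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(header)
        writer.writerows(rows)


def export_in_chunks(con, table, out_dir, lines=50_000,
                     write_chunk=write_csv_chunk, ext="csv"):
    """Write `table` as numbered part files of at most `lines` rows each.

    Nothing is ever appended to an existing file: each chunk fetched from
    the connection becomes a new, complete file.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # Table name is interpolated directly; acceptable for a sketch only.
    cur = con.execute(f'SELECT * FROM "{table}"')
    header = [col[0] for col in cur.description]
    paths = []
    while True:
        rows = cur.fetchmany(lines)  # `lines` sets the output chunk size
        if not rows:
            break
        path = out_dir / f"{table}-part-{len(paths) + 1:05d}.{ext}"
        write_chunk(path, header, rows)
        paths.append(path)
    return paths


# Demo: 10 rows with lines=4 gives chunks of 4, 4, and 2 rows.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE flights (id INTEGER, carrier TEXT)")
con.executemany("INSERT INTO flights VALUES (?, ?)",
                [(i, "AA") for i in range(10)])
out_dir = tempfile.mkdtemp()
parts = export_in_chunks(con, "flights", out_dir, lines=4)
part_names = [p.name for p in parts]
```

Because every part file is self-describing, there is no "file left open without a header" state to track between chunks. An empty table produces no files in this sketch.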