Closed by sadikovi 6 years ago
@sunchao Could you review this PR and let me know if we should keep the change or postpone it for now? This is one of the (optional?) items on the write-support issue. Thanks!
Files with Coverage Reduction | New Missed Lines | %
---|---|---
file/writer.rs | 7 | 95.93%
file/reader.rs | 14 | 97.08%
schema/printer.rs | 20 | 70.52%
file/properties.rs | 23 | 92.56%
record/reader.rs | 83 | 88.61%

Totals | 
---|---
Change from base Build 619: | 0.04%
Covered Lines: | 12460
Relevant Lines: | 13035
Thanks @sadikovi! Just curious: do you have in mind what other reader properties we will add in the future? I checked parquet-cpp/parquet-mr and didn't find many properties that people need to tune on the reader side.
Actually, no. The only setting we have is batch size, and it is used in the record reader (which could be deprecated in the future).
What I can do is simply bump up the batch size to 1024/4096 instead of introducing the whole reader properties and changing the API.
Let me know what you think.
> What I can do is simply bump up the batch size to 1024/4096 instead of introducing the whole reader properties and changing the API.
Yes I'd prefer this approach. Thanks!
This PR adds a ReaderProperties struct, similar to the WriterProperties used for writing. Currently this struct only contains batch_size, which used to be hard-coded.
Following is done:
This is a fairly big change, and it makes the API incompatible with prior versions.
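For readers who want a concrete picture of the ReaderProperties idea described above, here is a minimal sketch. It is not the code from this PR: the builder type, the `set_batch_size` method, and the default of 1024 are all assumptions made for illustration (the thread only mentions 1024/4096 as candidate values), loosely modeled on a builder-style properties struct.

```rust
/// Hypothetical default batch size; the discussion only names 1024/4096
/// as candidates, so this value is an assumption.
const DEFAULT_BATCH_SIZE: usize = 1024;

/// Sketch of reader-side properties. Only `batch_size` is mentioned in
/// the PR description; everything else here is illustrative.
#[derive(Debug, Clone, PartialEq)]
pub struct ReaderProperties {
    batch_size: usize,
}

impl ReaderProperties {
    /// Returns a builder pre-populated with default values.
    pub fn builder() -> ReaderPropertiesBuilder {
        ReaderPropertiesBuilder {
            batch_size: DEFAULT_BATCH_SIZE,
        }
    }

    /// Number of records read per batch.
    pub fn batch_size(&self) -> usize {
        self.batch_size
    }
}

/// Hypothetical builder for `ReaderProperties`.
pub struct ReaderPropertiesBuilder {
    batch_size: usize,
}

impl ReaderPropertiesBuilder {
    /// Overrides the batch size that would otherwise be hard-coded.
    pub fn set_batch_size(mut self, value: usize) -> Self {
        self.batch_size = value;
        self
    }

    /// Finalizes the properties into an immutable struct.
    pub fn build(self) -> ReaderProperties {
        ReaderProperties {
            batch_size: self.batch_size,
        }
    }
}
```

Under these assumptions, a caller would write `ReaderProperties::builder().set_batch_size(4096).build()` and pass the result to the reader. That extra argument is where the API incompatibility comes from; simply raising the hard-coded constant avoids it.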