apache / arrow

Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing
https://arrow.apache.org/
Apache License 2.0

[Rust] [Parquet] Regression: Cannot implement custom ParquetWriter because `TryClone` is not publicly exported #18339

Closed asfimport closed 3 years ago

asfimport commented 3 years ago

As of this commit

https://github.com/apache/arrow/commit/7155cd5488310c15d864428252ca71dd9ebd3b48

I don't think it is possible for a user of the arrow crates to implement a custom Parquet writer anymore. Specifically, the `ParquetWriter` trait requires `TryClone` to be implemented: https://github.com/apache/arrow/blob/master/rust/parquet/src/file/writer.rs#L117-L118


pub trait ParquetWriter: Write + Seek + TryClone {}
impl<T: Write + Seek + TryClone> ParquetWriter for T {}

/// A serialized implementation for Parquet [`FileWriter`].
/// See documentation on file writer for more information.
pub struct SerializedFileWriter<W: ParquetWriter> {

but `TryClone` cannot be used from outside the crate. It is a `pub` trait:

https://github.com/apache/arrow/blob/master/rust/parquet/src/util/io.rs#L28-L32


/// TryClone tries to clone the type and should maintain the `Seek` position of the given
/// instance.
pub trait TryClone: Sized {
    /// Clones the type returning a new instance or an error if it's not possible
    /// to clone it.
    fn try_clone(&self) -> Result<Self>;
}
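To make the goal concrete, here is a minimal sketch of the kind of custom writer a downstream user would want to write. Because the real trait is not reachable, `TryClone` and `ParquetWriter` below are local stand-ins copied from the snippets above, with `std::io::Result` assumed in place of the crate's own `Result` alias. `SharedBufferWriter` and `assert_parquet_writer` are hypothetical names used only for illustration.

```rust
use std::cell::RefCell;
use std::io::{self, Cursor, Seek, SeekFrom, Write};
use std::rc::Rc;

/// Local stand-in for the crate's `TryClone` trait.
/// Assumption: `std::io::Result` replaces the crate's own `Result` alias.
pub trait TryClone: Sized {
    fn try_clone(&self) -> io::Result<Self>;
}

/// Local copy of the `ParquetWriter` bound and its blanket impl.
pub trait ParquetWriter: Write + Seek + TryClone {}
impl<T: Write + Seek + TryClone> ParquetWriter for T {}

/// Hypothetical custom writer: an in-memory buffer shared between clones,
/// so every clone observes the same bytes and the same `Seek` position.
pub struct SharedBufferWriter {
    inner: Rc<RefCell<Cursor<Vec<u8>>>>,
}

impl SharedBufferWriter {
    pub fn new() -> Self {
        Self { inner: Rc::new(RefCell::new(Cursor::new(Vec::new()))) }
    }

    /// Returns a copy of everything written so far.
    pub fn bytes(&self) -> Vec<u8> {
        self.inner.borrow().get_ref().clone()
    }
}

impl Write for SharedBufferWriter {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.inner.borrow_mut().write(buf)
    }

    fn flush(&mut self) -> io::Result<()> {
        self.inner.borrow_mut().flush()
    }
}

impl Seek for SharedBufferWriter {
    fn seek(&mut self, pos: SeekFrom) -> io::Result<u64> {
        self.inner.borrow_mut().seek(pos)
    }
}

impl TryClone for SharedBufferWriter {
    fn try_clone(&self) -> io::Result<Self> {
        // Cloning the `Rc` shares the cursor, which keeps the position in sync.
        Ok(Self { inner: Rc::clone(&self.inner) })
    }
}

/// Compiles only if `W` satisfies the `ParquetWriter` bound.
pub fn assert_parquet_writer<W: ParquetWriter>(_writer: &W) {}
```

With the real crate, the `impl TryClone for SharedBufferWriter` step is the one that cannot be written, since the trait cannot be named from outside.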

But the module it lives in (`util::io`) is not reachable, because the parent `util` module is not marked as `pub`: https://github.com/apache/arrow/blob/master/rust/parquet/src/lib.rs#L39


#[macro_use]
mod util;
#[cfg(any(feature = "arrow", test))]
pub mod arrow;
pub mod column;
pub mod compression;
mod encodings;
pub mod file;
pub mod record;
pub mod schema;
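The visibility problem can be reproduced in a single file. The sketch below mirrors the layout above with a hypothetical `parquet_like` module standing in for the crate: `util` is private, so the `pub` trait inside it cannot be named from outside. One common remedy is a `pub use` re-export from a module that is already public; this is shown only as an illustration and is not a claim about what pull request 8528 actually changed. `Token` is a made-up type, and `std::io::Result` is assumed in place of the crate's `Result`.

```rust
mod parquet_like {
    // Private module, like `mod util;` in lib.rs.
    mod util {
        pub mod io {
            /// `pub`, but unreachable from outside because `util` is private.
            pub trait TryClone: Sized {
                fn try_clone(&self) -> std::io::Result<Self>;
            }
        }
    }

    pub mod file {
        pub mod writer {
            // Re-export through a public path so outside code can name the trait.
            pub use super::super::util::io::TryClone;
        }
    }
}

// This path would fail with error[E0603]: module `util` is private:
// use parquet_like::util::io::TryClone;

// The re-exported path works:
use parquet_like::file::writer::TryClone;

/// Made-up type used only to show that the trait can now be implemented.
#[derive(Debug, Clone, PartialEq)]
pub struct Token(pub u32);

impl TryClone for Token {
    fn try_clone(&self) -> std::io::Result<Self> {
        Ok(self.clone())
    }
}
```

The alternative of declaring `pub mod util;` would also make the trait reachable, at the cost of exposing everything else in that module.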

**Reporter**: [Andrew Lamb](https://issues.apache.org/jira/browse/ARROW-10390) / @alamb
**Assignee**: [Andrew Lamb](https://issues.apache.org/jira/browse/ARROW-10390) / @alamb
#### PRs and other links:
- [GitHub Pull Request #8528](https://github.com/apache/arrow/pull/8528)

<sub>**Note**: *This issue was originally created as [ARROW-10390](https://issues.apache.org/jira/browse/ARROW-10390). Please see the [migration documentation](https://github.com/apache/arrow/issues/14542) for further details.*</sub>
asfimport commented 3 years ago

Andrew Lamb / @alamb: Here is what happens if you try to use `TryClone`:


-*- mode: compilation; default-directory: "~/Software/delorean/" -*-
Compilation started at Mon Oct 26 10:56:43

cd /Users/alamb/Software/delorean  && cargo clippy --all-targets --workspace -- -D warnings
    Blocking waiting for file lock on build directory
    Checking delorean_parquet v0.1.0 (/Users/alamb/Software/delorean/delorean_parquet)
    Checking delorean_storage v0.1.0 (/Users/alamb/Software/delorean/delorean_storage)
error[E0603]: module `util` is private
  --> delorean_parquet/src/writer.rs:11:5
   |
11 |     util::io::TryClone,
   |     ^^^^ private module
   |
note: the module `util` is defined here
  --> /Users/alamb/.cargo/git/checkouts/arrow-3a9cfebb6b7b2bdc/7155cd5/rust/parquet/src/lib.rs:39:1
   |
39 | mod util;
   | ^^^^^^^^^
asfimport commented 3 years ago

Neville Dipale / @nevi-me: Issue resolved by pull request 8528 https://github.com/apache/arrow/pull/8528