apache / datafusion

Apache DataFusion SQL Query Engine
https://datafusion.apache.org/
Apache License 2.0

`cargo build --no-default-features` does not build cleanly #8844

Open alamb opened 10 months ago

alamb commented 10 months ago

Describe the bug

The build fails when default features are disabled.

To Reproduce

cargo build --no-default-features

Produces

error[E0412]: cannot find type `ParquetSink` in this scope
   --> datafusion/proto/src/physical_plan/mod.rs:989:32
    |
989 |                 let data_sink: ParquetSink = sink
    |                                ^^^^^^^^^^^ not found in this scope
    |
help: consider importing one of these items
    |
18  + use crate::protobuf::ParquetSink;
    |
18  + use datafusion::datasource::file_format::parquet::ParquetSink;
    |

error[E0412]: cannot find type `ParquetSink` in this scope
    --> datafusion/proto/src/physical_plan/mod.rs:1822:69
     |
1822 |             if let Some(sink) = exec.sink().as_any().downcast_ref::<ParquetSink>() {
     |                                                                     ^^^^^^^^^^^ not found in this scope
     |
help: consider importing one of these items
     |
18   + use crate::protobuf::ParquetSink;
     |
18   + use datafusion::datasource::file_format::parquet::ParquetSink;
     |

warning: unused import: `writer_properties_to_proto`
  --> datafusion/proto/src/physical_plan/to_proto.rs:34:56
   |
34 | use crate::logical_plan::{csv_writer_options_to_proto, writer_properties_to_proto};
   |                                                        ^^^^^^^^^^^^^^^^^^^^^^^^^^
   |
   = note: `#[warn(unused_imports)]` on by default

warning: unused import: `parquet_writer::ParquetWriterOptions`
  --> datafusion/proto/src/physical_plan/to_proto.rs:65:9
   |
65 |         parquet_writer::ParquetWriterOptions,
   |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For more information about this error, try `rustc --explain E0412`.
warning: `datafusion-proto` (lib) generated 2 warnings
error: could not compile `datafusion-proto` (lib) due to 2 previous errors; 2 warnings emitted
warning: build failed, waiting for other jobs to finish...

Expected behavior

The workspace should build cleanly with `--no-default-features`.

Additional context

Reported by @fudini here: https://github.com/apache/arrow-datafusion/issues/7653#issuecomment-1888113919

kmitchener commented 9 months ago

I'm working on a solution to this. It's caused by workspace feature resolution: if any crate in the workspace depends on datafusion or datafusion-common with default features, then running cargo build --no-default-features from the workspace root still builds datafusion-common with parquet support, because Cargo unifies the features of a shared dependency across the workspace members being built. That breaks datafusion-proto during a workspace build with this strange error:

   Compiling datafusion-proto v35.0.0 (/home/kmitchener/dev/arrow-datafusion/datafusion/proto)
error[E0004]: non-exhaustive patterns: `&datafusion_common::FileTypeWriterOptions::Parquet(_)` not covered
   --> datafusion/proto/src/physical_plan/to_proto.rs:904:31
    |
904 |         let file_type = match opts {
    |                               ^^^^ pattern `&datafusion_common::FileTypeWriterOptions::Parquet(_)` not covered
    |
note: `datafusion_common::FileTypeWriterOptions` defined here
   --> /home/kmitchener/dev/arrow-datafusion/datafusion/common/src/file_options/mod.rs:150:1
    |
150 | pub enum FileTypeWriterOptions {
    | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
151 |     #[cfg(feature = "parquet")]
152 |     Parquet(ParquetWriterOptions),
    |     ------- not covered
    = note: the matched value is of type `&datafusion_common::FileTypeWriterOptions`
help: ensure that all possible cases are being handled by adding a match arm with a wildcard pattern or an explicit pattern as shown
    |
934 ~             },
935 +             &datafusion_common::FileTypeWriterOptions::Parquet(_) => todo!()
    |

For more information about this error, try `rustc --explain E0004`.
error: could not compile `datafusion-proto` (lib) due to previous error

The benchmark crate depends on datafusion and datafusion-common with default features, since it needs parquet support. I think the only way to resolve this is to exclude the benchmark crate from the workspace, similar to how datafusion-cli is excluded. The other crates should be OK, although substrait may need some additional #[cfg(feature = "parquet")] attributes to work properly, similar to the ones added to datafusion-proto.
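For illustration, here is a minimal single-crate sketch of the cfg-gating pattern mentioned above. The types are hypothetical stand-ins, not the real DataFusion definitions, and `file_type_name` is an invented helper. Because everything lives in one crate, the variant and the match arm are gated by the same feature and always agree. The failure in this issue arises when the enum is gated by datafusion-common's `parquet` feature while the match is gated by datafusion-proto's own feature, and the two end up with different settings.

```rust
// Hypothetical stand-ins for the real types; names are illustrative only.
#[derive(Debug)]
pub struct CsvWriterOptions;

#[cfg(feature = "parquet")]
#[derive(Debug)]
pub struct ParquetWriterOptions;

#[derive(Debug)]
pub enum FileTypeWriterOptions {
    // This variant only exists when the `parquet` feature is enabled.
    #[cfg(feature = "parquet")]
    Parquet(ParquetWriterOptions),
    Csv(CsvWriterOptions),
}

/// Invented helper: returns a label for the given writer options.
pub fn file_type_name(opts: &FileTypeWriterOptions) -> &'static str {
    match opts {
        // The arm is gated by the same cfg as the variant, so the match
        // stays exhaustive whether or not the feature is enabled.
        #[cfg(feature = "parquet")]
        FileTypeWriterOptions::Parquet(_) => "parquet",
        FileTypeWriterOptions::Csv(_) => "csv",
    }
}
```

If the variant were compiled in but the arm compiled out, the compiler would report the same E0004 non-exhaustive patterns error shown above.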

alamb commented 9 months ago

> I think the only way to resolve this is to exclude the benchmark crate from the workspace, similar to how datafusion-cli is excluded.

I think that would be ok

An alternative might be to make the benchmarks conditional on parquet support too (i.e., don't try to compile them without parquet support) 🤔
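One way the conditional-compilation idea could look in source is sketched below. This is only an assumption about the shape of such a change, not what the project does: `run_benchmarks` is a hypothetical entry point, and the sketch assumes the benchmark crate declares its own `parquet` feature. Cargo's `required-features` key on a target is another option that skips building the target entirely when the feature is off.

```rust
/// Hypothetical benchmark entry point, compiled only with `parquet` enabled.
#[cfg(feature = "parquet")]
pub fn run_benchmarks() -> Result<(), String> {
    // The real parquet-dependent benchmark code would go here.
    Ok(())
}

/// Fallback compiled when `parquet` is disabled: no parquet code is built,
/// and callers get a clear message instead of a compile error.
#[cfg(not(feature = "parquet"))]
pub fn run_benchmarks() -> Result<(), String> {
    Err("benchmarks require the `parquet` feature".to_string())
}
```

Exactly one of the two definitions is compiled in any given build, so the crate compiles in both configurations.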