-
After addressing #28, we are ready to run the datafusion-ray tests on a Kubernetes cluster in CI/CD.
-
## Describe the bug
It looks like the `target_chunk_size` config and the actual chunk size can differ. I have the following dataset:
```yaml
- from: github:github.com/apache/datafusion/files/main
…
-
**Describe the bug**
When I run the following SQL, it reports `Error during planning: Unsupported operator in the subquery plan.`
**To Reproduce**
```
create table rectangle as select i id, (random(…
-
### Is your feature request related to a problem or challenge?
Sort Merge Join currently supports LeftSemi joins; it would be nice to have RightSemi support as well.
https://github.com/apac…
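
For readers less familiar with the two join types, here is a minimal sketch of their semantics in plain Python. It is not DataFusion code and does not use a sort-merge strategy; it uses a simple key-set lookup, and the function names and sample rows are made up for illustration.

```python
def left_semi(left, right, key):
    """Keep each left row that has at least one match on the right."""
    right_keys = {key(r) for r in right}
    return [l for l in left if key(l) in right_keys]


def right_semi(left, right, key):
    """Mirror image: keep each right row that has at least one match on the left."""
    left_keys = {key(l) for l in left}
    return [r for r in right if key(r) in left_keys]


# Hypothetical sample data: (join key, payload)
orders = [(1, "a"), (2, "b"), (3, "c")]
payments = [(2, "x"), (2, "y"), (4, "z")]


def first(row):
    return row[0]


left_result = left_semi(orders, payments, first)
right_result = right_semi(orders, payments, first)
```

In both cases only columns from the preserved side appear in the output, and a preserved row is emitted at most once no matter how many matches it has on the other side.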
-
Implement some CLI binaries for working with ORC files such as reading schema, getting stats, etc.
Tools to have:
- View footer metadata
- Initial version: https://github.com/datafusion-contr…
-
It would be great to add sources from a custom `TableProvider`. Also, leveraging the sources in https://github.com/datafusion-contrib/datafusion-table-providers would allow reusing the work done on datatyp…
-
### What is the problem the feature request solves?
Spark's `ColumnarToRowExec` can be very slow in some cases. This plan shows that C2R took 8 minutes even though the underlying scan only took 20 …
-
Hey, I'm trying to port a query from `duckdb` to `datafusion` using the Python connector. The issue I'm facing is adding query parameters.
Is there currently a way to pass query parameters?…
-
👋 I'm a core contributor to [GeoArrow](https://github.com/geoarrow/geoarrow), but I'm new to Substrait and just reading through the docs.
I was surprised to see that a geospatial extension was alre…
-
**Is your feature request related to a problem? Please describe.**
I want to perform cross joins using Daft.
**Describe the solution you'd like**
`df1.join(df2, how='cross')`
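
To make the expected result concrete, here is a rough sketch of cross-join semantics in plain Python. It does not use Daft; `cross_join` and the sample rows are hypothetical and only illustrate that every left row is paired with every right row.

```python
from itertools import product


def cross_join(left_rows, right_rows):
    """Pair every left row with every right row (Cartesian product)."""
    # Assumes the two sides have no column names in common.
    return [{**l, **r} for l, r in product(left_rows, right_rows)]


left_rows = [{"id": 1}, {"id": 2}]
right_rows = [{"color": "red"}, {"color": "blue"}, {"color": "green"}]

joined = cross_join(left_rows, right_rows)
```

The output has `len(left_rows) * len(right_rows)` rows, so a real implementation would need to account for how quickly the result grows on large inputs.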
**Describe alterna…