-
### LanceDB version
0.4.6
### What happened?
Hi, first of all many thanks for building LanceDB as it's an awesome and promising project.
I'd love to get LanceDB working so I have been tryi…
-
**Describe the bug**
- Datafusion Table Info:
The table has an `http_url` column whose DataType is `Utf8`, and it has a lot of distinct values.
- Datafusion SQL
```sql
select http_url from tab gr…
```
-
### Describe the enhancement requested
PyArrow currently only implements bindings to Arrow Flight RPC (not Arrow Flight SQL). A Python Flight SQL driver already exists in the ADBC repo[1]. We c…
-
One thing we will be looking at more in the future is better Apache Arrow interop.
However, each column in Arrow has multiple buffers.
For example, a String array (`List` array) has up to 3 buffer…
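To make the multi-buffer point concrete, here is a minimal sketch in plain Python of the three buffers behind an Arrow variable-length string column: a validity bitmap, 32-bit offsets, and the concatenated UTF-8 bytes. The helper name `encode_utf8_array` is hypothetical and is not part of any Arrow library, and the sketch ignores the padding and alignment that real Arrow buffers use.

```python
import struct


def encode_utf8_array(values):
    """Encode a list of str-or-None into (validity, offsets, data) buffers.

    Hypothetical helper for illustration only; real Arrow buffers are also
    padded and aligned, which is skipped here.
    """
    validity = bytearray((len(values) + 7) // 8)  # one bit per slot
    offsets = [0]                                 # len(values) + 1 entries
    data = bytearray()
    for i, value in enumerate(values):
        if value is not None:
            # Bits are numbered least-significant-first within each byte.
            validity[i // 8] |= 1 << (i % 8)
            data += value.encode("utf-8")
        # A null slot repeats the previous offset, so its length is zero.
        offsets.append(len(data))
    offsets_buf = struct.pack(f"<{len(offsets)}i", *offsets)
    return bytes(validity), offsets_buf, bytes(data)


validity, offsets, data = encode_utf8_array(["ab", None, "cde"])
```

Slot `i` is recovered as `data[offsets[i]:offsets[i + 1]]` when its validity bit is set, which is why a single logical column cannot be handed over as one contiguous buffer.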
-
### What happened?
https://github.com/apache/arrow-adbc/issues/729 is still possible because there are other arguments that need to be sanitized before being passed on to Go.
The gist is the same as ea…
-
Reading 20 uncompressed Parquet files with a total size of 3.2 GB takes more than 12 GB of RAM when they are read "concurrently".
"concurrently" means that I need to read the second file before closing t…
-
### Is your feature request related to a problem or challenge?
This is a collection of tickets related to DataFusion's optimizer.
- [ ] #5830
- [ ] #5922
- [ ] #5924
- [ ] #5925
- [x] …
-
Hi! When I use PySpark to read and save a query with a limit of 1 million rows, everything works fine. However, when I raise the limit to 10 million rows, for example, I get this error on the same quer…
-
Previous discussion: https://github.com/apache/arrow-datafusion/issues/4707
Though the ORC format is not as widely used as Parquet in arrow-rs and DataFusion-related projects, there are still some …
-
Hi All,
Just wondering if there is a possibility to include something from big data technology, such as
1. [open data](https://www.dremio.com/its-wise-to-choose-open) columnar format like parquet…