-
Explore DataSourceV2 API and come up with a plan.
-
## Descriptions
The connector supports Parquet files by reusing some of Spark's lower-level internal systems. This resulted in the connector having to copy over private code, handle data partition…
-
### Query engine
Query Engine: Spark 3.5.0
Apache Iceberg: 1.4.2
### Question
Hi,
My understanding is that the Spark Optimizer can add a new `Project` operator even after the V2 Relation was creat…
-
Related to #166 .
Qbeast-Spark should be compatible with the latest versions of Delta Lake and Apache Spark, to benefit from any new features and major upgrades.
The change to Delta version 2.1.0 and …
-
Currently implemented for streaming sources as a `foreachWriter`, but it would be more general if it were actually a sink.
Sources of inspiration:
* https://github.com/phatak-dev/spark2.0-examples/bl…
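For context on the distinction above: Spark's `ForeachWriter` is a per-row callback (`open`/`process`/`close`), while a streaming sink receives each micro-batch as a whole (in the spirit of the internal `Sink.addBatch`), which makes batch-level, idempotent commits possible. A plain-Python sketch of the two shapes (no Spark dependency; the class and method names below only mirror Spark's interfaces and are illustrative, not real Spark code):

```python
class RowWriter:
    """Per-row callback, in the spirit of Spark's ForeachWriter."""

    def __init__(self):
        self.rows = []

    def open(self, partition_id, epoch_id):
        # return True to accept rows for this partition/epoch
        return True

    def process(self, row):
        # called once per row; no visibility into batch boundaries
        self.rows.append(row)

    def close(self, error):
        # called when the partition/epoch finishes (error is None on success)
        pass


class BatchSink:
    """Whole-micro-batch callback, in the spirit of a streaming sink."""

    def __init__(self):
        self.batches = {}

    def add_batch(self, batch_id, rows):
        # the sink sees the batch as a unit, so it can commit atomically
        # and skip a batch_id it has already processed (idempotent replay)
        if batch_id not in self.batches:
            self.batches[batch_id] = list(rows)
```

The batch-level entry point is what makes a true sink "more general": exactly-once-style deduplication by `batch_id` is straightforward there, but awkward to reconstruct from isolated per-row callbacks.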
-
-
### Is there an existing issue for this?
- [x] I have searched the existing issues
### Current Behavior
Some problems occur when reading Excel files with Spark in Java.
I'm currently…
-
The current work is developed against Spark 2.2.x. There are new data-source APIs in Spark 2.3.0; we need to migrate to those APIs.
-
It seems that when trying to read the first N rows, Laurelin materializes the entire table before applying the LIMIT, e.g.:
```
SELECT * FROM the-name-i-give-to-my-df LIMIT 10
```
Are there plans to pu…
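The behavior reported above is the difference between a scan without limit pushdown (read everything, then slice) and one that stops pulling rows after N. A minimal, Spark-free Python sketch of the two strategies (all names here are illustrative, not Laurelin or Spark APIs):

```python
from itertools import islice


def make_source(n, counter):
    """Simulated row source that records how many rows were actually read."""
    for i in range(n):
        counter[0] += 1
        yield i


def limit_without_pushdown(rows, n):
    # materialize the whole source, then slice -- the behavior reported above
    return list(rows)[:n]


def limit_with_pushdown(rows, n):
    # stop pulling from the source once n rows have been produced
    return list(islice(rows, n))


reads = [0]
assert limit_without_pushdown(make_source(1000, reads), 10) == list(range(10))
full_reads = reads[0]   # the entire 1000-row table was scanned

reads = [0]
assert limit_with_pushdown(make_source(1000, reads), 10) == list(range(10))
pushed_reads = reads[0]  # only 10 rows were pulled from the source
```

In DataSourceV2 terms, pushing the limit into the reader lets the connector stop decoding the underlying file early instead of materializing the whole table first.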
-
**Describe the bug**
Python integration tests failed on the latest EMR `6.12.0` cluster (spark-rapids `v23.06.0` jar built specifically for EMR). Failed files:
```
csv_test.py
datasourcev2_read_test.py
js…