-
**Describe the problem you faced**
We are creating empty Hudi tables from Java as follows:
```java
Dataset<Row> emptyDF = spark.createDataFrame(new ArrayList<Row>(), schemaStruct);
emptyDF.wr…
```
-
### Bug description
For backward compatibility of timestamp/date string parsing/formatting behavior, while still allowing adoption of the new behavior, Spark added a setting `spark.sql.legacy.timeParserPolicy…
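For reference, the policy can be switched at the session level; a minimal sketch (the three accepted values are `EXCEPTION`, `CORRECTED`, and `LEGACY`):

```sql
-- Default since Spark 3.0 is EXCEPTION: fail when the legacy and new parsers
-- would disagree. LEGACY restores the pre-3.0 (SimpleDateFormat-based)
-- behavior; CORRECTED always uses the new parser.
SET spark.sql.legacy.timeParserPolicy = LEGACY;
```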
-
### Version
main branch
### Describe what's wrong
There are some errors using `org.apache.hadoop.hive.serde2.OpenCSVSerde` as row format serde.
P.S. The Kyuubi Hive connector's behavior is …
-
### Apache Iceberg version
None
### Query engine
Spark
### Please describe the bug 🐞
SPJ (storage-partitioned join) works great when joining two tables, e.g.
~~~scala
// SPJ setup
import org.apache.spark.sql.functio…
~~~
-
**Describe the problem you faced**
I'm using Spark 3.5 + Hudi 0.15.0 for a partitioned table. When I choose `req_date` and `req_hour` as the partition column names, I get this error, but the task would be…
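As a point of comparison, multi-field Hudi partitioning is normally configured with a comma-separated field list plus a complex key generator. A hedged sketch of the relevant write options (option names are from the Hudi write-config docs; the record key `id` is a hypothetical column, the partition fields are this report's columns):

```properties
hoodie.datasource.write.partitionpath.field=req_date,req_hour
hoodie.datasource.write.recordkey.field=id
hoodie.datasource.write.keygenerator.class=org.apache.hudi.keygen.ComplexKeyGenerator
```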
-
Getting the error that `spark.sql.mapKeyDedupPolicy` is not supported by Databricks SQL Warehouses when using the ibis PySpark backend with a Databricks SQL Warehouse cluster.
See: https://community.databricks.com/…
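For context, on a regular Spark cluster this setting controls what happens when `map()` sees duplicate keys; a minimal sketch:

```sql
-- Default EXCEPTION: building a map with duplicate keys fails at runtime.
-- LAST_WIN keeps the value of the last occurrence of each key.
SET spark.sql.mapKeyDedupPolicy = LAST_WIN;
SELECT map(1, 'a', 1, 'b');  -- yields {1 -> "b"} under LAST_WIN
```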
-
# Error when using spark_apply method
I am using Spark Connect to perform operations with tables hosted in Unity Catalog (Databricks). When I want to use the `spark_apply` method to process them I…
-
When using Kafka + Debezium + Streamer, we are able to write data and the job works fine; but when using the SqlQueryBasedTransformer, it is able to write data to S3 with the new field but ultimately …
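For reference, a minimal SqlQueryBasedTransformer configuration looks like the following fragment; the added column name is hypothetical, and `<SRC>` is the placeholder Hudi substitutes with the incoming batch:

```shell
# Flags passed to the Streamer spark-submit invocation (fragment, not a full command)
--transformer-class org.apache.hudi.utilities.transform.SqlQueryBasedTransformer \
--hoodie-conf "hoodie.streamer.transformer.sql=SELECT s.*, CAST(NULL AS STRING) AS new_field FROM <SRC> s"
```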
-
I am using 4 executors with the following config:
```
"cores" = 3
"memory" = "22g"
```
It works fine with 4 executors, but when I reduce the count to 3 it starts to throw the foll…
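One thing worth checking when dropping an executor is the per-container memory request: the cluster manager must grant executor memory plus overhead, and each surviving executor also shoulders a larger share of the data. A small sketch of the default overhead arithmetic, assuming Spark's default `spark.executor.memoryOverhead` of max(384 MB, 10% of executor memory):

```python
def container_request_mb(executor_memory_mb, overhead_mb=None):
    """Memory the cluster manager must grant per executor container.

    Spark's default overhead is max(384 MB, 10% of spark.executor.memory).
    """
    if overhead_mb is None:
        overhead_mb = max(384, executor_memory_mb // 10)
    return executor_memory_mb + overhead_mb

# 22g executors: 22528 MB + 2252 MB overhead = 24780 MB per container
print(container_request_mb(22 * 1024))
```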
-
**Dataproc environment:**
Image version: 2.2.22-debian12
Python version: 3.11.8
Spark version: 3.5.0
Hudi Spark Bundle: hudi-spark3.5-bundle_2.12-0.15.0.jar
**Problem:**
Writing data using…