-
**Is your feature request related to a problem?**
Many log sources include arrays in a log line, and to efficiently extract and analyze these data, it would be very helpful to have a function in plac…
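For context, without a dedicated function this extraction currently has to be done by hand; a minimal PySpark sketch (the log format, regex, and column names here are hypothetical) that pulls an embedded array out of a raw log line:
```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import regexp_extract, split

spark = SparkSession.builder.getOrCreate()

# Hypothetical log lines that embed an array of values in brackets.
df = spark.createDataFrame(
    [("2024-01-01 INFO ids=[1,2,3]",), ("2024-01-02 INFO ids=[4,5]",)],
    ["line"],
)

# The array currently has to be pulled out manually: extract the bracketed
# text with a regex, then split it into an array column and cast it.
ids = split(regexp_extract("line", r"ids=\[([^\]]*)\]", 1), ",")
df.select(ids.cast("array<int>").alias("ids")).show()
```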
-
`spark_read_jdbc` quotes column names with double quotes instead of backticks in the generated query. This causes the results to come back as string literals instead of the column data.
```r
sc = 1;
```
This results in the error that: …
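For illustration, a PySpark sketch of the quoting behaviour being described (not the original sparklyr call), assuming a MySQL source where double-quoted identifiers are parsed as string literals; the URL, table, and column names are placeholders, and credential/driver options are omitted:
```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Placeholder MySQL connection details.
url = "jdbc:mysql://localhost:3306/testdb"

# Without ANSI_QUOTES, MySQL parses "name" as a string literal, so a query
# built with double-quoted identifiers returns the text 'name' on every row
# instead of the column's data. Backtick-quoted identifiers behave correctly.
literal_result = (
    spark.read.format("jdbc")
    .option("url", url)
    .option("query", 'SELECT "name" FROM people')
    .load()
)

correct_result = (
    spark.read.format("jdbc")
    .option("url", url)
    .option("query", "SELECT `name` FROM people")
    .load()
)
```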
-
I have recently been using the Spark version of ydata-profiling extensively to generate analysis reports; here are some issues I've encountered:
There have already been some related issues before…
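For reference, a minimal sketch of the usage in question, assuming ydata-profiling's Spark DataFrame support (`ydata-profiling[spark]`); the data here is a placeholder:
```python
from pyspark.sql import SparkSession
from ydata_profiling import ProfileReport

spark = SparkSession.builder.getOrCreate()

# Placeholder data; the real reports are generated on much larger tables.
df = spark.createDataFrame([(1, "a"), (2, "b"), (3, None)], ["id", "value"])

# ProfileReport accepts a Spark DataFrame directly when the Spark extra
# is installed.
report = ProfileReport(df, title="Example profile")
report.to_file("report.html")
```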
-
**Describe the problem you faced**
Hello, I am trying to test several schema evolution use cases using Hudi 0.15 and Spark 3.5 with HMS 4.
First test: adding a column in PG --> Debezium / schema registry OK --…
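The full pipeline (PG, Debezium, Schema Registry, HMS 4) is not reproduced here; a minimal Spark-only sketch of the "added column" step, with hypothetical table name, keys, and path:
```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Hypothetical Hudi write options; the record key, precombine field and
# path are placeholders standing in for the real CDC pipeline's settings.
hudi_opts = {
    "hoodie.table.name": "customers",
    "hoodie.datasource.write.recordkey.field": "id",
    "hoodie.datasource.write.precombine.field": "ts",
    "hoodie.datasource.write.operation": "upsert",
}

base = spark.createDataFrame([(1, 100)], ["id", "ts"])
base.write.format("hudi").options(**hudi_opts).mode("overwrite").save("/tmp/customers")

# The second batch carries a newly added column, mimicking the PG/Debezium
# "add column" case; Hudi is expected to evolve the table schema.
evolved = spark.createDataFrame([(2, 200, "alice")], ["id", "ts", "name"])
evolved.write.format("hudi").options(**hudi_opts).mode("append").save("/tmp/customers")
```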
-
### Overview
Many popular Python libraries support an API for call chaining, also known as a [fluent interface](https://en.wikipedia.org/wiki/Fluent_interface). Examples are [pandas](https://towardsd…
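For concreteness, a short pandas sketch of the chaining style being referenced (the data and column names are placeholders):
```python
import pandas as pd

df = pd.DataFrame({"city": ["a", "b", "a"], "sales": [10, 20, 30]})

# The chained ("fluent") style: each method returns a new object, so the
# steps read top-to-bottom as a single pipeline.
result = (
    df.assign(sales_k=lambda d: d["sales"] / 1000)
      .query("sales > 10")
      .groupby("city", as_index=False)
      .agg(total=("sales_k", "sum"))
      .sort_values("total", ascending=False)
)
```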
-
## Problem
When writing Spark DataFrames to data lake storage, the order of the columns in the DataFrame is important. For example, if a pipeline is appending Parquet files in the lake, if t…
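A common mitigation today, not necessarily what this issue proposes, is to force a fixed column order before every write; a minimal sketch with a placeholder path and column list:
```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# The order the downstream Parquet data is expected to have.
expected_cols = ["id", "event_time", "payload"]

df = spark.createDataFrame(
    [("x", 1, "2024-01-01")], ["payload", "id", "event_time"]
)

# Reorder explicitly so appended files always share the same layout.
(df.select(*expected_cols)
   .write.mode("append")
   .parquet("/tmp/lake/events"))
```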
-
## Overview
We propose to introduce Liquid Clustering, a new effort to revamp how clustering works in Delta, which addresses the shortcomings of Hive-style partitioning and current ZORDER clusterin…
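For orientation, a rough sketch of the clustering declaration being proposed, assuming the `CLUSTER BY` SQL surface that later shipped with Delta's liquid clustering; the table and column names are placeholders and the session is assumed to be configured with Delta Lake:
```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Clustering columns are declared once instead of being baked into a
# Hive-style partition layout or applied via manual ZORDER runs.
spark.sql("""
    CREATE TABLE events (event_id BIGINT, event_time TIMESTAMP, country STRING)
    USING delta
    CLUSTER BY (event_id)
""")

# OPTIMIZE then (re)clusters the data incrementally.
spark.sql("OPTIMIZE events")
```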
-
I convert a PySpark DataFrame to two columns: a feature column, which is a dense vector, and a label column. When I transform it to a TensorFlow dataset using `make_spark_converter`, it raised a…
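A minimal sketch of the conversion path in question, assuming petastorm's documented `make_spark_converter` API; a common workaround for vector-typed feature columns is to cast them to plain arrays with `vector_to_array` before converting (the cache directory and data are placeholders):
```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col
from pyspark.ml.linalg import Vectors
from pyspark.ml.functions import vector_to_array
from petastorm.spark import SparkDatasetConverter, make_spark_converter

spark = SparkSession.builder.getOrCreate()
# Petastorm materializes the DataFrame as Parquet under a cache directory.
spark.conf.set(SparkDatasetConverter.PARENT_CACHE_DIR_URL_CONF,
               "file:///tmp/petastorm_cache")

df = spark.createDataFrame(
    [(Vectors.dense([1.0, 2.0]), 0), (Vectors.dense([3.0, 4.0]), 1)],
    ["features", "label"],
)

# DenseVector (VectorUDT) columns are not a plain Parquet type, so cast the
# feature column to an array before handing the DataFrame to petastorm.
flat = df.withColumn("features", vector_to_array(col("features")))

converter = make_spark_converter(flat)
with converter.make_tf_dataset() as dataset:
    for batch in dataset.take(1):
        print(batch)
```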
-
## What went wrong?
When creating a table that uses BIGINT for a date column and inserting a set of 10 rows, a `ScalaMatch` error appears.
We would need to investigate the flow for the BIGINT type. Is…
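A rough repro sketch of the scenario as described, using plain Spark SQL with a placeholder table name and format; the BIGINT "date" values are hypothetical:
```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# The date column is declared as BIGINT, as in the report; the values
# below are hypothetical integer-encoded dates.
spark.sql(
    "CREATE TABLE IF NOT EXISTS t_bigint_date (id INT, event_date BIGINT) USING parquet"
)

# Insert a set of 10 rows.
rows = ", ".join(f"({i}, {20240100 + i})" for i in range(1, 11))
spark.sql(f"INSERT INTO t_bigint_date VALUES {rows}")
```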
-
### Apache Iceberg version
1.6.1 (latest release)
### Query engine
Spark
### Please describe the bug 🐞
Hi Team,
I have set up the Hive 4 Docker images, but while reading a table from Spark SQL…
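The actual failure is cut off above; for context, a minimal sketch of the Spark SQL read path being described, assuming Iceberg's Hive-catalog configuration with a placeholder catalog name, metastore URI, and table:
```python
from pyspark.sql import SparkSession

# Placeholder catalog name and metastore URI; the Iceberg runtime jar
# matching the Spark and Iceberg versions must be on the classpath.
spark = (
    SparkSession.builder
    .config("spark.sql.extensions",
            "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
    .config("spark.sql.catalog.hive_cat", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.hive_cat.type", "hive")
    .config("spark.sql.catalog.hive_cat.uri", "thrift://hive-metastore:9083")
    .getOrCreate()
)

# Reading through Spark SQL, which is where the reported failure occurs.
spark.sql("SELECT * FROM hive_cat.db.sample_table LIMIT 10").show()
```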